Unicode Character Database

The Unicode Standard Characters Repertoire

The Unicode Standard specifies a numeric value (also known as code point) and a name for each of its characters. In this respect, it is similar to other character encoding standards from ASCII onward. In addition to character codes and names, other information is crucial to ensure legible text: a character’s case, directionality, and alphabetic properties must also be well defined. The Unicode Standard defines these and other semantic values, and includes application data such as case mapping tables and character property tables as part of the Unicode Character Database. Character properties define a character’s identity and behavior; they ensure consistency in the processing and interchange of Unicode data. (See the section Unicode Character Properties.)
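
As a rough illustration of the properties described above, the following sketch looks up a character's code point, name, general category, directionality, and case mappings. It assumes Python's standard unicodedata module, which ships a copy of the Unicode Character Database; the helper name describe is hypothetical and not part of any standard.

```python
import unicodedata


def describe(ch: str) -> dict:
    """Collect a few Unicode Character Database properties for one character.

    `describe` is an illustrative helper, not a standard API.
    """
    return {
        "code_point": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch, "<unnamed>"),
        "category": unicodedata.category(ch),         # e.g. "Ll" = lowercase letter
        "bidi_class": unicodedata.bidirectional(ch),  # e.g. "L" = left-to-right
        "uppercase": ch.upper(),                      # case mapping via str methods
        "lowercase": ch.lower(),
    }


if __name__ == "__main__":
    for key, value in describe("ā").items():
        print(f"{key}: {value}")
```

The values returned depend on the version of the Unicode Character Database bundled with the Python interpreter in use, which unicodedata.unidata_version reports.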

The Unicode Standard contains 1,114,112 code points, most of which are available for encoding of characters. The majority of the common characters used in the major languages of the world are encoded in the first 65,536 code points, known as the Basic Multilingual Plane (BMP). The overall capacity for a little over a million characters is more than sufficient for all currently known character encoding requirements, including full coverage of all minority and historic scripts of the world. Unicode characters are represented in one of three encoding forms: a 32-bit form (UTF-32), a 16-bit form (UTF-16), and an 8-bit form (UTF-8). The 8-bit, byte-oriented form, UTF-8, has been designed for ease of use with existing ASCII-based systems. The Unicode Standard is code-for-code identical with International Standard ISO/IEC 10646. Any implementation that is conformant to Unicode is therefore conformant to ISO/IEC 10646.
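
The figures and encoding forms above can be checked with a short sketch. It assumes Python's built-in codecs; the division of the codespace into 17 planes of 65,536 code points is not stated in the text above but is how the total of 1,114,112 arises. The helper name encoded_sizes is hypothetical.

```python
PLANES = 17                      # plane 0 is the BMP; planes 1-16 lie beyond it
CODE_POINTS_PER_PLANE = 0x10000  # 65,536
TOTAL_CODE_POINTS = PLANES * CODE_POINTS_PER_PLANE  # 1,114,112


def encoded_sizes(ch: str) -> dict:
    """Return how many bytes one character occupies in each encoding form."""
    return {
        "UTF-8": len(ch.encode("utf-8")),
        "UTF-16": len(ch.encode("utf-16-be")),  # big-endian, no byte order mark
        "UTF-32": len(ch.encode("utf-32-be")),  # big-endian, no byte order mark
    }


if __name__ == "__main__":
    print(f"{TOTAL_CODE_POINTS:,} code points in total")
    # An ASCII letter, a Latin Extended-A letter, a BMP symbol, and a
    # character outside the BMP.
    for ch in ("A", "ā", "€", "😀"):
        print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

ASCII characters occupy a single byte in UTF-8, which is what makes that form easy to use with ASCII-based systems, while a character outside the BMP needs four bytes in every one of the three forms.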

As of Version 12.0, the Unicode Standard contains a total of 137,929 characters from the world’s scripts. These characters are ample for communication in all modern languages, as well as for representing the classical forms of many languages. The Standard encompasses the European alphabetic scripts, Middle Eastern right-to-left scripts, and other regional scripts such as those of Asia and Africa. Likewise, many archaic and historic scripts are encoded. The Han script includes 87,887 unified ideographic characters defined by national, international, and industry standards of China, Japan, Korea, Taiwan, Vietnam, and Singapore. Additionally, the Standard contains many important symbol sets, including currency symbols, punctuation marks, mathematical symbols, technical symbols, geometric shapes, dingbats, and emoji.

List of Unicode characters

In Unicode, the range of integers used to code characters is called the codespace. A particular integer in this set is called a code point. When an abstract character is assigned to a given code point in the codespace, it is then referred to as an encoded character. The Unicode codespace consists of the integers from 0 to 10FFFF (hexadecimal), comprising 1,114,112 code points available for assignment to the repertoire of abstract characters. The table below presents, in code point order, one page of the characters in the current repertoire of the Unicode Standard; each character is followed by its code point.
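
The relationship between integers, code points, and the U+ notation used in the table can be sketched as follows. This is a minimal illustration using Python's built-in ord and chr; the helper names is_code_point, to_u_notation, and from_u_notation are hypothetical.

```python
MAX_CODE_POINT = 0x10FFFF  # last integer in the Unicode codespace


def is_code_point(n: int) -> bool:
    """True if n lies inside the codespace 0..10FFFF (hexadecimal)."""
    return 0 <= n <= MAX_CODE_POINT


def to_u_notation(n: int) -> str:
    """Format an integer as U+XXXX, using at least four hexadecimal digits."""
    if not is_code_point(n):
        raise ValueError(f"{n:#x} is outside the Unicode codespace")
    return f"U+{n:04X}"


def from_u_notation(label: str) -> str:
    """Turn a label such as 'U+0100' back into the character it denotes."""
    if not label.upper().startswith("U+"):
        raise ValueError(f"expected a label starting with 'U+', got {label!r}")
    return chr(int(label[2:], 16))


if __name__ == "__main__":
    print(to_u_notation(ord("Ā")), from_u_notation("U+01FF"))
    print(f"{MAX_CODE_POINT + 1:,} code points in the codespace")
```

Python's chr raises ValueError for any integer above 0x10FFFF, which matches the upper bound of the codespace described above.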

Unicode Characters, page 2 of 4352 (U+0100 to U+01FF)
Ā
U+0100
ā
U+0101
Ă
U+0102
ă
U+0103
Ą
U+0104
ą
U+0105
Ć
U+0106
ć
U+0107
Ĉ
U+0108
ĉ
U+0109
Ċ
U+010A
ċ
U+010B
Č
U+010C
č
U+010D
Ď
U+010E
ď
U+010F
Đ
U+0110
đ
U+0111
Ē
U+0112
ē
U+0113
Ĕ
U+0114
ĕ
U+0115
Ė
U+0116
ė
U+0117
Ę
U+0118
ę
U+0119
Ě
U+011A
ě
U+011B
Ĝ
U+011C
ĝ
U+011D
Ğ
U+011E
ğ
U+011F
Ġ
U+0120
ġ
U+0121
Ģ
U+0122
ģ
U+0123
Ĥ
U+0124
ĥ
U+0125
Ħ
U+0126
ħ
U+0127
Ĩ
U+0128
ĩ
U+0129
Ī
U+012A
ī
U+012B
Ĭ
U+012C
ĭ
U+012D
Į
U+012E
į
U+012F
İ
U+0130
ı
U+0131
IJ
U+0132
ij
U+0133
Ĵ
U+0134
ĵ
U+0135
Ķ
U+0136
ķ
U+0137
ĸ
U+0138
Ĺ
U+0139
ĺ
U+013A
Ļ
U+013B
ļ
U+013C
Ľ
U+013D
ľ
U+013E
Ŀ
U+013F
ŀ
U+0140
Ł
U+0141
ł
U+0142
Ń
U+0143
ń
U+0144
Ņ
U+0145
ņ
U+0146
Ň
U+0147
ň
U+0148
ʼn
U+0149
Ŋ
U+014A
ŋ
U+014B
Ō
U+014C
ō
U+014D
Ŏ
U+014E
ŏ
U+014F
Ő
U+0150
ő
U+0151
Œ
U+0152
œ
U+0153
Ŕ
U+0154
ŕ
U+0155
Ŗ
U+0156
ŗ
U+0157
Ř
U+0158
ř
U+0159
Ś
U+015A
ś
U+015B
Ŝ
U+015C
ŝ
U+015D
Ş
U+015E
ş
U+015F
Š
U+0160
š
U+0161
Ţ
U+0162
ţ
U+0163
Ť
U+0164
ť
U+0165
Ŧ
U+0166
ŧ
U+0167
Ũ
U+0168
ũ
U+0169
Ū
U+016A
ū
U+016B
Ŭ
U+016C
ŭ
U+016D
Ů
U+016E
ů
U+016F
Ű
U+0170
ű
U+0171
Ų
U+0172
ų
U+0173
Ŵ
U+0174
ŵ
U+0175
Ŷ
U+0176
ŷ
U+0177
Ÿ
U+0178
Ź
U+0179
ź
U+017A
Ż
U+017B
ż
U+017C
Ž
U+017D
ž
U+017E
ſ
U+017F
ƀ
U+0180
Ɓ
U+0181
Ƃ
U+0182
ƃ
U+0183
Ƅ
U+0184
ƅ
U+0185
Ɔ
U+0186
Ƈ
U+0187
ƈ
U+0188
Ɖ
U+0189
Ɗ
U+018A
Ƌ
U+018B
ƌ
U+018C
ƍ
U+018D
Ǝ
U+018E
Ə
U+018F
Ɛ
U+0190
Ƒ
U+0191
ƒ
U+0192
Ɠ
U+0193
Ɣ
U+0194
ƕ
U+0195
Ɩ
U+0196
Ɨ
U+0197
Ƙ
U+0198
ƙ
U+0199
ƚ
U+019A
ƛ
U+019B
Ɯ
U+019C
Ɲ
U+019D
ƞ
U+019E
Ɵ
U+019F
Ơ
U+01A0
ơ
U+01A1
Ƣ
U+01A2
ƣ
U+01A3
Ƥ
U+01A4
ƥ
U+01A5
Ʀ
U+01A6
Ƨ
U+01A7
ƨ
U+01A8
Ʃ
U+01A9
ƪ
U+01AA
ƫ
U+01AB
Ƭ
U+01AC
ƭ
U+01AD
Ʈ
U+01AE
Ư
U+01AF
ư
U+01B0
Ʊ
U+01B1
Ʋ
U+01B2
Ƴ
U+01B3
ƴ
U+01B4
Ƶ
U+01B5
ƶ
U+01B6
Ʒ
U+01B7
Ƹ
U+01B8
ƹ
U+01B9
ƺ
U+01BA
ƻ
U+01BB
Ƽ
U+01BC
ƽ
U+01BD
ƾ
U+01BE
ƿ
U+01BF
ǀ
U+01C0
ǁ
U+01C1
ǂ
U+01C2
ǃ
U+01C3
DŽ
U+01C4
Dž
U+01C5
dž
U+01C6
LJ
U+01C7
Lj
U+01C8
lj
U+01C9
NJ
U+01CA
Nj
U+01CB
nj
U+01CC
Ǎ
U+01CD
ǎ
U+01CE
Ǐ
U+01CF
ǐ
U+01D0
Ǒ
U+01D1
ǒ
U+01D2
Ǔ
U+01D3
ǔ
U+01D4
Ǖ
U+01D5
ǖ
U+01D6
Ǘ
U+01D7
ǘ
U+01D8
Ǚ
U+01D9
ǚ
U+01DA
Ǜ
U+01DB
ǜ
U+01DC
ǝ
U+01DD
Ǟ
U+01DE
ǟ
U+01DF
Ǡ
U+01E0
ǡ
U+01E1
Ǣ
U+01E2
ǣ
U+01E3
Ǥ
U+01E4
ǥ
U+01E5
Ǧ
U+01E6
ǧ
U+01E7
Ǩ
U+01E8
ǩ
U+01E9
Ǫ
U+01EA
ǫ
U+01EB
Ǭ
U+01EC
ǭ
U+01ED
Ǯ
U+01EE
ǯ
U+01EF
ǰ
U+01F0
DZ
U+01F1
Dz
U+01F2
dz
U+01F3
Ǵ
U+01F4
ǵ
U+01F5
Ƕ
U+01F6
Ƿ
U+01F7
Ǹ
U+01F8
ǹ
U+01F9
Ǻ
U+01FA
ǻ
U+01FB
Ǽ
U+01FC
ǽ
U+01FD
Ǿ
U+01FE
ǿ
U+01FF
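
The listing above can be regenerated programmatically. The sketch below assumes 256 code points per page, which is inferred from this page running from U+0100 to U+01FF and from 1,114,112 / 256 = 4352 pages; the names PAGE_SIZE, page_range, and page_rows are hypothetical.

```python
PAGE_SIZE = 256                            # assumption: inferred from this page
CODESPACE_SIZE = 0x110000                  # 1,114,112 code points
TOTAL_PAGES = CODESPACE_SIZE // PAGE_SIZE  # 4352


def page_range(page: int) -> range:
    """Code points shown on a 1-based page number."""
    if not 1 <= page <= TOTAL_PAGES:
        raise ValueError(f"page must be between 1 and {TOTAL_PAGES}")
    start = (page - 1) * PAGE_SIZE
    return range(start, start + PAGE_SIZE)


def page_rows(page: int) -> list:
    """Return (character, 'U+XXXX') pairs for one page of the listing."""
    return [(chr(cp), f"U+{cp:04X}") for cp in page_range(page)]


if __name__ == "__main__":
    rows = page_rows(2)
    print(rows[0], rows[-1], f"({len(rows)} rows, {TOTAL_PAGES} pages in total)")
```

A listing built this way walks every code point in the codespace, so some pages contain code points with no assigned character; a real listing would need to decide how to display those.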

NOTE: The Unicode Standard does not encode idiosyncratic, novel, or private-use characters, nor does it encode logos or graphics. Graphologies unrelated to text, such as dance notations, are likewise outside the scope of Unicode. Font variants are explicitly not encoded. The Standard reserves 6,400 code points in the BMP for private use, which may be used to assign codes to characters not included in the Unicode repertoire. Another 131,068 private-use code points are available outside the BMP, should 6,400 prove insufficient for particular applications.
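
The private-use counts in the note can be verified with a little arithmetic. The specific ranges below are not given in the text above: the sketch assumes the BMP Private Use Area is U+E000 to U+F8FF and that the remaining private-use code points are U+F0000 to U+FFFFD and U+100000 to U+10FFFD (the last two code points of each of those planes are noncharacters). The helper name is_private_use is hypothetical.

```python
import unicodedata

# Assumed ranges (see the lead-in); range() excludes its end, hence the + 1.
BMP_PRIVATE_USE = range(0xE000, 0xF8FF + 1)
PLANE_15_PRIVATE_USE = range(0xF0000, 0xFFFFD + 1)
PLANE_16_PRIVATE_USE = range(0x100000, 0x10FFFD + 1)

BMP_COUNT = len(BMP_PRIVATE_USE)  # 6,400
SUPPLEMENTARY_COUNT = len(PLANE_15_PRIVATE_USE) + len(PLANE_16_PRIVATE_USE)  # 131,068


def is_private_use(ch: str) -> bool:
    """True if the character's general category is Co (private use)."""
    return unicodedata.category(ch) == "Co"


if __name__ == "__main__":
    print(f"BMP private use: {BMP_COUNT:,}")
    print(f"Private use outside the BMP: {SUPPLEMENTARY_COUNT:,}")
    print(is_private_use("\uE000"), is_private_use("A"))
```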
