Unicode Character Database

The Unicode Standard Character Database

The Unicode Standard (commonly known simply as Unicode) is a universal character encoding standard for written characters and text. It defines a consistent way of encoding multilingual text, so that computers can process and display the written texts of classical and modern languages, as well as the notation of many technical disciplines. As the default encoding of HTML and XML, the Unicode Standard underpins the World Wide Web and today's global business ecosystem. Required in new Internet protocols and implemented in all modern operating systems and programming languages, Unicode is the basis of software that must function all around the world. With Unicode, the technology industry has replaced proliferating character sets with a single, stable, and universal character repertoire that allows for global interoperability and reliable cross-language data interchange.

From a software developer's point of view, the Unicode Standard and its associated specifications provide programmers with a unified universal character encoding, extensive descriptions, and vast amounts of data about how characters in the Unicode repertoire function. The specifications describe how to form words and break lines; sort text in different languages; format numbers, dates, and times appropriately for particular languages; and display languages whose written form flows from right to left, such as Arabic, Hebrew, and Thaana, or whose written form splits, combines, and reorders, such as the languages of South Asia. Without the character properties and algorithms in the Unicode Standard and its associated core specifications, interoperability between different implementations would be impossible, and much of the vast breadth of the world’s languages would lie outside the reach of modern computer software.
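As a small illustration of the kind of per-character data these specifications rely on, the sketch below looks up the bidirectional class of one letter each from Latin, Arabic, Hebrew, and Thaana using Python's standard `unicodedata` module. The helper name `is_rtl` and the sample characters are chosen here for illustration only. This is a property lookup, not an implementation of the Bidirectional Algorithm itself, which resolves the direction of whole runs of text.

```python
import unicodedata

# Bidi_Class values assigned to strongly right-to-left characters:
# "R" (right-to-left) and "AL" (Arabic letter).
RTL_CLASSES = {"R", "AL"}


def is_rtl(ch: str) -> bool:
    """Return True if a single character has a strong right-to-left class.

    Hypothetical helper for illustration; real layout code must run the
    full Bidirectional Algorithm over the surrounding text.
    """
    return unicodedata.bidirectional(ch) in RTL_CLASSES


# One sample letter per script (escapes keep the source file ASCII-only).
samples = {
    "Latin": "A",
    "Arabic": "\u0628",  # ARABIC LETTER BEH
    "Hebrew": "\u05d0",  # HEBREW LETTER ALEF
    "Thaana": "\u0780",  # THAANA LETTER HAA
}

for script, ch in samples.items():
    print(
        f"{script:8} U+{ord(ch):04X} {unicodedata.name(ch)} "
        f"bidi={unicodedata.bidirectional(ch)} rtl={is_rtl(ch)}"
    )
```

The property values come from whichever version of the Unicode data ships with the Python interpreter in use, so newly encoded characters may be missing on older installations.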

The Unicode Standard associates a rich set of semantics with each encoded character: properties that are required for interoperability and correct behavior in implementations, as well as for Unicode conformance. These semantics are cataloged in the Unicode Character Database, a collection of data files that list the Unicode character code points and character names. The data files define character properties and mappings between Unicode characters (such as case mappings). The Unicode Character Database, being an integral part of the Unicode Standard, contains the normative property and mapping information required to implement Unicode Standard algorithms such as the Bidirectional, Line Breaking, Normalization, Word Boundary Determination, and Case Folding algorithms. The data files also contain additional informative and provisional character property information.
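Many programming languages expose part of this database directly. The following sketch, assuming Python's standard `unicodedata` module, reads a character's name and general category and applies the case mappings, case folding, and canonical decomposition (NFD) derived from the data files. The helpers `cps` and `describe` are hypothetical names introduced for this example, and the module covers only a subset of the properties in the full database.

```python
import unicodedata


def cps(text: str) -> list[str]:
    """Format each character of `text` as a U+XXXX code point label."""
    return [f"U+{ord(c):04X}" for c in text]


def describe(ch: str) -> dict:
    """Collect a few Unicode Character Database properties for one character."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch, "<no name>"),
        "category": unicodedata.category(ch),   # General_Category, e.g. "Lu"
        "upper": cps(ch.upper()),                # full uppercase mapping
        "casefold": cps(ch.casefold()),          # case folding for comparison
        "nfd": cps(unicodedata.normalize("NFD", ch)),  # canonical decomposition
    }


# Version of the Unicode data bundled with this Python interpreter.
print("Unicode data version:", unicodedata.unidata_version)

# "A", LATIN SMALL LETTER E WITH ACUTE, LATIN SMALL LETTER SHARP S
for ch in "A\u00e9\u00df":
    print(describe(ch))
```

Note that a mapping may change the length of the text: the sharp s (U+00DF) uppercases and case-folds to two characters, which is why the results are reported as lists of code points rather than as single values.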

Comments

  1. ᲼
    Nov 20, 2022 23:57 GMT

    ᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼

  2. ᲼᲼᲼᲼
    Oct 23, 2022 12:12 GMT

    ᲼᲼᲼᲼᲼᲼

  3. ᲼
    Oct 11, 2022 17:53 GMT

    ᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼

  4. redacted
    Apr 24, 2022 09:45 GMT

    I was fed binary in my dream and I was told to type it into my phone in my dream, when I woke up it was in my phone and directly translated to ༡ and after investigating on my phone it has now stopped working completely.

    1. wow
      May 10, 2022 23:58 GMT

      bro ngl this scares me

  5. SUPERWINDOWS79
    Feb 15, 2022 18:58 GMT

    U+03A2 GREEK CAPITAL LETTER FINAL SIGMA

  6. ᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼
    Feb 11, 2022 05:53 GMT

    ᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼

  7. ᲼᲼᲼᲼
    Jan 5, 2022 00:20 GMT

    ᲼᲼᲼᲼᲼᲼᲼᲼

    1. Lol
      Jan 31, 2022 00:19 GMT

      yessssssssssssssssssssssssssssssssssss
      hih

  8. ᲼
    Dec 14, 2020 06:39 GMT

    ᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼᲼

  9. EEG
    Nov 5, 2020 12:48 GMT

    ⠀ ⠀ ⠀ ⠀ ⠀ ⠀ ⠀ ⠀ ⠀ ⠀ ⠀ ⠀ ⠀ ⠀ ⠀ ⠀ ⠀ ⠀ ⠀ ⠀EEG

  10. grp
    Sep 17, 2020 17:19 GMT

    supposedly it is "hoax" for wikidiots, this REAL yot letter...
