The Unicode Standard Character Properties
A character property is a named attribute of an entity in the Unicode Standard, associated with a defined set of values. The Standard specifies many different types of character properties. The interpretation of some (such as the case of a character) is independent of context, whereas the interpretation of others (such as directionality) applies to a character sequence as a whole rather than to the individual characters that compose it. Properties also differ in what they describe: a code point property refers to the inherent attributes of code points, irrespective of any particular encoded character, whereas an abstract character property refers to attributes of abstract characters per se, based on their independent existence as elements of writing systems or other notational systems, irrespective of their encoding in the Unicode Standard.
For each encoded character property, there is a mapping from every code point to some value in the set of values associated with that property. Encoded character properties are defined this way to facilitate the implementation of character property APIs based on the Unicode Character Database. Typically, an API takes a property and a code point as input and returns a value for that property as output, interpreting it as the “character property” for the “character” encoded at that code point. In some cases, an encoded character property is exactly equivalent to a code point property. In others, it reflects an abstract character property but extends the scope of the property to all code points, including unassigned ones. In many instances, however, it is semantically complex and may telescope together values associated with a number of abstract character properties and/or code point properties.
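The lookup pattern described above (a property and a code point in, a value out) can be sketched with Python's standard unicodedata module. This is only an illustration, not an API defined by the Unicode Standard: the helper name get_property and the dictionary PROPERTY_LOOKUPS are hypothetical, and the values returned depend on the version of the Unicode Character Database bundled with the Python interpreter.

```python
import unicodedata

# Hypothetical table (not part of any standard API): maps a property name
# to the unicodedata function that returns its value for a character.
PROPERTY_LOOKUPS = {
    "General_Category": unicodedata.category,
    "Bidi_Class": unicodedata.bidirectional,
    "Canonical_Combining_Class": unicodedata.combining,
}

def get_property(prop: str, code_point: int):
    """Return the value of property `prop` for the given code point."""
    return PROPERTY_LOOKUPS[prop](chr(code_point))

# U+0041 LATIN CAPITAL LETTER A is an uppercase letter.
print(get_property("General_Category", 0x0041))           # Lu
# U+05D0 HEBREW LETTER ALEF has right-to-left directionality.
print(get_property("Bidi_Class", 0x05D0))                 # R
# U+0301 COMBINING ACUTE ACCENT has combining class 230.
print(get_property("Canonical_Combining_Class", 0x0301))  # 230
# U+0378 is unassigned, yet the property still yields a value.
print(get_property("General_Category", 0x0378))           # Cn
```

The last call shows a property whose scope covers all code points: an unassigned code point still maps to a value, here the General_Category value Cn.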
List of Unicode character properties
In the Unicode Standard, the terms “Unicode character property,” “character property,” and “property,” when used without a qualifier, refer to an encoded character property unless otherwise indicated. The table below lists the encoded character properties formally considered part of the latest version of the Unicode Standard. For each property whose type is indicated as “Enum,” the list of associated values can be found on its linked page.
NOTE: Numeric properties (whose values are numbers that can take on any integer or real value) and string-valued properties (whose values are strings) are indicated in the table above as “Scalar” types. All other official property value types designated in the Unicode Standard, including enumerated, closed enumeration, boolean, and catalog properties, are marked “Enum” (enumerated) to make it easier to browse their constituent characters.
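To make the distinction between value types concrete, the following minimal sketch queries one property of each kind with Python's standard unicodedata module. The function name sample_values is hypothetical, and the pairing of each call with a property name is an illustrative assumption rather than something taken from the table.

```python
import unicodedata

def sample_values(ch: str) -> dict:
    """Collect one example value per kind of property for character `ch`."""
    return {
        # Numeric value: a number (None if the character has none).
        "Numeric_Value": unicodedata.numeric(ch, None),
        # String value: the character name (None if unnamed).
        "Name": unicodedata.name(ch, None),
        # Enumerated value: one of a fixed set of short codes.
        "General_Category": unicodedata.category(ch),
        # Boolean value: Python reports it as 1 or 0, converted here.
        "Bidi_Mirrored": bool(unicodedata.mirrored(ch)),
    }

# U+00BD VULGAR FRACTION ONE HALF has the numeric value 0.5.
print(sample_values("\u00BD"))
# A parenthesis is mirrored in right-to-left text.
print(sample_values("("))
```

For U+00BD the numeric value is 0.5 and the general category is No (other number), while for the left parenthesis the boolean property is true and there is no numeric value.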