Reserved general punctuation
Kenneth Whistler
kenw at sybase.com
Fri May 2 19:47:18 CEST 2008
Patrik asked:
> Well, I want all unassigned codepoints. I have not understood
> noncharacters is the same thing as unassigned...
>
> Is it?
No, they aren't the same.
The best summary of the relationship between various
major types of code points, including these, can be
found in Table 2-3 "Types of Code Points" on p. 27
of the Unicode 5.0 text:
http://www.unicode.org/versions/Unicode5.0.0/ch02.pdf
If you scan down that table and consider its categories in terms
of the IDNA table requirements, the mapping is:
Reserved code points --> UNASSIGNED
Control, Private-use, Surrogate, and Noncharacter --> DISALLOWED
Format --> DISALLOWED (*except* for the two specified to be CONTEXTJ)
Everything else is in the Graphic character class,
and that is where we have to make distinctions between the
letters, digits, and marks that are mostly PVALID and
the symbols and punctuation that are mostly DISALLOWED.
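
As a rough illustration of the mapping above, here is a minimal Python
sketch. It is only an approximation under stated assumptions, not the
actual IDNA table derivation: the function names and the "GRAPHIC"
placeholder are hypothetical, the Format type is taken to be General
Category Cf, Zl, and Zp, the two CONTEXTJ code points are taken to be
U+200C and U+200D, and the results depend on the Unicode version
bundled with the Python build rather than on Unicode 5.0 specifically.
Note that the General Category alone cannot separate noncharacters
from reserved code points (both are Cn), so noncharacters are tested
by range first.

```python
import unicodedata

# Assumption: the two Format characters treated as CONTEXTJ are
# ZERO WIDTH NON-JOINER and ZERO WIDTH JOINER.
CONTEXTJ = {0x200C, 0x200D}


def is_noncharacter(cp):
    """True for the 66 noncharacters: U+FDD0..U+FDEF plus the last
    two code points (xxFFFE, xxFFFF) of each of the 17 planes."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def coarse_idna_class(cp):
    """Coarse, hypothetical classification of a code point by type."""
    # Noncharacters carry General Category Cn, the same as reserved
    # code points, so they must be checked before looking at Cn.
    if is_noncharacter(cp):
        return "DISALLOWED"
    gc = unicodedata.category(chr(cp))
    if gc == "Cn":                     # reserved (unassigned)
        return "UNASSIGNED"
    if gc in ("Cc", "Co", "Cs"):       # control, private-use, surrogate
        return "DISALLOWED"
    if gc in ("Cf", "Zl", "Zp"):       # format
        return "CONTEXTJ" if cp in CONTEXTJ else "DISALLOWED"
    # Graphic characters need the finer per-category rules
    # (mostly PVALID letters/digits/marks, mostly DISALLOWED
    # symbols/punctuation); this sketch does not attempt them.
    return "GRAPHIC"
```

For example, coarse_idna_class(0xFFFE) returns "DISALLOWED" because
U+FFFE is a noncharacter, whereas a reserved code point such as U+0378
(reserved in current Unicode versions, as far as I know) comes back as
"UNASSIGNED". That is the distinction Patrik was asking about.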
--Ken