Reserved general punctuation

Kenneth Whistler kenw at sybase.com
Fri May 2 19:47:18 CEST 2008


Patrik asked:

> Well, I want all unassigned codepoints. I had not understood that
> noncharacters are the same thing as unassigned...
> 
> Is it?

No, they aren't the same.

The best summary of the relationship between various
major types of code points, including these, can be
found in Table 2-3 "Types of Code Points" on p. 27
of the Unicode 5.0 text:

http://www.unicode.org/versions/Unicode5.0.0/ch02.pdf

If you scan down that table and consider the categories in terms
of the IDNA table requirements, they map as follows:

Reserved code points --> UNASSIGNED

Control, Private-use, Surrogate, and Noncharacter --> DISALLOWED

Format --> DISALLOWED (*except* for the two specified to be CONTEXTJ)

Everything else is in the Graphic character class, and that
is where we have to make distinctions between the letters,
digits, and marks that are mostly PVALID and the symbols
and punctuation that are mostly DISALLOWED.
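
As a rough illustration of the mapping above, here is a minimal Python
sketch. It is not the IDNA derivation itself, only the coarse first cut
by code point type. Some assumptions: the function names are made up
for this example; "GRAPHIC" is a placeholder label meaning "needs the
finer PVALID/DISALLOWED distinctions", not an IDNA value; the two
CONTEXTJ format characters are taken to be U+200C and U+200D; and the
results reflect whatever Unicode version your Python's unicodedata
module ships with, not necessarily Unicode 5.0. Note that noncharacters
and reserved code points both have General_Category Cn, so they have to
be told apart by code point value.

```python
import unicodedata

# Assumed here: the two format characters given CONTEXTJ status are
# ZERO WIDTH NON-JOINER (U+200C) and ZERO WIDTH JOINER (U+200D).
CONTEXTJ = {0x200C, 0x200D}


def is_noncharacter(cp):
    """True for the 66 noncharacters: U+FDD0..U+FDEF and U+nFFFE/U+nFFFF."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def coarse_class(cp):
    """Coarse IDNA bucket for a code point, by code point type only."""
    # Noncharacters share General_Category Cn with reserved code points,
    # so they must be checked first.
    if is_noncharacter(cp):
        return "DISALLOWED"
    gc = unicodedata.category(chr(cp))
    if gc == "Cn":                    # reserved (unassigned)
        return "UNASSIGNED"
    if gc in ("Cc", "Co", "Cs"):      # control, private-use, surrogate
        return "DISALLOWED"
    if gc in ("Cf", "Zl", "Zp"):      # format
        return "CONTEXTJ" if cp in CONTEXTJ else "DISALLOWED"
    return "GRAPHIC"                  # placeholder: needs finer rules
```

For example, coarse_class(0xFFFF) gives DISALLOWED (a noncharacter),
while a reserved code point such as U+0378 gives UNASSIGNED, even
though unicodedata reports Cn for both.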

--Ken


