Kenneth Whistler kenw at
Fri May 2 19:47:18 CEST 2008

Patrik asked:

> Well, I want all unassigned codepoints. I have not understood  
> noncharacters is the same thing as unassigned...
> Is it?

No, they aren't the same.

The best summary of the relationship between the various
major types of code points, including these, can be
found in Table 2-3 "Types of Code Points" on p. 27
of the Unicode 5.0 text.

If you scan down that table and consider the categories in terms
of the IDNA table requirements:

Reserved code points --> UNASSIGNED

Control, Private-use, Surrogate, and Noncharacter --> DISALLOWED

Format --> DISALLOWED (*except* for the two specified to be CONTEXTJ)

And then everything else is in the Graphic character class,
and that is where we have to make distinctions among the
letters, digits, and marks that mostly are PVALID and
the symbols and punctuation that are mostly DISALLOWED.
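
As a rough illustration of the mapping above, here is a minimal Python
sketch, not anything taken from the IDNA drafts themselves. The function
names (is_noncharacter, coarse_idna_class) are hypothetical. It assumes
the two CONTEXTJ format characters are U+200C ZERO WIDTH NON-JOINER and
U+200D ZERO WIDTH JOINER, and it relies on whatever Unicode version ships
with the Python interpreter rather than Unicode 5.0 specifically. Graphic
characters are only labelled "GRAPHIC" here, since the further split into
PVALID and DISALLOWED needs more rules than this sketch covers.

```python
import unicodedata

# Assumption: the two format characters given CONTEXTJ status are
# ZERO WIDTH NON-JOINER and ZERO WIDTH JOINER.
CONTEXTJ = {0x200C, 0x200D}


def is_noncharacter(cp):
    """True for U+FDD0..U+FDEF and the last two code points of each plane."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def coarse_idna_class(cp):
    """Map a code point to a coarse IDNA class by its code point type."""
    # Noncharacters also report General_Category Cn, so test them first
    # to keep them apart from reserved (unassigned) code points.
    if is_noncharacter(cp):
        return "DISALLOWED"
    cat = unicodedata.category(chr(cp))
    if cat == "Cn":                      # reserved code points
        return "UNASSIGNED"
    if cat in ("Cc", "Co", "Cs"):        # control, private-use, surrogate
        return "DISALLOWED"
    if cat == "Cf":                      # format
        return "CONTEXTJ" if cp in CONTEXTJ else "DISALLOWED"
    # Everything else is a graphic character and needs finer rules.
    return "GRAPHIC"
```

The noncharacter test comes first because noncharacters and reserved code
points share the same General_Category value (Cn), which is exactly why
the two are easy to confuse.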


