Reserved general punctuation

Kenneth Whistler kenw at
Thu May 1 20:41:09 CEST 2008

Patrik asked:
> > In Unicode, what we've been referring to as "unassigned" (more
> > precisely gc=Cn) means that a code point (from 0 to 10FFFF) is not
> > assigned **to a character**.
> In what file of the Unicode distribution can I find every codepoint
> that has gc=Cn?

Right at the top of that file, in fact.

Also, if you examine the listing carefully, you will see that
while most of the gc=Cn code points are "reserved", all
of the noncharacters are also in the list. For example:

FFEF..FFF8  ; Cn #  [10] <reserved-FFEF>..<reserved-FFF8>


FFFE..FFFF  ; Cn #   [2] <noncharacter-FFFE>..<noncharacter-FFFF>

The place to get the *concise* listing of all the noncharacters

and search down for "Noncharacter_Code_Point".
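
As a rough cross-check of the point that noncharacters are included
in gc=Cn, here is a minimal Python sketch. It assumes Python's bundled
unicodedata module as a stand-in for the data files, so it reflects
whatever Unicode version that Python build ships with, not necessarily
the version discussed here. The helper names are illustrative only,
and the noncharacter set is the usual one: FDD0..FDEF plus the last
two code points of each of the 17 planes.

```python
import unicodedata


def noncharacters():
    """Yield every noncharacter code point (66 in total)."""
    # The contiguous block of 32 noncharacters in the BMP.
    yield from range(0xFDD0, 0xFDF0)
    # The last two code points of each plane, 0 through 16.
    for plane in range(0x11):
        base = plane << 16
        yield base | 0xFFFE
        yield base | 0xFFFF


def general_category(cp):
    """Return the gc value Python reports for code point cp."""
    return unicodedata.category(chr(cp))


nonchars = list(noncharacters())
# True if every noncharacter is reported as gc=Cn.
all_cn = all(general_category(cp) == "Cn" for cp in nonchars)
```
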

> Is that the same as the codepoints that are missing from  
> UnicodeData.txt? (I know about the "first", "last" issues...)

Correct. No gc=Cn code points are listed in UnicodeData.txt.
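
To make that relationship concrete, here is a hedged Python sketch
that derives the unlisted code points from lines in UnicodeData.txt
format, expanding the "First"/"Last" range pairs mentioned above. The
function names are hypothetical, and the SAMPLE lines are a toy input
made up for illustration (0042 is deliberately left out and the range
is truncated); they are not the real file contents.

```python
def listed_code_points(lines):
    """Return the set of code points covered by UnicodeData.txt-style lines."""
    listed = set()
    range_start = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        fields = line.split(";")
        cp = int(fields[0], 16)
        name = fields[1]
        if name.endswith(", First>"):
            # Start of a range; the matching ", Last>" line follows.
            range_start = cp
        elif name.endswith(", Last>"):
            if range_start is None:
                raise ValueError("Last without First at %04X" % cp)
            listed.update(range(range_start, cp + 1))
            range_start = None
        else:
            listed.add(cp)
    return listed


def unlisted_code_points(lines, start=0, stop=0x110000):
    """Return code points in [start, stop) that the lines do not cover."""
    listed = listed_code_points(lines)
    return [cp for cp in range(start, stop) if cp not in listed]


# Toy input for illustration only, NOT real UnicodeData.txt contents.
SAMPLE = [
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
    "0043;LATIN CAPITAL LETTER C;Lu;0;L;;;;;N;;;;0063;",
    "3400;<CJK Ideograph Extension A, First>;Lo;0;L;;;;;N;;;;;",
    "3403;<CJK Ideograph Extension A, Last>;Lo;0;L;;;;;N;;;;;",
]
```

Run against the real file, the result of unlisted_code_points should
correspond to the gc=Cn set, since surrogates and private-use code
points are themselves listed there as First/Last ranges.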

