Reserved general punctuation

Patrik Fältström patrik at
Sat May 3 04:40:50 CEST 2008

On 1 May 2008, at 20.41, Kenneth Whistler wrote:

> Also if you examine the listing carefully, you will see that
> while most of the gc=Cn characters are "reserved", all
> of the noncharacters are also among the list.

Yes, I saw that, and unfortunately that makes this list in
DerivedGeneralCategory.txt, from my point of view, not correct. If the
table lists "unassigned" (the header says
"General_Category=Unassigned"), then those codepoints (for example
U+FFFE) should not be there.

You pointed at Table 2-3 "Types of Code Points" on p. 27 of the
Unicode 5.0 text (
ch02.pdf), which clearly shows that U+FFFE is not an unassigned
codepoint, but is gc=Cn.
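
As a rough illustration of that distinction (a sketch only, not taken
from the tables document): the helper names is_noncharacter and
classify_cn below are made up for this example, and the answers depend
on the Unicode version shipped with the Python interpreter. The idea is
that gc=Cn covers both the noncharacters and the reserved codepoints, so
the two have to be separated by range.

```python
import unicodedata


def is_noncharacter(cp: int) -> bool:
    """Return True for the 66 noncharacters.

    These are U+FDD0..U+FDEF plus the last two codepoints of each of
    the 17 planes (U+FFFE, U+FFFF, U+1FFFE, U+1FFFF, ...).
    """
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def classify_cn(cp: int) -> str:
    """Split gc=Cn into 'noncharacter' and 'reserved' (hypothetical labels)."""
    if unicodedata.category(chr(cp)) != "Cn":
        return "not Cn"
    return "noncharacter" if is_noncharacter(cp) else "reserved"


if __name__ == "__main__":
    for cp in (0x0041, 0xFDD0, 0xFFFE, 0x50000):
        print(f"U+{cp:04X}: {classify_cn(cp)}")
```

With this split, U+FFFE is reported as a noncharacter even though its
General_Category is Cn, which is the case discussed above.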

So, the table in DerivedGeneralCategory.txt shows gc=Cn, which according
to the types of codepoints table is a larger set of codepoints than

Now I think I have this under control :-)

Thanks to you Ken! Thanks!

Expect a revised version of the tables document.

