Reserved general punctuation
patrik at frobbit.se
Sat May 3 04:40:50 CEST 2008
On 1 May 2008, at 20:41, Kenneth Whistler wrote:
> Also if you examine the listing carefully, you will see that
> while most of the gc=Cn characters are "reserved", all
> of the noncharacters are also among the list.
Yes, I saw that, and unfortunately that makes this list in
DerivedGeneralCategory.txt, from my point of view, incorrect. If the
table lists "unassigned" codepoints (the header says
"General_Category=Unassigned"), then those codepoints (for example
U+FFFE) should not be there.
You pointed at Table 2-3 "Types of Code Points" on p. 27 of the
Unicode 5.0 text
(http://www.unicode.org/versions/Unicode5.0.0/ch02.pdf),
which clearly shows that U+FFFE is not an unassigned
codepoint, but is gc=Cn.
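
A minimal sketch of that distinction, assuming Python's standard
unicodedata module (its data follows whatever Unicode version the
interpreter ships with, not necessarily 5.0). The helper names
is_noncharacter and classify_cn are hypothetical and only illustrate
how gc=Cn splits into noncharacters and reserved codepoints:

```python
import unicodedata


def is_noncharacter(cp: int) -> bool:
    """True for the 66 noncharacters: U+FDD0..U+FDEF, plus every
    codepoint whose low 16 bits are FFFE or FFFF (all 17 planes)."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def classify_cn(cp: int) -> str:
    """Split gc=Cn into the two types of codepoints it covers.

    Hypothetical helper: returns "noncharacter", "reserved", or
    "not Cn" for codepoints with any other General_Category.
    """
    if unicodedata.category(chr(cp)) != "Cn":
        return "not Cn"
    return "noncharacter" if is_noncharacter(cp) else "reserved"


if __name__ == "__main__":
    # U+FFFE is reported as gc=Cn, yet it is a noncharacter,
    # not a reserved (unassigned) codepoint.
    print(unicodedata.category("\ufffe"), classify_cn(0xFFFE))
    print(classify_cn(0x0041))
```

Filtering a gc=Cn listing with a check like is_noncharacter would
leave only the reserved codepoints.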
So, the table in DerivedGeneralCategory.txt shows gc=Cn, which according
to the types of codepoints table is a larger set of codepoints than
Now I think I have this under control :-)
Thanks to you Ken! Thanks!
Expect a revised version of the tables document.