How to know what codepoints are unassigned
Frank Ellermann
hmdmhdfmhdjmzdtjmzdtzktdkztdjz at gmail.com
Sun May 4 02:35:39 CEST 2008
John C Klensin wrote:
> Non-character code points that have specific
> non-characters assigned to them are DISALLOWED
> (unless they are exceptions), but by other rules.
I'm not sure what you are up to, but if it's about
the 2048 surrogates and the 66 non-characters, you
can simply hardwire them. AFAIK they are not going
to change: no additions, no subtractions, forever.
RFC 3987 covers these concepts in <ucschar>. The
non-characters consist of two per plane, ??FFFE and
??FFFF, plus the 32 from U+FDD0 to U+FDEF, for a
total of 66 = 2 * 17 + 32.
(The 32 are nice to swap out C0 or C1 temporarily.)
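
To make the hardwiring concrete, here is a minimal sketch in Python.
The function names is_surrogate and is_noncharacter are only
illustrative, they are not taken from any RFC or library. It assumes
the usual code space of 17 planes, U+0000 to U+10FFFF, and counts
both sets by brute force to check the arithmetic above.

```python
MAX_CODE_POINT = 0x10FFFF  # last code point of plane 16 (assumed code space)


def is_surrogate(cp: int) -> bool:
    """True for the 2048 surrogate code points U+D800..U+DFFF."""
    return 0xD800 <= cp <= 0xDFFF


def is_noncharacter(cp: int) -> bool:
    """True for the 66 non-characters described above."""
    if not 0 <= cp <= MAX_CODE_POINT:
        return False
    # The contiguous block of 32: U+FDD0..U+FDEF.
    if 0xFDD0 <= cp <= 0xFDEF:
        return True
    # The last two code points of every plane: ??FFFE and ??FFFF.
    return (cp & 0xFFFE) == 0xFFFE


# Brute-force count over the whole code space.
surrogate_count = sum(is_surrogate(cp) for cp in range(MAX_CODE_POINT + 1))
noncharacter_count = sum(is_noncharacter(cp) for cp in range(MAX_CODE_POINT + 1))
print(surrogate_count, noncharacter_count)
```

Because both sets are fixed, two range tests and one mask are all
that is needed; no table lookup or Unicode version check is involved.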
Frank