How to know what codepoints are unassigned

Frank Ellermann hmdmhdfmhdjmzdtjmzdtzktdkztdjz at gmail.com
Sun May 4 02:35:39 CEST 2008


John C Klensin wrote:
 
> Non-character code points that have specific
> non-characters assigned to them are DISALLOWED
> (unless they are exceptions), but by other rules.

I'm not sure what you are up to, but if it's about
the 2048 surrogates and the 66 non-characters, you
can simply hardwire them.  AFAIK they are not going
to change: no additions, no subtractions, forever.

RFC 3987 covers these concepts in <ucschar>.  The
non-characters consist of two (??FFFE + ??FFFF) per
plane and the 32 U+FDD0 .. U+FDEF, for 66 = 2 * 17 + 32.
(The 32 are handy as temporary stand-ins when you
need to swap out C0 or C1.)
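
To illustrate the "hardwire them" idea, here is a minimal
Python sketch.  The function names are my own invention, not
taken from any RFC or library; it only assumes the usual code
space U+0000 .. U+10FFFF (17 planes) and the surrogate range
U+D800 .. U+DFFF.

```python
MAX_CODEPOINT = 0x10FFFF  # last code point of plane 16


def is_surrogate(cp: int) -> bool:
    """True for the 2048 surrogate code points U+D800 .. U+DFFF."""
    return 0xD800 <= cp <= 0xDFFF


def is_noncharacter(cp: int) -> bool:
    """True for the 66 non-characters.

    These are U+FDD0 .. U+FDEF (32 code points) plus the last
    two code points of every plane, ??FFFE and ??FFFF (2 * 17).
    """
    if not 0 <= cp <= MAX_CODEPOINT:
        return False
    if 0xFDD0 <= cp <= 0xFDEF:
        return True
    # The low 16 bits are FFFE or FFFF, whatever the plane is.
    return (cp & 0xFFFE) == 0xFFFE


# Sanity check of the counts by brute force over the code space.
surrogate_count = sum(is_surrogate(cp) for cp in range(MAX_CODEPOINT + 1))
nonchar_count = sum(is_noncharacter(cp) for cp in range(MAX_CODEPOINT + 1))
```

With these two checks hardwired, the counts come out as 2048
and 66 = 2 * 17 + 32, with no lookup in the Unicode tables.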

 Frank


