Comments on the Unicode Codepoints and IDNA Internet-Draft

Kenneth Whistler kenw at sybase.com
Mon Jul 28 21:13:51 CEST 2008


Patrik,

> Thank you very much for this. I will have a look at this and come back  
> with what I think the suggested changes to the -tables document are.

I've read through the comments from NIDA, and I concur
with Mark's assessments on this.

1. Hangul Jamo

The NIDA document requests that the conjoining jamos,
U+1100..U+1159, etc., be changed from PVALID to DISALLOWED
in the table, and notes that "we do not allow Old Hangul
letters for Korean IDN."

This kind of restriction should be a matter of registry
policy, not something built into the protocol
and the table it uses.

In addition, making the conjoining jamos DISALLOWED in
the table could lead to unexpected behavior for normalization
of Korean data in the context of an IDN protocol implementation.

What Korea is attempting here is to restrict the allowed
repertoire of characters for registration to U+AC00..U+D7A3,
which is fine. But such characters might also be
represented on the wire or in other data sources in
terms of sequences of conjoining jamos. The requirement
in the protocol to normalize to NFKC will result in
the correct Korean Hangul syllables in the range
U+AC00..U+D7A3 being passed to the registry for lookup,
but restricting the conjoining jamos in the IDNA protocol
table itself is not a good idea.
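
As a rough illustration of the normalization point, here is a minimal
sketch using Python's standard unicodedata module. It is not taken from
the IDNA drafts; the helper name normalize_label and the sample string
are my own assumptions, chosen only to show a conjoining jamo sequence
composing into a single syllable in U+AC00..U+D7A3.

```python
import unicodedata

# Precomposed modern Hangul syllables, the range cited in the discussion.
HANGUL_SYLLABLES = range(0xAC00, 0xD7A3 + 1)


def normalize_label(label: str) -> str:
    """Hypothetical helper: apply NFKC, as the protocol is described as requiring."""
    return unicodedata.normalize("NFKC", label)


# U+1112 (choseong hieuh) + U+1161 (jungseong a) + U+11AB (jongseong nieun)
jamo_sequence = "\u1112\u1161\u11ab"
composed = normalize_label(jamo_sequence)

# The three conjoining jamos compose to the single syllable U+D55C.
assert composed == "\ud55c"
assert all(ord(ch) in HANGUL_SYLLABLES for ch in composed)
```

If the conjoining jamos were DISALLOWED in the table, input like
jamo_sequence above would depend on whether the table check happens
before or after normalization, which is the kind of surprise being
warned about.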

2. Bangjeom

The NIDA document requests changing U+302E..U+302F from PVALID to
DISALLOWED. Again, there is no need to do so. This is
simply a matter of the Korean registry disallowing
these characters for registration, rather than needing
to change the IDNA table to special-case two combining
marks not needed for modern Hangul.
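
To make the registry-policy alternative concrete, here is a hedged
sketch of what such a check might look like on the registry side. The
function name registry_allows and the rule "every code point must be
in U+AC00..U+D7A3 after NFKC" are assumptions for illustration only;
an actual registry policy would be defined by the registry, not by
this message.

```python
import unicodedata

# Assumed policy: only precomposed modern Hangul syllables may be registered.
ALLOWED = range(0xAC00, 0xD7A3 + 1)


def registry_allows(label: str) -> bool:
    """Hypothetical registry-side check, applied after NFKC normalization."""
    normalized = unicodedata.normalize("NFKC", label)
    return bool(normalized) and all(ord(ch) in ALLOWED for ch in normalized)


# Precomposed syllables are accepted.
assert registry_allows("\ud55c\uae00")
# A modern conjoining jamo sequence is accepted, since NFKC composes it.
assert registry_allows("\u1112\u1161\u11ab")
# Bangjeom (U+302E) is rejected by policy, with no change to the IDNA table.
assert not registry_allows("\ud55c\u302e")
# An Old Hangul initial (U+1140) does not compose, so it is rejected too.
assert not registry_allows("\u1140\u1161")
```

The point of the sketch is only that the restriction lives in one
registry's acceptance rule, while the protocol table stays unchanged
for everyone else.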

Sections 3 through 6 are no-ops, because for those
the NIDA recommendations already match the algorithm
and table in the -tables document.

--Ken


