Comments on the Unicode Codepoints and IDNA Internet-Draft

Martin Duerst duerst at
Tue Jul 29 08:14:31 CEST 2008

At 04:13 08/07/29, Kenneth Whistler wrote:

>In addition, making the conjoining jamos DISALLOWED in
>the table could lead to unexpected behavior for normalization
>of Korean data in the context of an IDN protocol implementation.
>What Korea is attempting here is to restrict the allowed
>repertoire of characters for registration to U+AC00..U+D7A3,
>which is fine. But such characters might also be
>represented on the wire or in other data sources in
>terms of sequences of conjoining jamos. The requirement
>in the protocol to normalize to NFKC will result in
>the correct Korean Hangul syllables in the range
>U+AC00..U+D7A3 being passed to the registry for lookup,
>but restricting the conjoining jamos in the IDNA protocol
>table itself is not a good idea.

I agree with Mark and Ken in general, but I think the above two
paragraphs are wrong. In IDNA 2008, the only thing we ever
talk about is normalized data. Ideally, things would be
normalized at the source. If they are not, there is (unlike in
IDNA 2003) no guarantee that, or how, they will be normalized.
Non-normalized character sequences are totally outside the protocol
in IDNA 2008 (at least as currently proposed).
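
To make the distinction concrete, here is a minimal sketch, assuming
Python's standard unicodedata module and the NFKC form named in the
quoted text. The helper names in_registration_range and is_normalized
are hypothetical and only for illustration; they are not part of any
IDNA draft or implementation. The sketch shows a conjoining-jamo
sequence composing to a single syllable in U+AC00..U+D7A3, and a check
for whether a label is already in normalized form.

```python
import unicodedata

# Conjoining jamos: U+1112 (choseong hieuh), U+1161 (jungseong a),
# U+11AB (jongseong nieun). Chosen here purely as an example.
jamo_sequence = "\u1112\u1161\u11ab"

# NFKC composes the jamo sequence into one precomposed Hangul syllable.
composed = unicodedata.normalize("NFKC", jamo_sequence)


def in_registration_range(label: str) -> bool:
    """Hypothetical check: every character lies in U+AC00..U+D7A3."""
    return all(0xAC00 <= ord(ch) <= 0xD7A3 for ch in label)


def is_normalized(label: str, form: str = "NFKC") -> bool:
    """Hypothetical check: normalizing the label leaves it unchanged."""
    return unicodedata.normalize(form, label) == label


print(f"U+{ord(composed):04X}")               # code point of the composed syllable
print(in_registration_range(jamo_sequence))   # raw jamos fall outside the range
print(in_registration_range(composed))        # composed syllable falls inside it
print(is_normalized(jamo_sequence))           # the jamo sequence is not normalized
```

Under the reading in the reply above, only a label for which
is_normalized returns True would be in scope for the protocol; who
performs the normalization beforehand, if anyone, is left open.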

Regards,    Martin.

#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#       mailto:duerst at     
