[Almost OT] Re: Hangul jamo issues - are jamo sequences legitimate?

Michel Suignard michelsu at windows.microsoft.com
Wed Jan 10 19:22:07 CET 2007


> Moreover, U+31xx compat jamo letters are the only input method
> for jamo chars under NFC and KSC5601. We have no direct input
> method for U+11xx, which is not in the KSC5601->UNICODE table.
>
> So, U+31xx and U+11xx both should be allowed in labels.
>
You could possibly argue for accepting them as input, but NFKC
normalizes them into Hangul syllables (except for rare Old Hangul
syllables, which can only be represented by jamo or by a mix of jamo and
modern Hangul syllables), so I don't see the point in allowing them in
labels. NFKC filters them all out, with the notable exception of Old
Hangul, which IMO should not belong in the IDN name space.
Based on this, I don't even think they belong in the input set, because
of the confusion they would cause. The only difference between the input
set and the output set (if any) should be the uppercase forms for
bicameral scripts.
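
The NFKC behavior described above can be checked with a minimal sketch,
assuming Python's standard unicodedata module. The sample strings and
the helper names (nfkc, codepoints) are chosen here for illustration and
do not come from the thread:

```python
import unicodedata


def nfkc(text: str) -> str:
    """Return the NFKC normalization of text."""
    return unicodedata.normalize("NFKC", text)


def codepoints(text: str) -> str:
    """Render a string as space-separated U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# Illustrative samples (assumptions, not taken from the thread):
# compatibility jamo KIYEOK + A (U+31xx block)
compat = "\u3131\u314F"
# conjoining jamo: leading KIYEOK + vowel A + trailing NIEUN (U+11xx block)
conjoining = "\u1100\u1161\u11AB"
# Old Hangul: leading KIYEOK + ARAEA, which has no precomposed syllable
old_hangul = "\u1100\u119E"

for name, sample in (
    ("compat jamo", compat),
    ("conjoining jamo", conjoining),
    ("old hangul", old_hangul),
):
    print(f"{name}: {codepoints(sample)} -> {codepoints(nfkc(sample))}")
```

In this sketch the first two samples come out as single precomposed
syllables (U+AC00 and U+AC04), while the Old Hangul sample is left as a
jamo sequence, which is the exception noted above.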

Michel

