[Almost OT] Re: Hangul jamo issues - are jamo sequences legitimate?

Soobok Lee lsb at lsb.org
Thu Jan 11 01:13:16 CET 2007


On Wed, Jan 10, 2007 at 10:22:07AM -0800, Michel Suignard wrote:
> > Moreover, U+31xx compat jamo letters are the only input method
> > for jamo chars under NFC and KSC5601. We have no direct input
> > method for U+11xx, which is not in the KSC5601->UNICODE table.
> >
> > So, U+31xx and U+11xx both should be allowed in labels.
> >
> You could possibly argue for having them as input, but because they get
> normalized into Hangul syllables by NFKC (except for rare old hangul
> syllables which can only be represented by Jamo or a mix of jamo and
> modern hangul syllables), I don't see the point in allowing them in
> labels. They all get filtered out by NFKC (with the notable exception of
> Old Hangul which should not belong imo in the IDN name space).
> Based on this I don't even think they belong to the input set, because
> of the confusion. The only difference between the input set and the
> output set (if any) should be the uppercase forms for bicameral scripts.

Under NFKC, you are right. But IDNAbis may adopt NFC instead of NFKC,
because NFKC changes glyphs, as in the case of circled A -> A.
NFC preserves the glyphs (display) of the input characters.
All my previous arguments assume the adoption of NFC in IDNAbis.
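
To make the NFC/NFKC difference concrete, here is a small sketch using
Python's standard unicodedata module. The helper name nfc_and_nfkc and the
sample strings are my own illustration, not part of any IDNAbis draft. It
shows that U+31xx compatibility jamo survive NFC unchanged but are folded
into a Hangul syllable by NFKC, while U+11xx conjoining jamo compose into
a syllable under both forms.

```python
import unicodedata


def nfc_and_nfkc(text):
    """Return (NFC form, NFKC form) of text. Hypothetical helper for illustration."""
    return (
        unicodedata.normalize("NFC", text),
        unicodedata.normalize("NFKC", text),
    )


def code_points(text):
    """Render a string as a list of U+XXXX code points for easy comparison."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


samples = {
    "circled A": "\u24B6",                  # CIRCLED LATIN CAPITAL LETTER A
    "compat jamo (U+31xx)": "\u3131\u314F",  # HANGUL LETTER KIYEOK + HANGUL LETTER A
    "conjoining jamo (U+11xx)": "\u1100\u1161",  # CHOSEONG KIYEOK + JUNGSEONG A
}

for label, text in samples.items():
    nfc, nfkc = nfc_and_nfkc(text)
    print(f"{label}: input={code_points(text)} "
          f"NFC={code_points(nfc)} NFKC={code_points(nfkc)}")
```

With the Unicode data shipped in current Python releases, the circled A
and the U+31xx pair come back unchanged from NFC, whereas NFKC maps them
to U+0041 and U+AC00 respectively. The U+11xx pair becomes U+AC00 under
both forms.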

Soobok

