Jamo [RE: Consensus Call Tranche 8 (Character Adjustments)]

Kent Karlsson kent.karlsson14 at comhem.se
Fri Oct 17 09:45:15 CEST 2008


Michel SUIGNARD wrote:
> I would like to know where in ISO/IEC 10646 the type of 
> sequence described in 3 is ‘allowed’ to represent such Hangul 
> syllables. Because to the best of my knowledge it is not.

10646 is rather silent on that matter. But see the Unicode
standard. In version 5.0 this is discussed in section 3.12,
"Conjoining jamo behaviour". The key sentence there states:

Unicode> Standard Korean syllable block: A sequence of one or more L
Unicode> followed by a sequence of one or more V and a sequence of zero
Unicode> or more T, or any other sequence that is canonically equivalent.
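
As a rough illustration of that definition, here is a minimal Python
sketch that tests whether a string is a standard Korean syllable block.
The function name and the jamo ranges are assumptions made for this
example: the ranges cover only the main Hangul Jamo block (fillers
included) and ignore the extended jamo blocks. Decomposing with NFD
first is one way to honour the "canonically equivalent" clause.

```python
import re
import unicodedata

# Assumed ranges: main Hangul Jamo block only, extended blocks omitted.
L = "\u1100-\u115F"  # leading consonants, including the choseong filler
V = "\u1160-\u11A7"  # vowels, including the jungseong filler
T = "\u11A8-\u11FF"  # trailing consonants

# One or more L, then one or more V, then zero or more T.
SYLLABLE_BLOCK = re.compile(f"[{L}]+[{V}]+[{T}]*")


def is_standard_syllable_block(text: str) -> bool:
    """Hypothetical helper: match L+ V+ T* after canonical decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return SYLLABLE_BLOCK.fullmatch(decomposed) is not None


print(is_standard_syllable_block("\u1100\u1100\u1161"))  # G, G, A
print(is_standard_syllable_block("\u1101\u1161"))        # GG, A
print(is_standard_syllable_block("\u1161"))              # A alone
```

Under this reading, a sequence such as <G, G, A> qualifies as a single
syllable block even though no normalization form turns it into one
precomposed character.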


KIM, Kyongsok wrote:
> ... each of the following three can represent Hangul syllable GGA:
> 1) UAC01 (GGA)
> 2) U1101 (GG), U1161 (A)
> 3) U1100 (G), U1100 (G), U1161 (A)
>  - By NFC, 2) U1101 (GG), U1161 (A) will be changed to 1) UAC01 (GGA);
>  - However, by NFC, 3) U1100 (G), U1100 (G), U1161 (A) will be changed to
> U1100 (G), UAC00 (GA), which is "different" from 1) UAC01.

This is indeed the correct analysis. I find it very unfortunate
that U1101 (GG) does not have a *canonical* decomposition mapping
to <U1100 (G), U1100 (G)> (etc. for all the other multi-letter
Hangul Jamos). The Hangul script does NOT have a primitive Jamo
GG. The Hangul GG is, by design, composed of two G Jamos, just
like Latin GG is composed of two G letters.
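
The NFC behaviour described above can be checked with a short Python
sketch using the standard unicodedata module. The helper names are
made up for this example. One caveat: in the Unicode data the
precomposed syllable GGA sits at U+AE4C (U+AC01 is the syllable GAG),
so the sketch uses U+AE4C for case 1 and also derives it with the
arithmetic Hangul composition formula rather than assuming it.

```python
import unicodedata

# Constants from the Unicode Hangul syllable composition algorithm.
S_BASE, L_BASE, V_BASE = 0xAC00, 0x1100, 0x1161
V_COUNT, T_COUNT = 21, 28


def compose_lv(lead: str, vowel: str) -> str:
    """Hypothetical helper: compose one L and one V arithmetically."""
    l_index = ord(lead) - L_BASE
    v_index = ord(vowel) - V_BASE
    return chr(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT)


def nfc_hex(text: str) -> list[str]:
    """Hypothetical helper: NFC-normalize and list the code points."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFC", text)]


precomposed = compose_lv("\u1101", "\u1161")  # case 1: GGA
double_jamo = "\u1101\u1161"                  # case 2: GG, A
two_single = "\u1100\u1100\u1161"             # case 3: G, G, A

print(nfc_hex(precomposed))  # ['U+AE4C']
print(nfc_hex(double_jamo))  # ['U+AE4C']
print(nfc_hex(two_single))   # ['U+1100', 'U+AC00']
```

Cases 1 and 2 converge on one code point, while case 3 normalizes to
G followed by the syllable GA, because U+1101 has no canonical
decomposition into two U+1100.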

	/kent k


