I think this conversation is muddying the waters thoroughly.<br><br>When normalization was defined, it was clear that it would not do everything that everyone could possibly have wanted. Speaking to John's "casual reader": what Kent is describing is something he has raised repeatedly before, something he would have liked normalization to do.<br>
<br>But it wasn't done, and won't be done, and has no impact on the stability or utility of normalization.<br><br>Mark<br><br><div class="gmail_quote">On Wed, Feb 20, 2008 at 3:59 PM, Kent Karlsson &lt;<a href="mailto:kent.karlsson14@comhem.se">kent.karlsson14@comhem.se</a>&gt; wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div class="Ih2E3d">John C Klensin wrote:<br>
> I hope we all understand the subtleties of this thread generally<br>
> and your specific comments above in particular. However, to a<br>
> casual reader, it sounds very much like "Hangul and the<br>
> surrounding operations are still unstable (in the normal, not<br>
> necessarily Unicode, sense of that term) and that four full<br>
> versions of Unicode after the drastic changes to Hangul<br>
> handling, there still isn't a definition of the processes of<br>
> normalization and comparison other than by a set of ad-hoc<br>
> tables which are not quite complete".<br>
><br>
> Presumably, that isn't what was intended, but...<br>
<br>
</div>Just to clarify:<br>
<br>
The Hangul script has 17 consonant letters and 11 vowel letters,<br>
plus a small number of variants added later that have since gone<br>
out of use. Apart from the short-lived extra variants (and the<br>
merge of ieung and yesieung), this has been stable for over 550<br>
years, since 1446 (though the spelling of Korean in Hangul has<br>
not been stable, nor has the ordering of the letters, but those<br>
are different matters). The very deliberate design of the<br>
script is very elegant.<br>
<br>
One also needs a way of determining syllable boundaries<br>
(doubly encoding the consonants, as in Unicode now, is fine).<br>
The Jamo fillers have a function too (for partial syllables).<br>
<br>
The rest of the encoded Hangul characters are unnecessary for<br>
representing any text in the Hangul script (modern or historic),<br>
aside from halfwidth (which is really just a display style).<br>
The HANGUL SYLLABLEs have canonical decompositions, so they<br>
are not so bad.<br>
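<br>
A minimal sketch of what a canonical decomposition means for a HANGUL SYLLABLE,<br>
assuming Python's standard unicodedata module. The sample syllable U+D55C is an<br>
arbitrary choice for illustration. NFD splits it into conjoining Jamo and NFC<br>
puts it back together, so the two forms are canonically equivalent:<br>

```python
import unicodedata

# U+D55C HANGUL SYLLABLE HAN, chosen only as an illustration.
syllable = "\uD55C"

# NFD applies the canonical decomposition: one precomposed syllable
# becomes a leading consonant, a vowel, and a trailing consonant.
decomposed = unicodedata.normalize("NFD", syllable)
names = [unicodedata.name(ch) for ch in decomposed]

# NFC recomposes the Jamo sequence into the original syllable.
recomposed = unicodedata.normalize("NFC", decomposed)

for ch, name in zip(decomposed, names):
    print(f"U+{ord(ch):04X} {name}")
print(recomposed == syllable)
```

The trailing consonant here is U+11AB, a separate code point from the leading<br>
form of the same letter (U+1102), which is the double encoding of consonants<br>
mentioned above.<br>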
<br>
No other script has gotten lots of codes for multi-letter<br>
combinations. "gg" is not eligible for encoding as a single<br>
character, nor is "ou" or "sk", etc. But Hangul has been<br>
given hundreds of extra codes for letter combinations. Most<br>
of these letter combinations only occur in historic texts.<br>
Unfortunately canonical decompositions for the Hangul letter<br>
combinations are missing and cannot now be added. This is<br>
very far from elegant...<br>
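<br>
A small sketch of the consequence, again assuming Python's unicodedata module<br>
and taking U+1101 HANGUL CHOSEONG SSANGKIYEOK as an assumed example of the kind<br>
of letter combination meant here. The helper canonically_equivalent is a<br>
hypothetical name introduced for this illustration. Because the combination has<br>
no canonical decomposition, normalization does not treat it as equal to the<br>
sequence of its two component letters:<br>

```python
import unicodedata

ssang = "\u1101"       # HANGUL CHOSEONG SSANGKIYEOK (assumed example)
pair = "\u1100\u1100"  # two HANGUL CHOSEONG KIYEOK in sequence


def canonically_equivalent(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after NFD normalization."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# An empty string means the character has no decomposition mapping.
print(repr(unicodedata.decomposition(ssang)))

# The single code point and the two-letter sequence stay distinct.
print(canonically_equivalent(ssang, pair))
```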
<br>
<br>
/kent k<br>
<div><div></div><div class="Wj3C7c"><br>
</div></div></blockquote></div><br><br clear="all"><br>-- <br>Mark