Normalization of Hangul

Wed Feb 20 15:36:19 CET 2008

--On Wednesday, 20 February, 2008 11:39 +0100 Kent Karlsson
<kent.karlsson14 at comhem.se> wrote:

> Martin Duerst wrote:
>> Section 16 of TR 15 clearly says that this is sample code, not
>> part of the spec, although some of the later wording isn't
>> always clear about this.
> 
> That may be because some aspects of the composition were
> not clear from the specification in TUS but instead had to
> be specified in UAX 15...
> 
> This has been (somewhat) "clarified" in draft Unicode 5.1.
> 
> 
> As a side issue, one should also note that even though, for
> example, U+1101 (HANGUL CHOSEONG SSANGKIYEOK) is logically
> *exactly* the same as <U+1100, U+1100> (<HANGUL CHOSEONG
> KIYEOK, HANGUL CHOSEONG KIYEOK>), and there are many more
> equivalences like it, none of this is recorded in any way by
> the Unicode data. That many implementations may choke on the
> latter form does not make those representations different
> when seen from the point of view of the design of Hangul (and
> Hangul was carefully designed...).
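
For readers who want to see the point about U+1101 concretely, here
is a minimal sketch using Python's standard unicodedata module. It
assumes the interpreter's bundled Unicode data behaves as described
in the quoted text; the helper names (equivalent, codepoints) are
made up for illustration and are not part of any specification.

```python
import unicodedata

SSANGKIYEOK = "\u1101"          # HANGUL CHOSEONG SSANGKIYEOK
DOUBLE_KIYEOK = "\u1100\u1100"  # two HANGUL CHOSEONG KIYEOK
JUNGSEONG_A = "\u1161"          # HANGUL JUNGSEONG A


def equivalent(a: str, b: str, form: str) -> bool:
    """Hypothetical helper: compare two strings after normalizing both."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


def codepoints(s: str) -> str:
    """Hypothetical helper: render a string as U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in s)


# U+1101 carries no decomposition mapping in the character data,
# so no normalization form relates it to <U+1100, U+1100>.
print(repr(unicodedata.decomposition(SSANGKIYEOK)))

for form in ("NFC", "NFD", "NFKC", "NFKD"):
    print(form, equivalent(SSANGKIYEOK, DOUBLE_KIYEOK, form))

# The difference carries over to syllable composition: only the
# single-jamo spelling composes into one precomposed syllable.
print(codepoints(unicodedata.normalize("NFC", SSANGKIYEOK + JUNGSEONG_A)))
print(codepoints(unicodedata.normalize("NFC", DOUBLE_KIYEOK + JUNGSEONG_A)))
```

Under these assumptions the two spellings never compare equal in
any of the four normalization forms, which is the gap between the
logical design of Hangul and the recorded data that the quoted
text describes.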

Kent,

I hope we all understand the subtleties of this thread generally
and your specific comments above in particular.  However, to a
casual reader, it sounds very much like "Hangul and the
surrounding operations are still unstable (in the normal, not
necessarily Unicode, sense of that term) and, four full
versions of Unicode after the drastic changes to Hangul
handling, there still isn't a definition of the processes of
normalization and comparison other than by a set of ad-hoc
tables which are not quite complete".

Presumably, that isn't what was intended, but...

     john