Normalization of Hangul

Kent Karlsson kent.karlsson14 at comhem.se
Wed Feb 20 11:16:11 CET 2008


Yangwoo Ko wrote:
> As described in Section 3.12 of the Unicode Standard, Hangul syllable
> code points are obtained by indexing through a 3-dimensional table, and
> decomposition is just the reverse of that operation. I don't know how to
> describe that process by an algorithm other than the one given in
> UAX #15.

Hangul syllable canonical decompositions can be handled
**like all other canonical decompositions** by using
a table of 11172 entries that begins like this:

AC00; 1100 1161 # HANGUL SYLLABLE GA
AC01; AC00 11A8 # HANGUL SYLLABLE GAG
AC02; AC00 11A9 # HANGUL SYLLABLE GAGG
AC03; AC00 11AA # HANGUL SYLLABLE GAGS
AC04; AC00 11AB # HANGUL SYLLABLE GAN
...
And the last entry is:
D7A3; D788 11C2 # HANGUL SYLLABLE HIH
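
A minimal sketch of how such a table could be generated, assuming Python
and the constants from Section 3.12 of the Unicode Standard. The helper
names table_entry and table_line are made up for illustration, and the
comment field relies on the standard unicodedata module knowing the
Hangul syllable names:

```python
import unicodedata

# Constants from Section 3.12 of the Unicode Standard.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
N_COUNT = V_COUNT * T_COUNT          # 588 syllables per leading consonant
S_COUNT = L_COUNT * N_COUNT          # 11172 syllables in total


def table_entry(cp):
    """Return the pairwise canonical decomposition of one Hangul syllable."""
    s_index = cp - S_BASE
    t_index = s_index % T_COUNT
    if t_index == 0:
        # LV syllable: leading consonant followed by vowel.
        lead = L_BASE + s_index // N_COUNT
        vowel = V_BASE + (s_index % N_COUNT) // T_COUNT
        return (lead, vowel)
    # LVT syllable: the corresponding LV syllable followed by the trailing consonant.
    return (cp - t_index, T_BASE + t_index)


def table_line(cp):
    """Format one entry in the same layout as the table excerpt (hypothetical helper)."""
    first, second = table_entry(cp)
    return f"{cp:04X}; {first:04X} {second:04X} # {unicodedata.name(chr(cp))}"


# The whole table, keyed by code point.
TABLE = {cp: table_entry(cp) for cp in range(S_BASE, S_BASE + S_COUNT)}
```

With this, table_line(0xAC01) gives "AC01; AC00 11A8 # HANGUL SYLLABLE GAG"
and len(TABLE) is 11172.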

The arithmetic given in UAX #15 is just an optimisation of that table lookup.
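
To illustrate, here is a small sketch of the arithmetic route, again
assuming Python and the Section 3.12 constants. The function name
decompose_hangul is hypothetical. It computes the full decomposition
directly, which is what repeated lookup in the pairwise table would
yield, and the check at the end compares it against the standard
library's NFD for every syllable:

```python
import unicodedata

# Constants from Section 3.12 of the Unicode Standard.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
N_COUNT = V_COUNT * T_COUNT
S_COUNT = L_COUNT * N_COUNT


def decompose_hangul(cp):
    """Return the full canonical decomposition of cp as a list of code points.

    Code points outside the Hangul syllable block are returned unchanged.
    """
    s_index = cp - S_BASE
    if not 0 <= s_index < S_COUNT:
        return [cp]
    lead = L_BASE + s_index // N_COUNT
    vowel = V_BASE + (s_index % N_COUNT) // T_COUNT
    t_index = s_index % T_COUNT
    if t_index == 0:
        return [lead, vowel]                    # LV syllable
    return [lead, vowel, T_BASE + t_index]      # LVT syllable


def matches_nfd(cp):
    """Compare the arithmetic result with unicodedata's NFD for one code point."""
    expected = [ord(ch) for ch in unicodedata.normalize("NFD", chr(cp))]
    return decompose_hangul(cp) == expected


all_match = all(matches_nfd(cp) for cp in range(S_BASE, S_BASE + S_COUNT))
```

No 11172-entry table is stored; the three divisions and remainders
stand in for it.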

	/kent k


