Normalization of Hangul
Kent Karlsson
kent.karlsson14 at comhem.se
Wed Feb 20 11:16:11 CET 2008
Yangwoo Ko wrote:
> As described in Section 3.12 of the Unicode Standard, Hangul syllable
> code points are obtained by indexing through a 3-dimensional table, and
> decomposition is just the reverse of that operation. I don't know how to
> describe that process by an algorithm other than the one given in
> UAX #15.
Hangul syllable canonical decompositions can be handled
**like all other canonical decompositions** by using
a table of 11172 entries that begins like this:
AC00; 1100 1161 # HANGUL SYLLABLE GA
AC01; AC00 11A8 # HANGUL SYLLABLE GAG
AC02; AC00 11A9 # HANGUL SYLLABLE GAGG
AC03; AC00 11AA # HANGUL SYLLABLE GAGS
AC04; AC00 11AB # HANGUL SYLLABLE GAN
...
And the last entry is:
D7A3; D788 11C2 # HANGUL SYLLABLE HIH
Using arithmetic is just an optimisation of that table.
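
To make that concrete, here is a minimal Python sketch that regenerates the
table entries above by arithmetic. The constants are the ones given in
Section 3.12 of the Unicode Standard; the function names `decompose_pair`
and `table_lines` are made up for this illustration and are not taken from
any specification or library.

```python
import unicodedata

# Constants from Section 3.12 of the Unicode Standard.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
N_COUNT = V_COUNT * T_COUNT   # 588 syllables per leading consonant
S_COUNT = L_COUNT * N_COUNT   # 11172 precomposed syllables in total


def decompose_pair(cp):
    """Return the one-step canonical decomposition of a Hangul syllable.

    Hypothetical helper: yields the same pair a table entry would hold,
    i.e. (L, V) for an LV syllable and (LV, T) for an LVT syllable.
    """
    s_index = cp - S_BASE
    if not 0 <= s_index < S_COUNT:
        raise ValueError(f"U+{cp:04X} is not a precomposed Hangul syllable")
    t_index = s_index % T_COUNT
    if t_index == 0:
        # LV syllable: decomposes to leading consonant + vowel.
        l_part = L_BASE + s_index // N_COUNT
        v_part = V_BASE + (s_index % N_COUNT) // T_COUNT
        return (l_part, v_part)
    # LVT syllable: decomposes to the LV syllable + trailing consonant.
    return (cp - t_index, T_BASE + t_index)


def table_lines():
    """Yield all 11172 entries in the same layout as the table above."""
    for cp in range(S_BASE, S_BASE + S_COUNT):
        first, second = decompose_pair(cp)
        name = unicodedata.name(chr(cp))
        yield f"{cp:04X}; {first:04X} {second:04X} # {name}"
```

Applying `decompose_pair` a second time to the first element of an (LV, T)
pair gives the full L V T sequence, just as repeated table lookup would.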
/kent k