KATS (Korean Agency for Technology and Standards)'s Comments on the Unicode Codepoints and IDNA Internet-Draft

Kenneth Whistler kenw at sybase.com
Sat Nov 1 00:23:16 CET 2008


O.k., since this is being rapidly pushed for a consensus
decision, I will weigh in as well.

I think this entire discussion has been pushed forward with
a large number of misconceptions.

I contend that this is *not* a normalization issue at all.

I contend that this *is* a phishing/spoofing issue, as is
explicitly explained in the KATS input document that
started this thread. The examples in that document are
all phishing issues that result from the fact that
Old Hangul jamo characters contain a number of explicit
variant letters of some historic importance for Old Hangul,
but which are confusable with the standard shapes of
the modern jamos.

As a phishing issue, I also strongly agree with Andrew
Sullivan's original contention. This, by rights, should be
a policy issue, and not a protocol issue -- and effectively
what the group is moving towards here is a character-by-character
position on ruling out spoofability -- a position that it
categorically rejected some time ago when dealing with
similar issues for spoofability within and across other
scripts in the standard.

Also, I strongly disagree with John Klensin's summary
about "letter" status for Hangul and his attempt to
use that as a basis to make a principled distinction
in this case for jamos. Kent Karlsson's analysis was
correct.

All that said, I do not think this is worth breaking
consensus in the group about completion of the
protocol documents, because there is no
burning need to include Old Hangul in IDNs anyway.
(But then there was no burning need to include
any of the other historic scripts, either, nor many
hundreds of confusable historic letters for Latin,
for that matter.)

So personally, I would not oppose making an exception
for Jamos, as long as everybody is clear that it
is nothing more than that: a special case exception
to the other principles of the table derivation.
Just don't bother trying to patch this up as some
general principle: it is what it is -- an exception
for Jamos for Korean.

So just add the range U+1100..U+11FF to the tables
document as DISALLOWED and be done with it.

Oh, lest I forget... you won't *quite* be done with
it. Assuming that the tables document and the rest of
the IDNA protocol documents are finally completed this
year or early next, be prepared to start the revision
of the tables document next year, because there are
two more blocks of Old Hangul jamos coming in
Unicode 5.2: U+A960..U+A97F and U+D7B0..U+D7FF,
courtesy of a character encoding request from KATS.
One of the reasons why I wanted to treat the exclusion
of jamos as a policy issue, rather than something
baked into the tables document for the protocol, was
precisely because we haven't heard the last of jamos.
Those two blocks will be arriving in Unicode 5.2, and
there is always the possibility that some historian
of Old Hangul may yet find a few more that need to
be added in the indefinite future to the standard.
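
For concreteness, here is a minimal sketch in Python of what the
exception and its later revision would amount to. The three ranges
are the ones named above; the function and variable names are
hypothetical and are not taken from the tables document, which
defines its own derivation rules.

```python
# Illustrative only: names here are assumptions, not from the tables document.

# The range proposed above for DISALLOWED (Hangul Jamo block).
JAMO_RANGES_NOW = [(0x1100, 0x11FF)]

# The two Old Hangul jamo blocks expected in Unicode 5.2.
JAMO_RANGES_5_2 = [(0xA960, 0xA97F), (0xD7B0, 0xD7FF)]


def is_excluded_jamo(ch, ranges):
    """Return True if the single character ch falls in any listed range."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ranges)


def label_has_excluded_jamo(label, ranges):
    """Return True if any character of label falls in any listed range."""
    return any(is_excluded_jamo(ch, ranges) for ch in label)
```

With only the first range in the table, a label containing U+A960
would pass this check; it is caught only once the table is revised
to include the two new blocks, which is the maintenance cost
described above. Precomposed modern syllables such as U+AC00 are
outside all three ranges and are unaffected.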

--Ken



> On 31 Oct 2008, at 18.32, Dae Hyuk Ahn wrote:
> 
> > I have exactly the same opinion as John and Michael.
> >
> > Thanks,
> > Dae Hyuk Ahn. Ph.D.
> >
> >
> > On 08. 10. 31 11:26 PM, "Michael Everson" <everson at evertype.com> wrote:
> >
> > On 31 Oct 2008, at 14:06, John C Klensin wrote:
> >
> >> Finally, while we associated terms like "phishing" with our "leave
> >> it to the registries" principle, I believe that the Korean case for
> >> exclusion of Jamo has been clearly made and that we should not
> >> attempt to fault (or punish) the rather considerable effort that has
> >> been made here because they didn't understand the idiosyncrasies of
> >> the vocabulary used in the WG.
> >
> > I also believe that a case for exclusion of Jamo has been clearly  
> > made.
> >
> > Michael Everson * http://www.evertype.com


