Consensus Call Tranche 8 (Character Adjustments)

John C Klensin klensin at jck.com
Thu Oct 16 17:52:49 CEST 2008


Ken,

With apologies for kicking a dead horse, but in the hope that
clarifying my concern might help with considerations in Korea...

--On Wednesday, 15 October, 2008 14:18 -0700 Kenneth Whistler
<kenw at sybase.com> wrote:

> John Klensin wrote:
>...
 
>> I am a bit concerned about the hypothetical case that Martin
>> raised and my reaction, at least if I correctly understand
>> Unicode's stability rules. If a few syllables that are now
>> considered archaic (or, if such cases exist, ones that have
>> never been used) abruptly become, to use Martin's term, of
>> crucial importance, would the syllable forms be allocated
>> code points?
> 
> No.

Martin's hypothetical case focused on a future requirement from
the North about syllables that become critical.   I can't speak
to the specific experience of JTC1/SC2 in holding the line, but
long experience with ISO and ISO/IEC JTC1 procedures and
behavior has persuaded me that one should not lightly discount
the ability of a national member body to eventually get its way
on a subject it considers important to national interests.

>> If so, am I correct in assuming that stability rules
>> would require that NFC would actually decompose the
>> newly-added syllables (presumably composing the individual
>> Jamo to the new syllables would result in an incompatible
>> change to normalization)?
> 
> Counterfactual. But yes, if it *were* the case (which it
> isn't), then addition of a new precomposed Hangul syllable
> would then require addressing normalization stability. The
> exact details of how that would be done are unclear, because
> any new precomposed Hangul syllable would, by definition, be
> outside the context of the Hangul Syllable Composition and
> Hangul Syllable Decomposition algorithms (TUS 5.0, pp. 122 -
> 124) which *define* the normalization relationship between
> conjoining jamos and the 11,172 precomposed Hangul syllables.

As I understand this very clever set of algorithms, they depend
critically on the ordering and density of the Hangul syllables;
it is not clear how one would add a new one without completely
disrupting the system and requiring changes much more basic than
the normal situation for adding a few new characters.  Perhaps
that is just a different way of saying what you meant by "The
exact details...unclear".
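
To make the dependence on ordering and density concrete, here is a
minimal sketch of the arithmetic as I understand it from the Unicode
Standard. The constant and function names are my own choices for
illustration, not anything mandated by the text, and this covers only
the 11,172 precomposed syllables, not full normalization.

```python
import unicodedata

# Base code points and counts for the arithmetic Hangul mapping.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
N_COUNT = V_COUNT * T_COUNT   # 588 syllables per leading consonant
S_COUNT = L_COUNT * N_COUNT   # 11,172 syllables in one dense block


def decompose_syllable(ch):
    """Map one precomposed syllable to its conjoining jamo by index arithmetic."""
    s_index = ord(ch) - S_BASE
    if not 0 <= s_index < S_COUNT:
        return ch  # outside the dense block: the arithmetic does not apply
    lead = L_BASE + s_index // N_COUNT
    vowel = V_BASE + (s_index % N_COUNT) // T_COUNT
    trail = T_BASE + s_index % T_COUNT
    result = chr(lead) + chr(vowel)
    if trail != T_BASE:  # index 0 means "no trailing consonant"
        result += chr(trail)
    return result


def compose_syllable(lead, vowel, trail=None):
    """Inverse mapping: jamo indices back to a position in the dense block."""
    l_index = ord(lead) - L_BASE
    v_index = ord(vowel) - V_BASE
    t_index = 0 if trail is None else ord(trail) - T_BASE
    if not (0 <= l_index < L_COUNT and 0 <= v_index < V_COUNT):
        raise ValueError("not a modern leading consonant / vowel pair")
    if trail is not None and not 1 <= t_index < T_COUNT:
        raise ValueError("not a modern trailing consonant")
    return chr(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT + t_index)
```

Every position in the block is computed, not looked up, so a syllable
added anywhere else would have no index in this scheme and would need
a separate, table-driven rule.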

>> That isn't an attractive answer because it
>> makes the behavior dependent on when a particular character
>> code point is added to Unicode.
> 
> Also counterfactual, because such characters will not be added
> to the Unicode Standard. Nobody in the UTC *or* in Korea
> (South or North) is asking for them.
> 
> In fact, if you read the new Korean standard, KS X 1026-1:2007,
> "Part 1, Hangul processing guide for information interchange",
> that standard *mandates* that for Old Hangul syllable blocks
> a sequence of three Jamos be used:
> 
> "5.2 A representation format of Modern Hangul syllable blocks
> 
> "For representing Modern Hangul syllable blocks, we must use
> code positions of 11,172 Hangul syllables U+AC00 ~ U+D7A3. ...
> 
> "5.3 A representation format of Old Hangul syllable blocks
> 
> "For representing Old Hangul syllable blocks, we must use
> code positions of Johab Hangul letters in Hangul Jamo U+1100 ~ 
> U+11FF, Hangul Jamo Extended-A U+A960 ~ U+A97F, and Hangul Jamo
> Extended-B U+D7B0 ~ U+D7FF, ..."
> 
> That isn't something that the UTC wrote in the Unicode
> Standard -- it is what the Korean Agency for Technology &
> Standards wrote in a *Korean* standard.
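
For reference, the ranges quoted above can be sketched as a simple
classifier. This is only an illustration under my own naming
(`classify` and `nfc_keeps_jamo_sequence` are hypothetical helpers);
it uses just the code point ranges from the quoted text plus Python's
standard `unicodedata` module, and it is not a KS X 1026-1 conformance
check.

```python
import unicodedata

# Ranges as quoted from KS X 1026-1:2007, sections 5.2 and 5.3.
MODERN_SYLLABLES = (0xAC00, 0xD7A3)
JAMO_RANGES = ((0x1100, 0x11FF), (0xA960, 0xA97F), (0xD7B0, 0xD7FF))


def classify(ch):
    """Label a character by which quoted range, if any, contains it."""
    cp = ord(ch)
    if MODERN_SYLLABLES[0] <= cp <= MODERN_SYLLABLES[1]:
        return "modern-syllable"
    if any(lo <= cp <= hi for lo, hi in JAMO_RANGES):
        return "jamo"
    return "other"


def nfc_keeps_jamo_sequence(text):
    """True if NFC leaves the text unchanged (no jamo were composed)."""
    return unicodedata.normalize("NFC", text) == text
```

A modern sequence such as U+1112 U+1161 U+11AB composes under NFC to
U+D55C, while a sequence starting with an Old Hangul letter such as
U+1140 stays as conjoining jamo.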

Understood.  But Martin's example was talking about a future
situation, a future in which it seems unwise to assume that,
absent complete unification, a government and national member
body in Pyongyang would automatically accept and recognize a
standard developed by the government and national member body in
Seoul.

>> However, I note that
>> prohibiting the Jamo in IDNA would prevent the problem, at the
>> cost of requiring anyone who wants to use a syllable that is
>> not now assigned a code point in a domain name to persuade
>> UTC and SC2 to add that code point.
> 
> Which will never happen.

See above about "never" where the combination of a national
member body and JTC1 is concerned.

> See above. Prohibiting the Jamos in IDNA would prevent the
> usage of Old Hangul syllable blocks in domain names, period.
> And frankly, I consider that well within the purview of
> registry policies in Korea, if that is what they want to do.

>...
>> However, I don't think the answer is
>> quite as obvious and one-sided as your note seems to imply.
> 
> Noted. But I disagree and consider this one obvious. I will
> return to Andrew Sullivan's point. If you think it is within
> the competence and purview of this particular working group
> to decide that the *protocol* should prohibit a certain
> subset of historical Old Hangul syllables from representation
> in domain names, then we may as well reopen the discussion
> about the appropriateness of the protocol letting
> Sumero-Akkadian cuneiform, Linear B syllables, or other
> historic scripts in domain names.

I am not arguing for the competence and purview of the WG to
start making up selective prohibitions.  I am suggesting that,
when there are cognizant registries (especially at the top
level) and governments for which a living language is primary,
we should be very careful about ignoring their advice (although
we should also be sure we and they understand the advice and its
implications in the same way).    As I believe you pointed out
in a prior discussion, we are unlikely to encounter such
registries or governments where Sumero-Akkadian cuneiform,
Linear B syllables, etc., are concerned, so the situations are
somewhat different.

      john



