Consensus Call Tranche 8 (Character Adjustments)

Wed Oct 15 13:56:51 CEST 2008

--On Tuesday, 14 October, 2008 20:04 +0900 Martin Duerst
<duerst at it.aoyama.ac.jp> wrote:

> At 18:25 08/10/12, Vint Cerf wrote:
>> Consensus Call Tranche 8 (character adjustments)
>> 
>> Place your reply here: [YES or NO]
> 
> [Vint, if you need to count this only one way,
> just count it as a NO to be on the safe side.]
> 
> 
> YES for 8.a and 8.b. Despite the transition issues
> mentioned by Mark, the long discussion on this list has
> shown that these are the right things to do in the long term.
> While I'm not aware of any concrete examples of similar
> cases, I think it would be worthwhile to check with other
> potentially affected script/language communities.
> What, for example, about the few final letters in Hebrew?

Or the many initial and final letters in Arabic?  The answer in
both cases is that these are individual characters and are
PROTOCOL-VALID.  What I believe got us into difficulty with
Eszett and Final Sigma wasn't the positioning issue or an
alternate shaping one but the intersection between them and the
case-folding rules.  Since, at least as of Unicode 3.2, neither
of them had upper-case forms and IDNA2003 violated the Unicode
Standard's advice against using case-folding to actually map
characters (rather than using it only in comparison but
retaining the original forms), the only result consistent with
the general IDNA2003 model was Eszett -> "ss" and Final Sigma ->
Medial Lower Case Sigma.

Since neither Hebrew nor Arabic (nor any of the other scripts
that have position-sensitive characters) have case, they cannot
get into the same problem.
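
As a rough illustration of the distinction above, here is a small
Python sketch.  It is an approximation only: Python's
str.casefold() applies the full case folding of whatever Unicode
version the interpreter ships with, not the Unicode 3.2 tables or
the actual IDNA2003 (Nameprep) mapping tables.  The helper names
fold_label and has_case are hypothetical and exist just for this
example.

```python
def fold_label(label: str) -> str:
    """Map a label through Unicode full case folding.

    Assumption: this stands in for the IDNA2003 mapping step; it is
    not the real Nameprep table.
    """
    return label.casefold()


def has_case(ch: str) -> bool:
    """Return True if upper- or lower-casing changes the character."""
    return ch.upper() != ch or ch.lower() != ch


ESZETT = "\u00df"        # LATIN SMALL LETTER SHARP S
FINAL_SIGMA = "\u03c2"   # GREEK SMALL LETTER FINAL SIGMA
SIGMA = "\u03c3"         # GREEK SMALL LETTER SIGMA
FINAL_MEM = "\u05dd"     # HEBREW LETTER FINAL MEM
ARABIC_BEH = "\u0628"    # ARABIC LETTER BEH

# Folding used as a mapping loses the original characters.
print(fold_label("stra" + ESZETT + "e"))      # strasse
print(fold_label(FINAL_SIGMA) == SIGMA)       # True

# Caseless scripts pass through folding unchanged.
print(fold_label(FINAL_MEM) == FINAL_MEM)     # True
print(has_case(FINAL_MEM), has_case(ARABIC_BEH))  # False False
```

The Hebrew and Arabic characters survive the operation untouched,
while Eszett and Final Sigma do not, which is the intersection with
case folding described above.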

Since we don't do case mapping in IDNA2008, the case folding
issue does not apply, regardless of what one thinks of that
operation and its applicability.  Without it, the only issue is
whether it is worth banning the characters to preserve part of
the IDNA2003 behavior (or making a major exception and
preserving the IDNA2003 mapping behavior) for the long term even
though it is clear that, were the decision being made for the
first time with the IDNA2008 rules, we would not even be asking
the question.

> NO for 8.c, for the reasons explained by Mark.
> KRNIC is free (or better, strongly recommended) to exclude
> conjoining Hangul from what they allow to register,
> but that should not influence our discussion too much.
> 
> [Just as a hopefully far-fetched example, assume that
> one day in North Korea, a few Hangul syllables containing some
> historic Jamos gain crucial importance.]

I'll have more to say about this in another note, but I would
assume that, were such a situation to arise, North Korea would
make an appearance in JTC1/SC2 and insist, in the most vigorous
terms, that code points be allocated to those crucial syllables.

  john