Tables and contextual rule for Katakana middle dot
John C Klensin
klensin at jck.com
Wed Apr 8 21:58:04 CEST 2009
--On Tuesday, April 07, 2009 15:41 -0700 Paul Hoffman
<phoffman at imc.org> wrote:
> So we are now back to choosing characters based on visual
> confusion? How the heck did we get here?
Not from me, at least not intentionally. However, an extension of
the comments in that note about accepting the general rules in
Tables and making exceptions when needed is that, if someone
comes in and says "please make an exception and allow this",
then I think a case that there might be visual confusion is fair
as an argument for saying "no". Or, to put it differently, the
burden is on those who propose an inclusion exception to
overcome any reasonable objections to doing so -- not
necessarily by proving those objections wrong, but by showing
that, on balance, the tradeoffs favor inclusion.
I believe that whatever systems we adopt should be biased
against exceptions, if only because any other path leads to both
madness and a higher likelihood of having to do serious work
with new versions of Unicode. YMMV.
>> while noting that there are lots of punctuation
>> characters in lots of scripts and contexts that people would
>> like to use as word separators, including, curiously, U+002E.
> Please be clear: are you proposing that we take out the
> punctuation (Po) characters from PVALID and CONTEXTO? Or are
> you just "noting"?
As I just stated in another note, I believe that we should not
revisit _any_ IDNA2008 decision that is reflected in Tables or
Protocol unless either (i) new considerations or arguments are
introduced or (ii) Vint believes that it was never settled
and/or that it has been continually controversial. At some
point, the appearance of some provision in multiple versions of
a document with no comments on the list has to be interpreted as
rough consensus. And, if every new suggestion about anything,
including the introduction of the "mapping" discussion, opens up
the door to our reviewing and re-discussing every decision we
seem to have made in the past, we will never converge.
So, to address your specific question above:
(i) Exclusive of exceptions, all Po characters are
DISALLOWED. I see no reason to revisit that decision.
(ii) A few Po characters have been classified as either
PVALID or CONTEXTO by exception. I see no reason to
revisit that decision for any character but KATAKANA
MIDDLE DOT. The reason for revisiting that character is
that we have new information and that new information
suggests that the character is needed in a much broader
range of contexts than the rule in Tables-05 permits.
Is that adequately clear?
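
For concreteness, the two-part answer above (Po characters are DISALLOWED unless listed as an exception) can be sketched roughly as follows. This is only an illustration, not the Tables algorithm: the function name classify_po and the PO_EXCEPTIONS table are hypothetical, the three entries are a sample chosen for illustration, and the authoritative exception list is whatever the Tables document says in a given draft.

```python
import unicodedata

# Hypothetical, illustrative exception table. The authoritative list is the
# one in the Tables document and may differ between drafts.
PO_EXCEPTIONS = {
    "\u00B7": "CONTEXTO",  # MIDDLE DOT
    "\u0F0B": "PVALID",    # TIBETAN MARK INTERSYLLABIC TSHEG
    "\u30FB": "CONTEXTO",  # KATAKANA MIDDLE DOT
}


def classify_po(ch):
    """Return the status of a Po character, or None if ch is not Po.

    (i)  Exclusive of exceptions, every Po character is DISALLOWED.
    (ii) A few Po characters are PVALID or CONTEXTO by exception.
    """
    if unicodedata.category(ch) != "Po":
        return None  # outside the scope of this sketch
    return PO_EXCEPTIONS.get(ch, "DISALLOWED")


if __name__ == "__main__":
    for ch in (".", "\u30FB", "\u0F0B", "a"):
        print("U+%04X" % ord(ch), classify_po(ch))
```

Note that U+002E falls out as DISALLOWED here simply because it is Po and not in the exception table, which is the default-deny bias argued for earlier in this note.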