Mapping and Variants
John C Klensin
klensin at jck.com
Tue Mar 10 20:36:44 CET 2009
--On Tuesday, March 10, 2009 11:55 -0700 Mark Davis
<mark at macchiato.com> wrote:
> It is not banned, and has not been banned (I think) in any of
> the many drafts (John can say precisely).
It has never been banned. Much of the reason was the cases you
identify, including the issues about how to handle Common, etc.
There are also some other cases that may be less persuasive but
that some people nonetheless insist they "want".
There is another reason as well. Mixed-script labels seem to be
popular among precisely those Bad Guys whom one would like to
prevent from using them, so no-mixing conditions as part of IDNA
would have to be checked in the lookup protocols to be effective.
But doing it there would require dealing with enough edge cases,
exceptions, and disputes about what constitutes mixing and what
does not (part of Vint's concern about IPA -- a variation on a
concern that was expressed on the list over a year ago -- is
relevant here) to make a lookup-time check completely
infeasible... to say nothing of how much we could bog ourselves
down debating more general questions about lookup-time bans on
confusable characters.
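To make the edge-case problem concrete, here is a rough Python sketch of a naive "no mixing" check. It is an illustration only, not anything taken from the IDNA drafts. The helper names (rough_script, scripts_in_label, is_mixed_script) are hypothetical, and the script of a character is approximated by the first word of its Unicode character name, because the Python standard library does not expose the Unicode Script property. ASCII digits and the hyphen stand in for Common here.

```python
import unicodedata

# Assumption: digits and hyphen are treated as script-neutral ("Common").
# A real check would take Common/Inherited from the Unicode Script property.
COMMON_ASCII = set("0123456789-")


def rough_script(ch):
    """Approximate a character's script by the first word of its Unicode name.

    This is a heuristic for illustration only, not the Script property.
    """
    if ch in COMMON_ASCII:
        return "COMMON"
    return unicodedata.name(ch, "UNKNOWN").split()[0]


def scripts_in_label(label):
    """Return the set of approximate scripts in a label, ignoring Common."""
    return {rough_script(ch) for ch in label} - {"COMMON"}


def is_mixed_script(label):
    """Naive rule: a label is 'mixed' if it uses more than one script."""
    return len(scripts_in_label(label)) > 1


# Latin label with one Cyrillic letter (U+0430) substituted: flagged as mixed.
spoof = "p\u0430ypal"
# Ordinary Japanese text combining kanji and hiragana: also flagged as mixed.
japanese = "\u65e5\u672c\u306e"

print(is_mixed_script("paypal"), is_mixed_script(spoof), is_mixed_script(japanese))
```

The naive rule catches the Latin/Cyrillic substitution, but it also rejects a perfectly ordinary Japanese label, so any real rule needs exceptions, and that is where the disputes about what constitutes mixing begin.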
As you said...
> I firmly agree with you as to the need for examples, but those
> have been supplied before, many many times, just not in that
> The volume does make it difficult to follow - I know I've
> spent vastly more time on this than anticipated: just reading
> each of the new specs carefully takes a lot of time - and the
> main authors and chair are clearly overloaded.
Obviously, two of the things that could be done that would help
hugely (and almost all of us are guilty -- I'm not criticizing
anyone in particular here) would be to avoid revisiting the same
issues over and over again and at great length and to avoid
spending huge amounts of time on topics that are either clearly
out of scope or whose scope-overlap is very restricted.
To take one painful example, I don't believe that I've learned
anything of significance to the WG from the recent revisiting of
Eszett and Final Sigma. I've gotten a much better
understanding of the details of the perspectives of some of the
registries involved, which I appreciate, but the bottom line
appears to me to be the same as it was over a year ago: the
interactions between desires for proper orthography,
preservation of information about characters that is sometimes
important, compatibility with IDNA2003 (whether one believes the
decisions made there were correct or not), etc., add up to a
very complex set of tradeoffs and to a situation in which
nothing that we could possibly do will make everyone happy all
of the time. In those situations, we just have to figure out
how to decide and move on (as I assume Unicode did, mostly in a
matching rather than mapping context, with CaseFold itself).
Instead, we keep looping, proving the wisdom of the original
IDNA assumption that the WG would avoid dealing with individual
code points... only, here, I just don't see how to avoid it
given the observation that eliminating mapping forces
transitions either way.
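As a small illustration of why these two characters are awkward, the Python sketch below shows what full case folding and an IDNA2003-style mapping do to them. It assumes Python's str.casefold() as a stand-in for Unicode CaseFold and uses Python's built-in "idna" codec, which is documented as implementing RFC 3490 (IDNA2003). The variable names are just for illustration.

```python
# Eszett and Final Sigma under case folding and IDNA2003-style mapping.
eszett_label = "stra\u00dfe"   # "strasse" spelled with U+00DF (Eszett)
final_sigma = "\u03c2"         # GREEK SMALL LETTER FINAL SIGMA

# Full Unicode case folding: Eszett becomes "ss", Final Sigma becomes
# the ordinary small sigma (U+03C3).
folded_label = eszett_label.casefold()
folded_sigma = final_sigma.casefold()

# Python's built-in "idna" codec applies the IDNA2003 (nameprep) mapping,
# so the Eszett is mapped away before the label is encoded.
idna2003_label = eszett_label.encode("idna")

print(folded_label, hex(ord(folded_sigma)), idna2003_label)
```

In both cases the original character cannot be recovered from the result, which is the information-preservation side of the tradeoff. Without any mapping, the two spellings would simply be different labels.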