Mapping and Variants

John C Klensin klensin at jck.com
Mon Mar 9 03:29:34 CET 2009



--On Monday, March 09, 2009 10:27 +0900 Martin Duerst
<duerst at it.aoyama.ac.jp> wrote:

> Sorry, I didn't mean 'prohibit' as legislate, but as "the
> registries should do it themselves in everybody's best
> interest". Also, I didn't mean that e.g. having Cyrillic
> labels and Arabic labels in the same zone would be bad, just
> that having different scripts in the same label (in particular
> for Latin, Cyrillic, and Greek) would be bad. Therefore, I'm
> still waiting for John to confirm that his example has been
> just theoretical, or to explain why not.

The example, if I'm correctly identifying the one you are
referring to, was not theoretical, but the main issue is that it
is very hard (or meaningless) for the WG to try to tell zones
what to do.  Remember that there are tens of millions of zones
out there.  ICANN can control about a dozen in theory and about
250 more by some stretch of optimistic imagination.   Beyond
that, decisions are going to get made by local zone
administrators based on what they think serves the needs of the
registrants, users, or enterprise-owners of their zones.

If we think something is important enough that it should be a
DNS-wide rule, we need to incorporate it into the registration
protocol.  If we think it is important enough that it should be
enforced or followed even if registries have inclinations to the
contrary, then it needs to be a protocol rule that is checked at
lookup.  For anything else, we run into the fact that people
have asserted that Latin and Cyrillic characters are used
together often enough in Russia that it is necessary to permit
them together in labels.  I would advise against that, you would
advise against that, probably even the current RU TLD registry
would advise against that, but it doesn't prevent the claim from
being made and probably won't prevent some third- or
fourth-level registry from doing it if they think it is
interesting.  Conversely, as a purely hypothetical example (see
below) I would strongly advise against permitting registration
of toys-я-us (the middle character is SMALL LETTER YA, U+044F),
especially in a broad-scope-registration gTLD, not just because
of the script mixing but because it would require a debate about
whether it ought to be bundled with toys-r-us.  
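
As a rough illustration of what a registry-side script-mixing check
could look like, here is a minimal sketch.  It is an assumption-laden
approximation, not anything the protocol specifies: Python's standard
library does not expose the Unicode Script property, so the sketch
guesses the script from the first word of each letter's character
name, and the function names are hypothetical.

```python
import unicodedata

def scripts_in_label(label):
    """Return a rough set of scripts used by the letters in a label.

    Assumption: the first word of the Unicode character name
    (e.g. "LATIN", "CYRILLIC", "GREEK") stands in for the script.
    Digits and hyphens are ignored because they are not letters.
    """
    scripts = set()
    for ch in label:
        if unicodedata.category(ch).startswith("L"):
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def is_mixed_script(label):
    """True if the letters of the label come from more than one script."""
    return len(scripts_in_label(label)) > 1

# U+044F is CYRILLIC SMALL LETTER YA; U+03BC is GREEK SMALL LETTER MU.
print(scripts_in_label("toys-\u044f-us"))
print(is_mixed_script("toys-r-us"))
print(is_mixed_script("\u03bcvolt"))
```

Note that a heuristic like this would also flag ordinary Japanese
labels that combine kana and kanji, which is one reason a blanket
mechanical rule is hard to justify, and it would misread characters
such as MICRO SIGN (U+00B5), whose name carries no script word.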

But, if the registry permitted mixed-script registrations at
all, even for Japanese (which, as you have pointed out, is one
of the important cases and one unlikely to actually cause
confusion), then, if the relevant hypothetical company had
decided it wanted the name,  I would expect its lawyers to show
up at the doorstep of the registry pointing to its trademarks,
claiming the "right" to register that name, and probably
claiming that they had already been done irreparable harm by not
being permitted to register Toys"R"Us by some obscure technical
rule dating from long before any self-respecting lawyer had
heard of the Internet.   I don't know how that discussion would
come out (I actually have reason to believe that the real Toys"R"Us is
smarter than to drag itself into something like this, which is
why the example is hypothetical), but I think we would all be
very relieved that the WG didn't have to be part of it.

So we know some cases, like Japanese, in which mixed-script
registrations are necessary for practical use of the DNS.  We
know some other cases where it will certainly be claimed that
such registrations are necessary (and, for a hypothetical
manufacturer of electronic equipment, the Greek-Latin examples I
gave are actually not that far-fetched... at third or fourth
level registrations, even if not at second or TLD level).  And
we know that a debate with an administrator of such a
manufacturer zone about whether or not μvolt should be
permitted isn't going to get us anywhere that we want to be. I
predict we will have enough problems with the mathematical site
that wants to use "א-null" as a label (flunks at least the
IDNA2003 Bidi tests) or U+05D0 U+0030 (Aleph followed (to the
right) by digit zero, which my system won't even let me type),
and those run into Bidi rules, not just script-mixing ones.
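
To make the Bidi point concrete, here is a small sketch of the
Stringprep bidirectional requirements that IDNA2003 relies on, as I
understand them from RFC 3454 section 6: a string containing any
right-to-left (R or AL) character must not contain any L character,
and must both start and end with an R or AL character.  It uses
Python's standard stringprep module for the D.1 and D.2 tables; the
function name is hypothetical, and the sketch is simplified in that it
skips the prohibited-character tables and the rest of Nameprep.

```python
import stringprep

def passes_stringprep_bidi(label):
    """Simplified check of the RFC 3454 section 6 bidi requirements.

    in_table_d1: characters with bidi property R or AL.
    in_table_d2: characters with bidi property L.
    """
    has_rtl = any(stringprep.in_table_d1(ch) for ch in label)
    if not has_rtl:
        return True  # the requirements only bite when R/AL is present
    if any(stringprep.in_table_d2(ch) for ch in label):
        return False  # R/AL and L characters may not be combined
    # An R/AL character must be both first and last.
    return stringprep.in_table_d1(label[0]) and stringprep.in_table_d1(label[-1])

# U+05D0 is HEBREW LETTER ALEF.
print(passes_stringprep_bidi("\u05d0-null"))  # Alef mixed with Latin letters
print(passes_stringprep_bidi("\u05d00"))      # Alef followed by DIGIT ZERO
print(passes_stringprep_bidi("null"))         # no right-to-left characters
```

Under these assumptions both of the labels above are rejected: the
first because it combines R and L characters, the second because it
ends in a digit rather than an R or AL character.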

Net, I think we need to strongly recommend that registries/zones
not permit script mixing unless the particular cases involved
are heavily enough used in their cultures to be both important
and non-problematic.  Someone may need to point out to various
consumer protection and law enforcement agencies that ignoring
that recommendation can lead to bad stuff.  But I don't see any
basis for going further, nor any way to do it unless we are
going to insert our judgment between registries and the lawyers
for folks who want to represent trademarks or terms of art that
use mixed scripts and then to start making lists of permitted
cases.

    john






