[Ext] RE: emoji (was Re: I-D Action: draft-klensin-idna-rfc5891bis-00.txt)

John C Klensin klensin at jck.com
Wed Mar 22 01:48:15 CET 2017



--On Tuesday, March 21, 2017 19:09 -0400 Andrew Sullivan
<ajs at anvilwalrusden.com> wrote:

> On Tue, Mar 21, 2017 at 10:20:22PM +0000, Kim Davies wrote:
>> 
>> I thought ICANN was not permitting emoji registrations?
>> Registries are required to support IDNA 2008 if they wish to
>> implement IDN support and demonstrate compliance with it.
> 
> Only gTLDs have to support IDNA2008. ccTLDs don't _have_ to do
> anything.

And, that said, it isn't even clear whether ICANN would, in
practice, be willing and able to enforce the contracts with the
"contracted parties".   Evidence from the past would suggest
"no".

As I see it, there are two problems underlying the difference
between your point of view and Shawn's.  On the one hand, not
only does IDNA2008 prohibit registration of labels containing or
consisting of emoji, but so does IDNA2003.  However, most or all
of the browsers, with the support of WHATWG, don't do the lookup
checks that IDNA2008 requires and ICANN has even less control
over them than it does over ccTLDs.  Without the lookup checks,
if the emoji make it into the DNS, they will usually be looked
up and successfully resolved by browsers (not necessarily other
applications).
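
As a rough illustration of the kind of check being skipped: the
IDNA2008 tables (RFC 5892) are built mainly from letters, marks, and
digits, while emoji sit in general category So (Symbol, other). The
sketch below uses only Python's standard unicodedata module and
tests that one category rule. It is not the full RFC 5892 derivation
(no exceptions, contextual rules, or stability tests), and the helper
name is made up for the example.

```python
import unicodedata

# General categories that RFC 5892 groups under "LetterDigits".
LETTER_DIGIT_CATEGORIES = {"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"}

def outside_letter_digits(label):
    """Hypothetical helper: list code points whose category is not in LetterDigits."""
    return [
        "U+%04X" % ord(ch)
        for ch in label
        if unicodedata.category(ch) not in LETTER_DIGIT_CATEGORIES
    ]

print(outside_letter_digits("example"))      # nothing flagged
print(outside_letter_digits("\U0001F600"))   # the emoji is flagged
```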

In part because the DNS is fairly robust and in part because of
the way IDNA works, registration and delegation of emoji that
are pushed through the Punycode algorithm isn't going to cause
the DNS to "fall over".  The same comment could be made about
registration and delegation of @#$%^.example, ab\.\.cd.example
(i.e., "ab..cd" as a single label), or micrοsоft.com (the
circular characters there are U+03BF and U+043E respectively).
Each of them violates either IDNA2008 or some IETF, ICANN, or
other guideline about good practice, but each one can be put
into a DNS label, and, if one knows exactly what to look up, can
be looked up.   From the standpoint of a registrar who wants to
sell the things and who doesn't care much about risks and
consequences to registrants or users, that is enough -- one can
put them in and get them out and the fact that some
implementations of at least some applications won't tolerate
them is of little interest (and has been since ICANN assumed a
role in this space).  For those registrars, even the
"micrοsоft.com" example is not a problem because they've
succeeded in making trademark conflicts, even blatant ones, the
problem of the registrant and they get to sell an extra name,
perhaps to a party that cannot later be identified and held
accountable (another violation of ICANN rules as I remember
them).
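
To make the "put them in and get them out" point concrete, here is a
minimal sketch using the "punycode" codec in Python's standard
library. That codec implements only the RFC 3492 transformation and
performs no IDNA validity checks, which is exactly why an emoji goes
through without complaint. The code point is an arbitrary choice for
the example.

```python
label = "\U0001F600"  # GRINNING FACE, picked arbitrarily

# Encode to an ASCII-only form and add the ACE prefix by hand.
encoded = "xn--" + label.encode("punycode").decode("ascii")

# Strip the prefix and decode again: the original label comes back.
decoded = encoded[4:].encode("ascii").decode("punycode")

print(encoded)
print(decoded == label)
```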

Even in that case, which I find fairly disturbing, the DNS works
just fine: the user who looks up "microsoft.com" gets one set of
records, the user who looks up "micrοsоft.com" gets a
different set of records, but each one gets what they asked for
so the DNS is working fine.
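
A short sketch of why those are two different names as far as the
DNS is concerned, again using only Python's standard library. The
look-alike string is written with escapes so the substituted code
points are visible in the source.

```python
import unicodedata

genuine = "microsoft"
lookalike = "micr\u03bfs\u043eft"  # two non-ASCII letters substituted for "o"

print(genuine == lookalike)  # False: different code point sequences

# Name every non-ASCII code point in the look-alike label.
for ch in lookalike:
    if ord(ch) > 0x7F:
        print("U+%04X %s" % (ord(ch), unicodedata.name(ch)))
```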

The thing that may have prevented micrοsоft.example from being
a problem isn't that it harms the DNS operationally (although
what it does reputationally may be another matter).  It is that
no attacker has chosen to mount that particular attack.  Yet.
Part of what may have prevented them from doing it is a
combination of two observations: that "yourbank.safe.info" is
just an easier target as long as it works often enough and, if
the "example." TLD gets a reputation for allowing (or
encouraging) that sort of nonsense, the browsers may respond by
displaying the putative A-labels and the odds that a typical
user will confuse "microsoft.com" and "xn--micrsft-cpf85i.com"
are not very high.
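
For reference, the A-label quoted above can be reproduced with the
standard-library "punycode" codec in Python. This is only a sketch:
it applies the raw RFC 3492 encoding and skips every IDNA mapping
and validity step.

```python
lookalike = "micr\u03bfs\u043eft"  # U+03BF and U+043E in place of "o"

# Raw Punycode plus the ACE prefix; no IDNA checks are applied here.
a_label = "xn--" + lookalike.encode("punycode").decode("ascii")

print(a_label + ".com")  # xn--micrsft-cpf85i.com
```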

The emoji raise a different set of issues than "micrοsоft.com"
but one that is less different than it appears.  Because there
is no agreement about a standard display representation for many
code points and issues with modifiers and ZWJ stacking may
complicate things further, code sequences that are different may
end up perceived as the same and vice versa, especially if we
assume that a registrar and registry who are willing to ignore
IDNA and ICANN guidelines or requirements aren't likely to pay a
lot of attention to Unicode rules about what combinations are
allowed and which ones are not, especially when those rules are
not easily understood.
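
One hedged illustration of the "different sequences, possibly the
same appearance" case: a single FAMILY code point versus a
man/woman/girl sequence joined with ZWJ. Whether the two are drawn
alike depends entirely on the font and platform, so that part is an
assumption; what the sketch does show is that normalization does not
unify them and that they encode to different labels.

```python
import unicodedata

single = "\U0001F46A"  # FAMILY
zwj_sequence = "\U0001F468\u200D\U0001F469\u200D\U0001F467"  # MAN, ZWJ, WOMAN, ZWJ, GIRL

for s in (single, zwj_sequence):
    nfc = unicodedata.normalize("NFC", s)
    # Print code point count, whether NFC left it unchanged, and the raw Punycode.
    print(len(s), nfc == s, s.encode("punycode"))
```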

Maybe you, Shawn, are right and none of us should care.  After
all, it is what registrars and some users want to do and the DNS
won't fall over.

best,
    john


