emoji (was Re: I-D Action: draft-klensin-idna-rfc5891bis-00.txt)

John C Klensin klensin at jck.com
Sat Mar 18 17:44:11 CET 2017

--On Saturday, March 18, 2017 02:18 +0000 Shawn Steele
<Shawn.Steele at microsoft.com> wrote:

> I just got an escalated mail from a customer suggesting that
> if Windows didn't start supporting emoji IDN better, then they
> would need to move to a different platform.


I realized late yesterday (before your note arrived) that it may
be helpful to back up a little bit.  IDNs, from the very
beginning, were about allowing domain names to include labels
associated with a much broader range of languages and their
writing systems.   IDNA (both IDNA2003 and Stringprep and, to an
even greater degree, IDNA2008) was designed, reviewed, and
adopted against the background of that requirement.

Especially since you have argued that we should take more
guidance from Unicode because you believe they are the experts
(when it gets beyond coding and into classification and
applications, that becomes another issue -- see my recent
response to Asmus, especially the last few paragraphs), their
decision to classify emoji as symbols rather than letters (a
decision about which I share Asmus's belief that it was correct)
is very important.  If they were letters that, separately or in
combination, represented words in some human language, one could
argue that they belonged in IDNs and IDNA would have a
conceptual framework for dealing with them.  That doesn't say
they should be allowed there or not -- that is where the other
arguments Patrik, Andrew, I, and others have made come in
-- but we would have a vocabulary, a framework, and a way of
determining success.

But, if emoji are not letters, but symbols, then they have no
role in IDNs.  They might well be appropriate as some other type
of extended domain name labels (I won't try to argue that here),
but they aren't IDNs.  I note that the ACE model was designed to
accommodate other types of extensions (as well as providing for
what we would do if IDNA turned out to be the wrong model) so I
look forward to your I-D proposing Emoji Domain Names, or
perhaps Graphic/Pictographic Domain Names more generally,
including how to sort out matching, analogies to normalization,
and so on, and following that I-D with a WG proposal. 

Given that there are Unicode code points involved, you should
still be able to use the Punycode algorithm with a different
prefix if that meets your needs.  Or, in principle, you could
think about EDNSx, a different label type and Class, using UTF-8
directly, and changing DNS matching rules.  While the coding is
important, it is the other stuff that is hard, especially the
issues about what things should be treated as equivalent to what
other things and how.  Those issues have dominated the
discussions of mapping and "same name" relationships in the DNS
for years.  At least until emoji develop further as a language,
your best choice (certainly the only easy one and, btw, the one
effectively specified by UTR#46) might be exact,
codepoint-by-codepoint, matching with no transformations.  But
one of the things I'm sure of is that users don't expect, and
registries don't want, that one.
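
To make the two ideas in the paragraph above concrete, here is a
minimal sketch in Python.  It is an illustration under stated
assumptions, not a proposal: the prefix "xe--" is hypothetical (it is
not a registered ACE prefix), the function names are invented for this
example, and the encoding step relies on the "punycode" codec in the
Python standard library.  The last part shows why exact
codepoint-by-codepoint matching may surprise users: a heart with and
without the variation selector U+FE0F can look alike but compares as
different, and NFC normalization does not unify the two.

```python
import unicodedata

# Assumption: a made-up prefix for illustration only; not registered anywhere.
HYPOTHETICAL_PREFIX = "xe--"


def to_ace(label: str, prefix: str = HYPOTHETICAL_PREFIX) -> str:
    """Encode a Unicode label with the Punycode algorithm and a custom prefix."""
    return prefix + label.encode("punycode").decode("ascii")


def from_ace(ace: str, prefix: str = HYPOTHETICAL_PREFIX) -> str:
    """Reverse to_ace(); reject labels that do not carry the expected prefix."""
    if not ace.lower().startswith(prefix):
        raise ValueError("label does not carry the expected prefix")
    return ace[len(prefix):].encode("ascii").decode("punycode")


def exact_match(a: str, b: str) -> bool:
    """Codepoint-by-codepoint comparison with no mapping or normalization."""
    return a == b


grinning = "\U0001F600"          # GRINNING FACE
heart_plain = "\u2764"           # HEAVY BLACK HEART
heart_emoji = "\u2764\uFE0F"     # same heart followed by VARIATION SELECTOR-16

ace = to_ace(grinning)
print(ace, from_ace(ace) == grinning)

# Two strings that may render alike are distinct labels under exact matching,
# and NFC leaves the variation selector in place.
print(exact_match(heart_plain, heart_emoji))
print(unicodedata.normalize("NFC", heart_emoji) == heart_plain)
```

Any real design would still have to decide which such pairs are "the
same name", which is the hard part and which the sketch deliberately
leaves open.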

> Any more restrictive desires are going to have a difficult
> time overcoming those forces.  

Let me suggest two analogies, neither very good, that may help
in thinking about the problem with "those forces":

(i) Suppose the government of an important state of an important
country, with a population of around 72 million and, btw, an
important Microsoft Development Center, was so offended by their
perception of the way Unicode had treated the coding and related
rules about their dominant language that they offered to ban
sale of all Microsoft products in the state unless those
products were converted to support a CCS they considered more
appropriate.   How do you think the company would, or should,
respond to that sort of pressure?  Would you go back to code
pages with Unicode being only one option? (Hint: my
understanding is that the level of offense taken about Unicode
was an actual situation and the "start banning systems and
applications" plan was actively considered.)

(ii) Suppose that some major enterprise (and Microsoft customer)
decided that it needed to increase its (they say) legitimate
email marketing efforts, that the filtering arrangements in
Outlook and Exchange Server (and maybe elsewhere) were blocking
their messages from reaching users, and told you that, if
Microsoft didn't eliminate the problematic filters, they would
need to move to a different platform and try to convince other
large enterprises with an interest in email marketing to do so
too.   Would Microsoft give in?   Does the answer depend on how
large the company was?

Again, both examples are (fortunately) hypothetical and the
questions rhetorical, but I think they may be helpful in
thinking about the demand and your response.

Personally, I think you should consider that sort of request/
demand landing on your desk as an opportunity to educate both
your customer and your management as to why emoji in domain
names would be a bad idea, at least in the near term.   Using it
instead as the basis for sending what seems like a threat to
this mailing list appears less productive to me, especially
since there is no WG here that could do anything to change
things even if many of its participants wanted to.  If you think
it is appropriate, prepare the I-D and WG proposal suggested
above, but you would still need to explain to the customer why
that won't produce an immediate result.    

You might also want to encourage the appropriate people or
divisions within Microsoft to put pressure on ICANN to develop a
clear policy in this area and enforce it so that you aren't
singled out as the offenders.  I am not convinced that would be
a good idea, at least below the root, in general, but the
history of ICANN's making rules that it has no apparent will
or ability to enforce is not helping with these kinds of
situations either.


More information about the Idna-update mailing list