[Ext] RE: emoji (was Re: I-D Action: draft-klensin-idna-rfc5891bis-00.txt)
kim.davies at iana.org
Tue Mar 21 23:20:22 CET 2017
I thought ICANN was not permitting emoji registrations? Registries are required to support IDNA 2008 if they wish to implement IDN support and demonstrate compliance with it. The registries that don’t are either a few legacy situations, or are ccTLDs which ICANN has no role in setting policy for.
From the ICANN guidebook:
The label must be an A-label as defined in IDNA, converted from (and convertible to) a U-label that is consistent with the definition in IDNA, and further restricted by the following, non-exhaustive, list of limitations:
2.1.1 Must be a valid A-label according to IDNA.
2.1.2 The derived property value of all codepoints used in the U-label, as defined by IDNA, must be PVALID or CONTEXT (accompanied by unambiguous contextual rules).
2.1.3 The general category of all codepoints, as defined by IDNA, must be one of (Ll, Lo, Lm, Mn, Mc).
And in the associated footnote, “labels valid under the previous version of the protocol (IDNA2003) but not under IDNA will not meet this element of the requirements”.
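As a rough illustration of element 2.1.3 quoted above, the sketch below uses Python's standard unicodedata module to list the code points in a candidate U-label whose general category falls outside (Ll, Lo, Lm, Mn, Mc). The function name is made up for this example. It does not compute the IDNA derived property (PVALID, CONTEXT), so it covers only this one element of the requirements and is not a compliance test.

```python
import unicodedata

# General categories listed in element 2.1.3 of the quoted guidebook text.
ALLOWED_CATEGORIES = {"Ll", "Lo", "Lm", "Mn", "Mc"}


def disallowed_codepoints(label):
    """Return (code point, category) pairs that fail the 2.1.3 category check.

    Hypothetical helper for illustration only: it does not check the
    IDNA derived property values required by element 2.1.2.
    """
    failures = []
    for ch in label:
        category = unicodedata.category(ch)
        if category not in ALLOWED_CATEGORIES:
            failures.append((f"U+{ord(ch):04X}", category))
    return failures


if __name__ == "__main__":
    # Lowercase letters pass; U+2764 (a heart symbol) is category So and fails.
    print(disallowed_codepoints("b\u00fccher"))
    print(disallowed_codepoints("i\u2764pizza"))
```

Emoji are generally assigned category So (Symbol, other), which is why a check of this kind rejects them.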
"Idna-update on behalf of Shawn Steele" <idna-update-bounces at alvestrand.no on behalf of Shawn.Steele at microsoft.com> wrote:
(FYI: I'm going to be on vacation next week and hope not to reply then 😊)
The difficulty with your argument that I advocate against emoji to my company and customers is that I do not perceive emoji as a "bad idea" :) Certainly not at the magnitude you are suggesting. Therefore, it is difficult for me to argue that position.
Yes, emoji are not letters. Yes, registrars have been permitting emoji and allowing their registration. No, DNS has not fallen over because of that. And no, ICANN has not come down with a big stick or something to try to prevent this.
I have seen no additional security bulletins or alerts because of emoji in IDN. Indeed, most of the phishing/spoofing attacks continue to be plain ASCII "yourbank.safe.info" type attacks.
I totally get the need/desire for a well behaved identifier so that if I want to find https://memory.loc.gov/ammem/alhtml/almss/dep001.html then I can actually find it. Well, kinda, probably, maybe. Many websites seem to restructure their content every few years, but the idea is sane. I don't see how allowing emoji in IDN harms this goal. Sure, it probably doesn't help, but harm?
I’m a little confused with your suggestion that the industry "get with the program", so to speak. (And that I should apparently lead the effort.) DNS customers have clearly demonstrated that they are quite willing to pay money to register emoji IDNs, and the other players seem willing to enable that behavior. The industry has spoken and they have chosen to do something that does not conform to a strict reading of the RFCs. Instead, they follow compatibility guidelines of The Unicode Consortium, which I might point out is another standards organization, and enabled by ICANN.
The position of these people is not a terrible surprise; it was discussed numerous times within the working group, but their wishes did not make it into the final RFCs. It is probably not surprising that the industry has been seeking other ways to extend the IDNA behavior to get their desired functionality.
The discussion of whether or not a company needs or should cater to the whims of their customers starts to get a little off-topic (but I did bring it up)...
To your guidance to educate the customer. (I'm not saying emoji is good or bad here, but talking in general about how customer requests impact business decisions).
Yes, sometimes governments do take extremely stupid positions and force the industry to cave to those demands or else. Often the governments can be gently informed as to their error. Sometimes that doesn't take. Occasionally companies choose to abandon those markets, even extremely large ones, when the restrictions are too onerous. Sometimes the industry knocks sense into the regulators. Sometimes they fail.
And sometimes it's not a government, but rather an important customer that requests a "feature" that does not seem wise. Usually the correct response is to politely educate them and help them to find a better solution to their problem. Sometimes they're obstinate and either unwilling or unable to adopt a better behavior. Sometimes there is money involved (most of these are, of course, for-profit ventures, and even not-for-profit ones need to pay the bills).
I am hearing a pretty consistent story from the registrars, ICANN, browser developers, Unicode, etc., that emoji is interesting in IDN. The only place I'm getting a hearing otherwise is here.
Total digression: Heck, Pizza Hut messed up my order so I was inspired to do their survey and I mistyped the URL. The first link in the search results was a spoofing site pretending to have their survey. (pizzahutsurvey.surveysRus.com or something). Indeed, it was "good enough" that I was confused why the URL was different and thought maybe they'd outsourced their survey. Being one of the 0.1% of the people that actually understand the danger here, I went ahead and retyped it and landed on the real site. The real site was completely different, but the other seemed just as "professional", and could have easily phished email/phone/etc from disgruntled Pizza Hut customers. *I* almost didn't catch it (though I confess if I'd clicked a couple pages further I probably would have). No emoji were harmed in that experience.
From: John C Klensin [mailto:klensin at jck.com]
Sent: Saturday, March 18, 2017 9:44 AM
To: Shawn Steele <Shawn.Steele at microsoft.com>; Patrik Fältström <paf at frobbit.se>
Cc: idna-update at alvestrand.no; Andrew Sullivan <ajs at anvilwalrusden.com>
Subject: RE: emoji (was Re: I-D Action: draft-klensin-idna-rfc5891bis-00.txt)
--On Saturday, March 18, 2017 02:18 +0000 Shawn Steele <Shawn.Steele at microsoft.com> wrote:
> I just got an escalated mail from a customer suggesting that if
> Windows didn't start supporting emoji IDN better, then they would need
> to move to a different platform.
I realized late yesterday (before your note arrived) that it may be helpful to back up a little bit. IDNs, from the very beginning, were about allowing domain names to include labels associated with a much broader range of languages and their writing systems. IDNA (both IDNA2003 and Stringprep and, to an even greater degree, IDNA2008) was designed, reviewed, and adopted against the background of that requirement.
Especially since you have argued that we should take more guidance from Unicode because you believe they are the experts (when it gets beyond coding and into classification and applications, that becomes another issue -- see my recent response to Asmus, especially the last few paragraphs), their decision to classify emoji as symbols rather than letters (a decision about which I share Asmus's belief that it was correct) is very important. If they were letters that, separately or in combination, represented words in some human language, one could argue that they belonged in IDNs and IDNA would have a conceptual framework for dealing with them. That doesn't say they should be allowed there or not -- that is where the other arguments Patrik, Andrew, myself, and others have made come in -- but we would have a vocabulary, a framework, and a way of determining success.
But, if emoji are not letters, but symbols, then they have no role in IDNs. They might well be appropriate as some other type of extended domain name labels (I won't try to argue that here), but they aren't IDNs. I note that the ACE model was designed to accommodate other types of extensions (as well as providing for what we would do if IDNA turned out to be the wrong model) so I look forward to your I-D proposing Emoji Domain Names, or perhaps Graphic/Pictographic Domain Names more generally, including how to sort out matching, analogies to normalization, and so on, and following that I-D with a WG proposal.
Given that there are Unicode code points involved, you should still be able to use the Punycode algorithm with a different prefix if that meets your needs. Or, in principle, you could think about EDNSx, a different label type and Class, using UTF-8 directly, and changing DNS matching rules. While the coding is important, it is the other stuff that is hard, especially the issues about what things should be treated as equivalent to what other things and how; those issues have dominated the discussions of mapping and "same name" relationships in the DNS for years. At least until emoji develop further as a language, your best choice (certainly the only easy one and, btw, the one effectively specified by UTR#46) might be exact, codepoint-by-codepoint, matching with no transformations, but one of the things I'm sure of is that users don't expect, and registries don't want, that one.
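To make the "Punycode with a different prefix" idea and the exact-matching option concrete, here is a minimal sketch using the punycode codec that ships with Python. The prefix "zz--" and all function names are invented for illustration; no such prefix is registered or proposed anywhere, and nothing here addresses the equivalence and mapping questions described above.

```python
# Made-up prefix for illustration only; it is NOT a registered ACE prefix.
HYPOTHETICAL_PREFIX = "zz--"


def to_ace(label, prefix=HYPOTHETICAL_PREFIX):
    """Encode a Unicode label with the Punycode algorithm and add a prefix."""
    return prefix + label.encode("punycode").decode("ascii")


def from_ace(ace, prefix=HYPOTHETICAL_PREFIX):
    """Strip the prefix and decode the remainder with the Punycode algorithm."""
    if not ace.lower().startswith(prefix):
        raise ValueError("label does not carry the expected prefix")
    return ace[len(prefix):].encode("ascii").decode("punycode")


def exact_match(a, b):
    """Codepoint-by-codepoint comparison with no mapping or normalization."""
    return [ord(c) for c in a] == [ord(c) for c in b]


if __name__ == "__main__":
    heart = "\u2764"                  # U+2764 on its own
    heart_emoji_style = "\u2764\ufe0f"  # U+2764 followed by variation selector U+FE0F

    encoded = to_ace(heart)
    print(encoded, from_ace(encoded) == heart)

    # The two sequences usually look alike on screen but are different labels
    # under exact matching.
    print(exact_match(heart, heart_emoji_style))
```

The last comparison shows the kind of result exact matching produces: two sequences that a user would likely consider the same name are treated as distinct labels.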
> Any more restrictive desires are going to have a difficult time
> overcoming those forces.
Let me suggest two analogies, neither very good, that may help in thinking about the problem with "those forces":
(i) Suppose the government of an important state of an important country, with a population of around 72 million and, btw, an important Microsoft Development Center, was so offended by their perception of the way Unicode had treated the coding and related rules about their dominant language that they offered to ban sale of all Microsoft products in the state unless those products were converted to support a CCS they considered more appropriate. How do you think the company would, or should, respond to that sort of pressure? Would you go back to code pages with Unicode being only one option? (Hint: my understanding is that the level of offense taken about Unicode was an actual situation and the "start banning systems and applications" plan was actively considered.)
(ii) Suppose that some major enterprise (and Microsoft customer) decided that it needed to increase its (they say) legitimate email marketing efforts, that the filtering arrangements in Outlook and Exchange Server (and maybe elsewhere) were blocking their messages from reaching users, and told you that, if Microsoft didn't eliminate the problematic filters, they would need to move to a different platform and try to convince other large enterprises with an interest in email marketing to do so too. Would Microsoft give in? Does the answer depend on how large the company was?
Again, both examples are (fortunately) hypothetical and the questions rhetorical, but I think they may be helpful in thinking about the demand and your response.
Personally, I think you should consider that sort of request/demand landing on your desk as an opportunity to educate both your customer and your management as to why emoji in domain names would be a bad idea, at least in the near term. Using it instead as the basis for sending what seems like a threat to this mailing list appears less productive to me, especially since there is no WG here that could do anything to change things even if many of its participants wanted to. If you think it is appropriate, prepare the I-D and WG proposal suggested above, but you would still need to explain to the customer why that won't produce an immediate result.
You might also want to encourage the appropriate people or divisions within Microsoft to put pressure on ICANN to develop a clear policy in this area and enforce it so that you aren't singled out as the offenders. I am not convinced that would be a good idea, at least below the root, in general, but the history of ICANN's making rules that it has no apparent will or ability to enforce is not helping with these kinds of situations either.