Browser IDN display policy: opinions sought
phoffman at imc.org
Fri Dec 9 16:34:18 CET 2011
On Dec 9, 2011, at 3:12 AM, Gervase Markham wrote:
> The policies fall into 3 approximate buckets:
> A (IE, Chrome): Unicode if the (single) 'language' of the string is
> configured in the options, Punycode otherwise.
> B (Firefox, Opera): Unicode if the TLD is in a whitelist, Punycode
> otherwise. Arbitrary script mixing permitted (registry policy used to
> prevent abuse).
> C (Safari): Unicode if the script is in a whitelist (which by default
> does not include Cyrillic or Greek), Punycode otherwise. Not sure about
> script mixing.
Without understanding both how a TLD gets on "a whitelist", and how "registry policy (is) used to prevent abuse", we cannot evaluate whether A or B would be better for Firefox. This information is critical to the analysis.
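To make the three buckets concrete, here is a rough Python sketch of the decision each one makes. It is only an illustration of the descriptions quoted above, not any browser's actual code. Everything named here is an assumption: the whitelist contents, the language-to-script table, and the shortcut of guessing a character's script from the first word of its Unicode name (the standard library has no real script property). Pure-ASCII labels are assumed to always pass.

```python
import unicodedata

# Hypothetical configuration, chosen only for illustration.
LANGUAGE_SCRIPTS = {"en": {"LATIN"}, "ru": {"CYRILLIC"}, "el": {"GREEK"}}
TLD_WHITELIST = {"de", "jp"}   # stand-in for a Type B TLD whitelist
SCRIPT_WHITELIST = {"LATIN"}   # stand-in for a Type C script whitelist


def scripts_of(label):
    """Approximate the scripts in a label via Unicode character names."""
    scripts = set()
    for ch in label:
        if ch.isascii() and (ch.isdigit() or ch == "-"):
            continue  # ASCII digits and hyphen belong to no particular script
        scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def idn_labels(host):
    """Return only the labels that contain non-ASCII characters."""
    return [label for label in host.split(".") if not label.isascii()]


def policy_a(host, user_languages):
    """Type A: single script, and that script matches a configured language."""
    allowed = set()
    for lang in user_languages:
        allowed |= LANGUAGE_SCRIPTS.get(lang, set())
    for label in idn_labels(host):
        found = scripts_of(label)
        if len(found) != 1 or not found <= allowed:
            return False
    return True


def policy_b(host):
    """Type B: only the TLD matters; script mixing is left to the registry."""
    return host.rsplit(".", 1)[-1].lower() in TLD_WHITELIST


def policy_c(host):
    """Type C: every script used in an IDN label must be whitelisted."""
    return all(scripts_of(label) <= SCRIPT_WHITELIST for label in idn_labels(host))


cyrillic = "\u043f\u0440\u0438\u043c\u0435\u0440.de"  # all-Cyrillic label
mixed = "p\u0430ypal.de"                               # Latin with one Cyrillic letter
for host in (cyrillic, mixed):
    print(ascii(host), policy_a(host, ["ru"]), policy_b(host), policy_c(host))
```

Each function returns True when the name would be shown in Unicode and False when it would fall back to Punycode. Under these assumptions, the mixed-script label is shown only by the Type B check, which is why the whitelist and registry-policy questions matter so much.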
> By contrast, with a Type B policy, if your IDN domain works in one copy
> of Firefox, it works in them all. If everyone had Type B policies, there
> would be no risk of a properly-registered domain coming up as gibberish.
If Firefox (and Opera) were the only browsers that the site operator cared about, this would be good. However, I believe that is true for approximately 0% of the sites in the world. (The same would be true if there were a "D" that only applied to Chrome.)
> It has been suggested that Firefox switch to a Type A policy. As it is,
> the mix of policies means that the goal of universal acceptability is
> not being met anyway. Firefox switching to Type A would also not meet
> that goal by itself, but one could argue that there's a bit more
> consistency to browser behaviour.
That has been my feeling all along, although I stopped expressing it a while ago when it seemed like the Firefox team would never change. I'm glad to hear that the discussion is opening up.
Consistency will lead to better IDN adoption because users will be less surprised. As much as I hate guessing the "language" of a string, A seems far more sensible than B, and I was happy to see that Chrome adopted it (although I admit I haven't done any head-to-head testing between Chrome and IE with interesting labels).
Absent anything convincing about how a TLD gets on "a whitelist", and how "registry policy (is) used to prevent abuse", I would hope that Firefox would join Chrome and IE in showing all single-script strings that it believes the user will understand.