Browser IDN display policy: opinions sought

Gervase Markham gerv at mozilla.org
Fri Dec 9 12:12:29 CET 2011


Recently, Mozilla community member Jothan Frakes was kind enough to do
some research into how different popular web browsers implement IDN,
in particular when they display the real characters and when they
display Punycode. This is in the context of a Mozilla review of our
policy. I am interested in the opinions of people on this list (see
below).

As it turns out, the behaviour of all popular browsers is summarised at
the bottom of a Chromium project document here:
http://www.chromium.org/developers/design-documents/idn-in-google-chrome

The policies fall into 3 approximate buckets:

A (IE, Chrome): Unicode if the (single) 'language' of the string is
configured in the options, Punycode otherwise.

B (Firefox, Opera): Unicode if the TLD is in a whitelist, Punycode
otherwise. Arbitrary script mixing permitted (registry policy used to
prevent abuse).

C (Safari): Unicode if the script is in a whitelist (which by default
does not include Cyrillic or Greek), Punycode otherwise. Not sure about
script mixing.
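
To make the three buckets concrete, here is a rough Python sketch of the
display decision each one implies. It is an illustration only, not any
browser's actual code: the function names are made up, "language" in
Type A is approximated by script, the script of a character is guessed
from the first word of its Unicode name (the standard library has no
Script property), and Type C is written without any script-mixing check
because its behaviour there is not known.

```python
import unicodedata

def script_of(ch):
    # Crude stand-in for the Unicode Script property (an assumption of this
    # sketch): use the first word of the character name, e.g. "GREEK".
    return unicodedata.name(ch, "UNKNOWN").split()[0]

def idn_label_scripts(domain):
    # One set of script names per non-ASCII label; ASCII labels are skipped.
    result = []
    for label in domain.split("."):
        if not label.isascii():
            result.append({script_of(c) for c in label
                           if not (c.isdigit() or c == "-")})
    return result

def render(domain, show_unicode):
    # Fall back to the ASCII (Punycode) form when Unicode display is refused.
    return domain if show_unicode else domain.encode("idna").decode("ascii")

def type_a(domain, user_scripts):
    # Type A: every IDN label is in a single script the user has configured.
    ok = all(len(s) == 1 and s <= user_scripts
             for s in idn_label_scripts(domain))
    return render(domain, ok)

def type_b(domain, tld_whitelist):
    # Type B: only the TLD matters; script mixing is left to registry policy.
    tld = domain.rsplit(".", 1)[-1]
    return render(domain, tld in tld_whitelist)

def type_c(domain, script_whitelist):
    # Type C: every script used in an IDN label must be whitelisted.
    ok = all(s <= script_whitelist for s in idn_label_scripts(domain))
    return render(domain, ok)

domain = "παράδειγμα.gr"
print(type_a(domain, {"GREEK"}))            # user has Greek configured: Unicode
print(type_a(domain, {"LATIN"}))            # user has not: Punycode
print(type_b(domain, {"gr"}))               # TLD whitelisted: Unicode for everyone
print(type_c(domain, {"LATIN", "ARABIC"}))  # Greek not whitelisted: Punycode
```

Note that only type_b takes no per-user input: its answer depends on the
domain alone, not on how the visitor's browser happens to be configured.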


Firefox has historically resisted adopting a Type A policy because we
consider it seriously detrimental to IDN adoption and use. It seems to
me that IDN can never be reliable for site owners, and therefore will
not succeed, if a significant proportion of the world's browsers adopt
Type A or Type C policies. This is because site owners can never know
what proportion of their visitors will see gobbledegook in the URL bar
rather than their nice domain name. Perhaps for sites whose visitors are
all guaranteed to be from a particular country or language group, with
properly-configured browsers and OSes which know that they speak a
certain language or use a certain script, it might work - but I suggest
that's a small subset of all sites. Many people in non-English-speaking
countries still use English OSes and English browsers, with default
settings.

Type C is particularly bad - Russian and Greek IDNs are broken by
default, and even if you persuade your users to enable those scripts,
they can then be mixed-script spoofed. Users get to choose between
functionality and security.

By contrast, with a Type B policy, if your IDN domain works in one copy
of Firefox, it works in them all. If everyone had Type B policies, there
would be no risk of a properly-registered domain coming up as gibberish.

It has been suggested that Firefox switch to a Type A policy. As it is,
the mix of policies means that the goal of universal acceptability is
not being met anyway. Firefox switching to Type A would also not meet
that goal by itself, but one could argue that it would make browser
behaviour a bit more consistent.

I would be interested in the opinion of people on this list as to:

- whether my analysis seems reasonable;
- whether they prefer type A, B or C; and
- whether they see any particular policy as more damaging to IDN
  adoption than another.

Has anyone lobbied one browser manufacturer or another to change their
policy? Is there another option that is not currently in use which would
be better?

(Note that "no restrictions" is not an option, given what happened in
2005 with payp-cyrillic-a-l.com, and I would rather not derail this
debate by rehearsing those arguments again.)
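
For anyone who has not seen that kind of spoof, a minimal Python sketch
follows. It assumes the placeholder above means the second "a" of the
name is U+0430 CYRILLIC SMALL LETTER A, and it uses Python's built-in
"idna" codec (IDNA 2003) purely to show the Punycode form; the variable
names are illustrative.

```python
genuine = "paypal.com"
spoof = "payp\u0430l.com"  # second 'a' replaced by U+0430 (assumed position)

# The two strings render almost identically but are different domains.
print(spoof == genuine)

# The Punycode (ASCII) form makes the difference visible.
ace = spoof.encode("idna").decode("ascii")
print(ace)
```

With no display restrictions at all, a browser would show the first form
in the URL bar, which is why some policy is needed.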

Gerv


More information about the Idna-update mailing list