Browser IDN display policy: opinions sought

Paul Hoffman phoffman at imc.org
Sat Dec 10 18:26:36 CET 2011


First, Mark's correction (which needs to be checked) is an important one:

On Dec 9, 2011, at 3:12 AM, Gervase Markham wrote:

> The policies fall into 3 approximate buckets:
> 
> A (IE, Chrome): Unicode if the (single) 'language' of the string is
> configured in the options, Punycode otherwise.
> 
> B (Firefox, Opera): Unicode if the TLD is in a whitelist, Punycode
> otherwise. Arbitrary script mixing permitted (registry policy used to
> prevent abuse).
> 
> C (Safari): Unicode if the script is in a whitelist (which by default
> does not include Cyrillic or Greek), Punycode otherwise. Not sure about
> script mixing.

Later, Mark Davis said:

On Dec 9, 2011, at 10:10 AM, Mark Davis ☕ wrote:

> I'm not familiar with the code, but I think that (A) may actually be:
> 
> A (IE, Chrome): Unicode if the (single) 'script' of the string matches one of the scripts of the user's language(s) in the options,
> Punycode otherwise.
> 
> It is pretty easy and reliable to detect the script of the string, whereas language detection would be unreliable.

What a few people might be asking for is:

D: Unicode if the label is a single script that is displayable by the browser, Punycode otherwise.

Restated less tersely:

D: If every character in the label comes from a single script as defined in the Unicode Standard, and every character is displayable by the browser without resorting to "unknown" or "fallback" glyphs, display the label; otherwise show Punycode.

This would give zone owners more assurance that their zones will be displayed properly, as long as every label is single-script. It requires no option setting on the part of the user, which is a big win over (A) for users who are multilingual, and it completely avoids the "TLDs we like" problem of (B).
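
To make (D) concrete, here is a rough Python sketch of the decision for a single label. It is only an illustration under stated assumptions, not how any browser does it: the script table is a tiny hand-written stand-in for the Unicode Scripts.txt data, digits and hyphen are treated as script-neutral, and has_glyph is a hypothetical callback standing in for whatever font lookup a browser would really perform. All function names are made up for this example.

```python
# Sketch of policy (D): show Unicode only for single-script, fully
# displayable labels; otherwise show the Punycode (A-label) form.

# ASSUMPTION: rough block ranges for three scripts only. A real
# implementation would use the Script property from Unicode's Scripts.txt.
SCRIPT_RANGES = [
    ("Latin", 0x0041, 0x005A),
    ("Latin", 0x0061, 0x007A),
    ("Latin", 0x00C0, 0x024F),
    ("Greek", 0x0370, 0x03FF),
    ("Cyrillic", 0x0400, 0x04FF),
]

# ASSUMPTION: digits and hyphen may appear alongside any one script.
NEUTRAL = set("0123456789-")


def script_of(ch):
    """Return the script name for ch, "Common" for neutral characters,
    or None if the character is not in this toy table."""
    if ch in NEUTRAL:
        return "Common"
    cp = ord(ch)
    for name, lo, hi in SCRIPT_RANGES:
        if lo <= cp <= hi:
            return name
    return None


def to_a_label(label):
    """Return the Punycode display form of a label."""
    if label.isascii():
        return label  # nothing to encode
    return "xn--" + label.encode("punycode").decode("ascii")


def display_form(label, has_glyph=lambda ch: True):
    """Apply policy (D) to one label.

    has_glyph is a hypothetical predicate that reports whether the
    browser can render ch without an "unknown" or "fallback" glyph.
    """
    scripts = set()
    for ch in label:
        script = script_of(ch)
        if script is None or not has_glyph(ch):
            return to_a_label(label)
        if script != "Common":
            scripts.add(script)
    if len(scripts) > 1:
        return to_a_label(label)  # mixed scripts: fall back to Punycode
    return label


if __name__ == "__main__":
    print(display_form("b\u00fccher"))   # single script, displayable
    print(display_form("p\u0430ypal"))   # Latin with a Cyrillic "a"
```

Note that nothing here consults user options or a TLD list; the only inputs are the label itself and what the browser is able to render.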

Thoughts?

--Paul Hoffman

