Browser IDN display policy: opinions sought

Sun Dec 11 02:02:49 CET 2011

On Dec 10, 2011, at 4:15 PM, James Seng wrote:

> Many languages use more than one script in their writing systems. Even Chinese, which most people think of as purely CJK Unified Ideographs, uses ASCII and sometimes other scripts such as Bopomofo.

Of course. Greek uses Latin digits. And so on.

We sang this song about a decade ago. Nothing has changed.

> Instead of trying to say which languages use which sets of scripts and displaying those as U-labels, why not do it the other way round? We know the Latin/Cyrillic combination is a problem. We know there are other combinations of scripts that are a problem. We list those combinations and display them in Punycode UNLESS the "language" of the string is configured in the options.

And we sang that one too. What "language" is a string that is two Han characters and some Latin digits?
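
For concreteness, here is a rough sketch of the blacklist idea quoted above, under several assumptions that are not from the thread: scripts are approximated by the first word of each character's Unicode name (Python's standard library has no real script property), digits and hyphens are treated as script-neutral, the only blacklisted pair is Latin/Cyrillic, and the hypothetical `allowed_scripts` parameter stands in for the user's configured "language". The raw `punycode` codec is used without any of the IDNA mapping or validation steps, so this is an illustration, not a display policy.

```python
import unicodedata

# Hypothetical blacklist of script pairs; only Latin/Cyrillic is named in the thread.
RISKY_PAIRS = {frozenset({"LATIN", "CYRILLIC"})}


def scripts_of(label):
    """Approximate each character's script by the first word of its Unicode name."""
    found = set()
    for ch in label:
        if ch.isdigit() or ch == "-":
            continue  # assumption: digits and hyphens belong to no particular script
        found.add(unicodedata.name(ch).split()[0])
    return found


def display_form(label, allowed_scripts=frozenset()):
    """Return the label unchanged unless it mixes a blacklisted script pair.

    allowed_scripts is a stand-in for the user's configured preferences;
    a risky pair is tolerated only if both of its scripts are allowed.
    """
    scripts = scripts_of(label)
    risky = any(
        pair <= scripts and not pair <= allowed_scripts for pair in RISKY_PAIRS
    )
    if risky:
        # Raw Punycode encoding only; real IDNA processing does more than this.
        return "xn--" + label.encode("punycode").decode("ascii")
    return label


print(display_form("p\u0430ypal"))  # Latin letters plus CYRILLIC SMALL LETTER A
print(display_form("\u4e2d\u6587123"))  # two Han characters and some digits
```

Note that the sketch sidesteps the question above by keying on scripts rather than on a "language": the label made of two Han characters and digits passes through untouched only because digits are ignored, not because any language was identified.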

> I think we need a combination of auto-detection of problem U-labels and a whitelist.

We don't need anything: the rest of the users do. And it needs to come from a stable, trusted body. I question whether any such body that is willing to make a table exists.

--Paul Hoffman