Standardizing on IDNA 2003 in the URL Standard

Anne van Kesteren annevk at
Thu Aug 22 13:11:55 CEST 2013

On Thu, Aug 22, 2013 at 12:02 PM, Gervase Markham <gerv at> wrote:
> It's not been possible to register names like ☺☺☺.com for some time now;
> that's a big clue.

I don't think it is. There are sites out there that rely on underscores
working in subdomains, yet you cannot register a domain name with an
underscore.

> (Are your friends really using ?)

Yeah (with "example" replaced). Renders fine in Safari, too.

> IIRC, we must have broken a load of URLs when we decided that %-encoding
> in URLs should always be interpreted as UTF-8 (in RFC 3986), whereas
> beforehand it depended on the charset of the page or form producing the
> link. Why did we do that? Because the new way was better for the future,
> and some breakage was acceptable to attain that goal.

Actually, I don't think we did. And the reason for that is that the
non-ASCII usage was primarily in the query string. And as it happens,
we still use the page's character encoding to go from code points to
percent-escaped bytes there. The IETF STD doesn't admit to
this, which is part of the reason why we have now.
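
To make the query string point concrete, here is a rough sketch in Python.
It only illustrates the idea and is not what any browser actually runs: the
helper name encode_query_value is made up for this example, and it ignores
what happens to characters the page's encoding cannot represent.

```python
from urllib.parse import quote


def encode_query_value(value: str, page_encoding: str) -> str:
    """Hypothetical helper: percent-escape a query value.

    The code points in `value` are first turned into bytes using
    `page_encoding`, and each non-ASCII byte is then percent-escaped.
    """
    return quote(value, safe="", encoding=page_encoding)


# The same code point (U+00E9) ends up as different bytes, and so as
# different percent-escapes, depending on the encoding of the page.
utf8_form = encode_query_value("é", "utf-8")
legacy_form = encode_query_value("é", "windows-1252")

print(utf8_form)    # %C3%A9
print(legacy_form)  # %E9
```

So a server that expects %E9 in the query keeps working only as long as the
client continues to honour the page's encoding there.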

