Standardizing on IDNA 2003 in the URL Standard

Jungshik SHIN (신정식) jshin1987+w3 at
Fri Aug 23 07:59:20 CEST 2013

On Thu, Aug 22, 2013 at 4:11 AM, Anne van Kesteren <annevk at> wrote:

> On Thu, Aug 22, 2013 at 12:02 PM, Gervase Markham <gerv at>
> wrote:
> > It's not been possible to register names like ☺☺☺.com for some time now;
> > that's a big clue.
> I don't think it is. There are sites out there that rely on underscores
> working in subdomains. You cannot register a domain name with an underscore.
> > (Are your friends really using ?)
> Yeah (with "example" replaced). Renders fine in Safari, too.
> > IIRC, we must have broken a load of URLs when we decided that %-encoding
> > in URLs should always be interpreted as UTF-8 (in RFC 3986), whereas
> > beforehand it depended on the charset of the page or form producing the
> > link. Why did we do that? Because the new way was better for the future,
> > and some breakage was acceptable to attain that goal.
> Actually, I don't think we did. And the reason for that is that the
> non-ASCII usage was primarily in the query string.

Well, there are tons of URLs whose path part has non-ASCII characters.
They're very common in Korea, for instance.
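
To make the encoding dependence concrete, here is a minimal sketch in
Python. It assumes a hypothetical path segment "한글" and compares
percent-encoding it as UTF-8 with percent-encoding it as EUC-KR, a legacy
encoding chosen here only for illustration. The variable names are
illustrative and not taken from any specification.

```python
from urllib.parse import quote, unquote

# Hypothetical non-ASCII path segment (an assumption for this example).
segment = "한글"

# Percent-encode the same code points via two different character encodings.
utf8_form = quote(segment, encoding="utf-8")
euckr_form = quote(segment, encoding="euc-kr")

print(utf8_form)   # %ED%95%9C%EA%B8%80
print(euckr_form)  # %C7%D1%B1%DB

# Decoding only round-trips when the reader assumes the encoding
# that the producer actually used.
print(unquote(euckr_form, encoding="euc-kr") == segment)  # True
print(unquote(euckr_form, encoding="utf-8") == segment)   # False
```

The same characters yield two unrelated byte sequences, so a consumer
that always assumes UTF-8 misreads a link that was produced as EUC-KR.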

> And as it happens,
> we still use the character encoding to go from code points to
> percent-escaped byte code points there. The IETF STD doesn't admit to
> this, which is part of the reason why we have
> now.

More information about the Idna-update mailing list