Standardizing on IDNA 2003 in the URL Standard

Anne van Kesteren annevk at annevk.nl
Wed Jan 15 17:26:22 CET 2014


On Sat, Aug 24, 2013 at 1:40 PM, Mark Davis ☕ <mark at macchiato.com> wrote:
> I put out some strawman ideas on this list, but clearly there needs to be
> more discussion. I think everyone recognizes that we won't get to zero
> "breaking" IDNA2003 URLs; the goal should be to get to a small enough number
> that the major players feel comfortable flipping the switch on the remaining
> ones.
>
> Back on Sept 9.

It's been a couple of months. Any updates for us?

Things I found that IDNA2003 does not address and that
http://url.spec.whatwg.org/#concept-host-parser papers over:

* Percent-decoding
* Rejecting certain ASCII code points to ensure idempotency, but not
e.g. "_" as that would break sites
* Lowercasing the ASCII code points, as IDNA2003 only applies if
there's a non-ASCII code point
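
To make the three steps above concrete, here is a rough Python sketch.
It is not the host parser from the URL Standard: the function names,
the order of the steps, and the FORBIDDEN set are all assumptions made
up for illustration, and the IDNA step itself is left out.

```python
from urllib.parse import unquote

# Hypothetical set of rejected ASCII code points, for illustration only.
# It is NOT the list from the URL Standard. "_" is deliberately absent,
# since rejecting it would break existing sites.
FORBIDDEN = set("\x00\t\n\r #%/:?@[\\]")


def ascii_lowercase(s):
    """Lowercase only A-Z; non-ASCII code points are left untouched."""
    return "".join(chr(ord(c) + 0x20) if "A" <= c <= "Z" else c for c in s)


def preprocess_host(raw):
    """Apply the three steps that IDNA2003 itself does not cover."""
    # 1. Percent-decoding (unquote decodes the bytes as UTF-8 by default).
    decoded = unquote(raw)
    # 2. Reject certain ASCII code points. A "%" that survives decoding
    #    would make a second pass produce a different result, so
    #    rejecting it keeps the operation idempotent.
    bad = FORBIDDEN.intersection(decoded)
    if bad:
        raise ValueError("forbidden code point(s) in host: %r" % sorted(bad))
    # 3. Lowercase the ASCII code points, since an all-ASCII host would
    #    otherwise pass through with its case unchanged.
    return ascii_lowercase(decoded)
```

With this sketch, preprocess_host("ex%41mple.COM") gives "example.com",
"a_b.example" passes through unchanged, and "ex%25ample.com" is
rejected because it decodes to a host that still contains "%".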

I have not checked which of the steps above can be removed if we use
UTS #46 instead. Certainly referencing IDNA2008 directly does not work, as
"A.com" does not become "a.com", which would presumably break too many
scripts.


-- 
http://annevankesteren.nl/

