Standardizing on IDNA 2003 in the URL Standard

Fri Jan 17 16:56:51 CET 2014

On Fri, Jan 17, 2014 at 02:23:44PM +0100, Anne van Kesteren wrote:
> 
> What's important for interoperability in domain names is translation
> of a sequence of code points to a sequence of bytes that can be used
> within the DNS.

This is part of where we disagree.  What matters for interoperability
is not only the translation you describe, but also that the
translation be reversible: when you get the octets used in the DNS
back, they can always be turned back into the sequence of code points
you started with.  IDNA2003 doesn't have that property, which is the
reason for the backward incompatibility.
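
To make the reversibility point concrete, here is a minimal sketch.  It
assumes Python's built-in "idna" codec, which the Python documentation
describes as implementing RFC 3490 (IDNA2003); the example name
"straße.de" is mine and purely illustrative.

```python
# Python's built-in "idna" codec follows RFC 3490 (IDNA2003), including nameprep.
# The domain name below is an illustrative assumption, not from the thread.
original = "straße.de"

# ToASCII direction: nameprep maps U+00DF (ß) to "ss" before encoding.
wire = original.encode("idna")

# ToUnicode direction: the DNS octets carry no trace of the original ß.
round_tripped = wire.decode("idna")

print(wire)
print(round_tripped)
print(round_tripped == original)
```

The octets that come back from the DNS decode to a different sequence
of code points than the one the user typed, so the mapping is lossy in
exactly the sense described above.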

> If you take IDNA2003, an updated version of Unicode,
> and assume the same algorithms defined in IDNA2003 apply you have an
> algorithm that defines just that.

No, because there aren't algorithms defined in IDNA2003.  There's a
list of code points that are "out"; everything else is allowed.  We'd
actually have to go over the new code points in order to get the new
definition you're talking about.

To get the kind of algorithm-based approach you're talking about, you
have to move to IDNA2008.
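
As a rough sketch of what "algorithm-based" means here: IDNA2008
(RFC 5892) derives a code point's status from its Unicode properties,
so a new version of Unicode yields new answers without anyone editing
a list.  The toy below is deliberately incomplete.  It keeps only the
Unassigned, Unstable, and LetterDigits rules and omits the exceptions
table, the LDH and join-control rules, the ignorable and Hangul jamo
rules, and all contextual rules, so it will disagree with the real
derivation for some code points (the hyphen and ß, for instance).  The
function names are mine.

```python
import unicodedata

# General categories that RFC 5892's LetterDigits rule accepts.
LETTER_DIGITS = {"Ll", "Lu", "Lo", "Lm", "Mn", "Mc", "Nd"}


def is_unstable(ch: str) -> bool:
    """True if ch changes under NFKC(casefold(NFKC(ch)))."""
    folded = unicodedata.normalize(
        "NFKC", unicodedata.normalize("NFKC", ch).casefold()
    )
    return folded != ch


def derived_status(ch: str) -> str:
    """Toy property-based derivation; NOT the full RFC 5892 algorithm."""
    category = unicodedata.category(ch)
    if category == "Cn":
        return "UNASSIGNED"
    if is_unstable(ch):
        return "DISALLOWED"
    if category in LETTER_DIGITS:
        return "PVALID"
    return "DISALLOWED"


for sample in ["a", "A", "7", " "]:
    print(repr(sample), derived_status(sample))
```

The answer for each code point falls out of whatever Unicode data the
running interpreter ships with, which is the contrast with going over
a list of new code points by hand.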

> Then there's another aspect which is UI. Making sure the user is not
> spoofed, etc. 

Surely this is quite a different problem to the above, though, no?
(You're never going to be able to "make sure", of course.  All you can
do is get more or less right.)

Different user agents of course do these things differently.  I sort
of hate the approaches widely used, but I acknowledge that they're
better than nothing.

A

-- 
Andrew Sullivan
ajs at anvilwalrusden.com