Standardizing on IDNA 2003 in the URL Standard

Anne van Kesteren annevk at annevk.nl
Thu Jan 30 02:39:23 CET 2014


On Sat, Jan 18, 2014 at 9:04 AM, John C Klensin <klensin at jck.com> wrote:
> (2) You say "he cares about the definition of a
> `uint8_t* f(codepoint_t* input) { ... }` function and not user
> interface...".  Some of
> us just glaze over and wonder what on earth you think you are
> talking about.  Others react and say "Unless we care about users
> and user interfaces, there is absolutely no point in IDNs: as
> pure identifiers and components of other identifiers, the
> Internet (and other systems) can do perfectly well on ASCII
> identifiers restricted to what is commonly known as the LDH
> form.  In addition, if the issue is really an unambiguous
> function, one wants the dual of that function to work and be
> unambiguous too, and that means you have to prefer IDNA2008 over
> IDNA2003, so what are we arguing about."

So I care about compatibility with the deployed algorithm of going
from domain (code points) to DNS (bytes). That the deployed algorithm
is lossy is unfortunate, but at least for simple cases such as mapping
U+0041 to 0x61 I do not see that changing. That does indeed mean you
do not always get back what you entered. I think Björn and you
captured that correctly. Additionally, I pointed to
http://wiki.whatwg.org/wiki/URL#UI which shows that displaying domain
names is already more complicated than just doing the reverse
algorithm and you will therefore not always see what you entered.
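
To make the lossiness concrete, here is a minimal sketch using Python's
built-in "idna" codec, which implements IDNA 2003 (RFC 3490). It is only an
illustration of the round trip, not the URL Standard's algorithm and not
UTS #46; the helper names to_dns and to_display are hypothetical.

```python
# Sketch only: Python's stdlib "idna" codec follows IDNA 2003 (RFC 3490).
# to_dns / to_display are hypothetical names chosen for this illustration.

def to_dns(domain: str) -> bytes:
    """Map a domain (code points) to the bytes that would go to DNS."""
    return domain.encode("idna")


def to_display(wire: bytes) -> str:
    """Apply the reverse mapping label by label."""
    return wire.decode("idna")


entered = "B\u00fccher.example"  # "Bücher.example", typed with a capital B
wire = to_dns(entered)
shown = to_display(wire)

print(wire)              # b'xn--bcher-kva.example'
print(shown)             # bücher.example
print(shown == entered)  # False: the capital B did not survive
```

The non-ASCII label is case-folded by Nameprep before Punycode encoding, so
the reverse mapping returns the lowercase form rather than what was entered.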

However, I think I have been convinced by this thread that UTS #46
might be good enough as a replacement for IDNA2003. Once it has been
clarified per the feedback I submitted, I will incorporate it in the
URL Standard. It's unfortunate that even UTS #46 is implemented in
different ways. :-(


-- 
http://annevankesteren.nl/

