Fwd: referencing IDNA2008 (and IDNA2003?)

Mark Davis ☕ mark at macchiato.com
Sun Oct 24 03:27:18 CEST 2010

The following was bounced from the idna-update at alvestrand.no, so trying


---------- Forwarded message ----------
From: Mark Davis ☕ <mark at macchiato.com>
Date: 2010/10/23
Subject: Re: referencing IDNA2008 (and IDNA2003?)
To: Patrik Fältström <patrik at frobbit.se>
Cc: John C Klensin <klensin at jck.com>, Andrew Sullivan <ajs at shinkuro.com>,
idna-update at alvestrand.no, jean-michel bernier de portzamparc <
jmabdp at gmail.com>, Peter Saint-Andre <stpeter at stpeter.im>, Vint Cerf <
vint at google.com>, Adam Barth <ietf at adambarth.com>,
Jeff.Hodges at kingsmountain.com, duerst at it.aoyama.ac.jp,
Marc.Blanchet at viagenie.ca

I'm in agreement about the usefulness of storing the punycode form. As to
what you would like to see, Patrik, I'm in agreement there as well; that the
goal is IDNA2008. And I think we'll get there eventually, when the major
registries disallow the registrations of non-IDNA2008 names. (Remembering
that "registries" includes many orders of magnitude more than just TLD
registries.) It works fine in specs to indicate that support of
non-A-Label punycode is considered a transitional strategy, as long as it is

Just to make clear, if HTTP Cookies has a general mechanism that is to work
for clients right now and in the near term, it doesn't work to restrict to
A-labels (that is, only those punycode labels that are also IDNA2008
compliant). For example, Chrome and the other browsers will need to store
values for any of the domain names that they handle, which includes IDNA2003
domain names that they currently deal with. If you give them the choice of a
spec that doesn't allow them to do what they need to do, then they will
either have to be noncompliant with the spec, or use a different mechanism.
Neither is particularly desirable.
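
To make the distinction concrete, here is a rough Python sketch of what
"storing the punycode form" could look like for a client that still handles
IDNA2003 names. It relies on Python's built-in "idna" codec, which implements
IDNA2003 ToASCII; the function name storage_key is made up for illustration
and does not come from the cookie spec or from any browser.

```python
def storage_key(host: str) -> str:
    """Return an ASCII (punycode) form of `host` using IDNA2003 ToASCII.

    Hypothetical helper for illustration only.
    """
    # Python's built-in "idna" codec implements IDNA2003 (RFC 3490),
    # including the nameprep mapping step.
    ascii_host = host.encode("idna").decode("ascii")
    # Pure-ASCII labels pass through the codec unchanged, so fold case here.
    return ascii_host.lower()


print(storage_key("Bücher.example"))  # xn--bcher-kva.example
# U+2665 is a symbol: accepted by IDNA2003, DISALLOWED by IDNA2008.
# The result is punycode with the xn-- prefix, but it is not an A-label.
print(storage_key("♥.example"))       # xn--g6h.example
```

A store that accepted only A-labels would have no key for the second name,
even though clients that implement IDNA2003 handle it today.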

It is not a trivial matter in a world of connected software to make
backwards-incompatible changes; these things take time to resolve. And
IDNA2008 has only been out since August. Those of us on the implementation
side have to deal with the transition gracefully; otherwise it is we who
hear from customers that features that used to work fine now break. All of
the effort involved in producing and maintaining UTS #46 was not undertaken
for trivial reasons; browser vendors, search engines, and others need to
ensure that there is a smooth transition to IDNA2008.


*— Il meglio è l’inimico del bene —*

2010/10/23 Patrik Fältström <patrik at frobbit.se>

> On 22 okt 2010, at 21.35, John C Klensin wrote:
> > So, if either
> > the domain-attribute or the request-host contain non-ASCII
> > characters, it needs to convert those strings to A-labels
> > (IDNA2008) or via ToASCII (IDNA2003).
> It is a little bit more complicated than this unfortunately. If what you
> might get as "input" (either X or Y) might be an IRI, there is a set of IRIs
> that the way I read the IRI spec might contain strings that are not
> IDNA-2008 compatible. I have lately started to believe that the only IRIs I
> would like to see in a context like yours are the ones that a) are in UTF-8
> and b) fulfil the requirement that they can be transformed to a URI and back
> with a 1:1 mapping specified in the IRI spec.
> Now there is a new IRI draft out, and I have not checked the details in it,
> but I think we all would like to have:
> - IDNA2008 where there is a 1:1 mapping between A-label and U-label, and no
> mapping like IDNA-2003 (potential mapping _must_ really happen outside of
> whatever distributed comparison algorithm we are using)
> - IRIs and URIs that only contain domain names that are IDNA2008 compatible
> (U-label or A-label in the domain name part)
> If we start with that as base rules, then you can hopefully in your spec
> add additional "temporary rules" that might be recommended for backward
> compatibility reasons. But I think you should really call them that.
> If you have these rules, then you can -- modulo A-label/U-label
> transformation and URI/IRI transformation that both are 1:1 -- do much
> simpler comparison than what you otherwise can do if you have to start doing
> transformation of Unicode strings (regardless of the encoding of the Unicode
> string).
> What is important though is that you in the security consideration section
> explicitly note that there are many many many combinations of octets that not
> only are invalid when these rules are applied, but if you are unlucky you
> might get buffer overflow issues (at best) when trying to do various things
> with the strings. Like doing A-label/U-label transformation.
>   Patrik
> _______________________________________________
> Idna-update mailing list
> Idna-update at alvestrand.no
> http://www.alvestrand.no/mailman/listinfo/idna-update
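
Two of the points in the quoted message (the 1:1 mapping between A-label and
U-label, and malformed input being a hazard during conversion) can be
sketched in Python as follows. This is only an illustration under stated
assumptions: the helper names to_unicode_label and round_trips are invented
here, the code uses Python's built-in "punycode" and "idna" (IDNA2003)
codecs, and a successful round trip is a necessary condition for an A-label
but not a sufficient one, since IDNA2008 also restricts which code points a
U-label may contain.

```python
from typing import Optional

ACE_PREFIX = "xn--"


def to_unicode_label(label: str) -> Optional[str]:
    """Decode one xn-- label; return None for malformed input instead of raising."""
    lowered = label.lower()
    if not lowered.startswith(ACE_PREFIX):
        return lowered
    try:
        # encode("ascii") rejects stray non-ASCII characters; the punycode
        # decoder rejects truncated or out-of-range encodings.
        return lowered[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    except UnicodeError:
        return None


def round_trips(label: str) -> bool:
    """True if decoding and re-encoding reproduces the same xn-- label."""
    decoded = to_unicode_label(label)
    if decoded is None:
        return False
    reencoded = ACE_PREFIX + decoded.encode("punycode").decode("ascii")
    return reencoded == label.lower()


print(to_unicode_label("xn--strae-oqa"))  # straße
print(round_trips("xn--strae-oqa"))       # True
print(to_unicode_label("xn--99999999"))   # None (incomplete punycode)

# By contrast, the IDNA2003 mapping step is not 1:1: sharp s becomes "ss",
# so the original spelling cannot be recovered from the ASCII form.
print("straße".encode("idna"))            # b'strasse'
```

With only the round-trip forms in play, comparison reduces to comparing
lowercased ASCII strings; the lossy mapping in the last line is the kind of
step that would have to happen outside the comparison.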