<font face="georgia,serif">I agree about the usefulness of storing the punycode form. As to what you would like to see, Patrik, I agree there as well: the goal is IDNA2008. And I think we'll get there eventually, once the major registries disallow the registration of non-IDNA2008 names. (Remember that "registries" includes many orders of magnitude more than just TLD registries.) </font><span class="Apple-style-span" style="font-family: georgia, serif; ">It works fine for specs to indicate that support of non-A-label punycode is considered a transitional strategy, as long as that support is allowed.</span><div>

<div><font face="georgia,serif"><br></font></div><div><font class="Apple-style-span" face="georgia, serif">Just to make clear, if <span class="Apple-style-span" style="font-size: 13px; border-collapse: collapse; ">HTTP Cookies</span> is to have a general mechanism that works for clients right now and in the near term, it cannot be restricted to A-labels (that is, only those punycode labels that are also IDNA2008 compliant). For example, Chrome and the other browsers will need to store values for any of the domain names that they handle, which includes the IDNA2003 domain names that they currently deal with. If you give them a spec that doesn't allow them to do what they need to do, then they will either have to be noncompliant with the spec or use a different mechanism. Neither is particularly desirable.</font></div>
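<div><font face="georgia,serif"><br>To make the storage point concrete, here is a minimal sketch, not taken from any browser or from the cookie spec. It assumes Python's built-in "idna" codec (which implements IDNA2003 ToASCII, not UTS #46 and not IDNA2008) as a stand-in for whatever processing a client actually uses; the helper names to_stored_form and same_host are hypothetical. It shows that a client handling IDNA2003 names ends up storing punycode labels that are not A-labels.</font></div>

```python
def to_stored_form(domain):
    """Return the ASCII (punycode) form of a domain name via IDNA2003 ToASCII."""
    # Python's built-in "idna" codec applies Nameprep and Punycode per label
    # (IDNA2003). ASCII labels pass through unchanged, so lowercase explicitly.
    return domain.encode("idna").decode("ascii").lower()


def same_host(domain_attribute, request_host):
    """Compare two domain names by their stored ASCII forms (hypothetical helper)."""
    return to_stored_form(domain_attribute) == to_stored_form(request_host)


# An ordinary IDN: the stored key is the punycode form.
print(to_stored_form("bücher.example"))  # xn--bcher-kva.example

# A symbol label that IDNA2003 accepts. IDNA2008 disallows it, so the
# result is a punycode label that is not an A-label.
print(to_stored_form("☃.example"))  # xn--n3h.example

# Comparison happens on the stored forms, whichever form arrived as input.
print(same_host("BÜCHER.example", "xn--bcher-kva.example"))  # True
```

<div><font face="georgia,serif">A spec that only permitted A-labels would leave the second name with no valid stored form, even though clients resolve it today.</font></div>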

<div><font face="georgia,serif"><br></font></div><div><span class="Apple-style-span" style="font-family: georgia, serif; ">It is not a trivial matter in a world of connected software to make backwards-incompatible changes; these things take time to resolve. And IDNA2008 has only been out since August. Those of us on the implementation side have to deal with the transition in a graceful manner; otherwise it is we who get the customers complaining that features that used to work fine now break. All of the effort involved in producing and maintaining UTS #46 was not undertaken for trivial reasons; browser vendors, search engines, and others need to ensure that there is a smooth transition to IDNA2008.</span></div>

<div><font class="Apple-style-span" face="georgia, serif"><br></font></div><div><span class="Apple-style-span" style="font-family: georgia, serif; ">Mark</span></div><div><font face="georgia, serif"><br><i>— Il meglio è l’inimico del bene —</i></font><br>


<br><br><div class="gmail_quote">2010/10/23 Patrik Fältström <span dir="ltr"><<a href="mailto:patrik@frobbit.se">patrik@frobbit.se</a>></span><br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

<div class="im"><br>

On 22 Oct 2010, at 21.35, John C Klensin wrote:<br>

<br>

> So, if either<br>

> the domain-attribute or the request-host contain non-ASCII<br>

> characters, it needs to convert those strings to A-labels<br>

> (IDNA2008) or via ToASCII (IDNA2003).<br>

<br>

</div>It is a little bit more complicated than this, unfortunately. If what you get as "input" (either X or Y) might be an IRI, there is a set of IRIs that, the way I read the IRI spec, might contain strings that are not IDNA2008 compatible. I have lately started to believe that the only IRIs I would like to see in a context like yours are the ones that a) are in UTF-8 and b) fulfil the requirement that they can be transformed to a URI and back with the 1:1 mapping specified in the IRI spec.<br>


<br>

Now there is a new IRI draft out, and I have not checked the details in it, but I think we all would like to have:<br>

<br>

- IDNA2008, where there is a 1:1 mapping between A-label and U-label, and no mapping like in IDNA2003 (any mapping _must_ really happen outside of whatever distributed comparison algorithm we are using)<br>

<br>

- IRIs and URIs that only contain domain names that are IDNA2008 compatible (U-label or A-label in the domain name part)<br>

<br>

If we start with that as base rules, then you can hopefully in your spec add additional "temporary rules" that might be recommended for backward compatibility reasons. But I think you should really call them that.<br>


<br>

If you have these rules, then you can -- modulo A-label/U-label transformation and URI/IRI transformation, which are both 1:1 -- do a much simpler comparison than you otherwise could if you had to start transforming Unicode strings (regardless of the encoding of the Unicode string).<br>


<br>

What is important, though, is that in the security considerations section you explicitly note that there are many, many combinations of octets that are not only invalid when these rules are applied, but that, if you are unlucky, might cause buffer overflow issues (at best) when you try to do various things with the strings, such as A-label/U-label transformation.<br>


<font color="#888888"><br>

   Patrik<br>

</font><div><div></div><div class="h5"><br>


</div></div></blockquote></div><br></div></div>