referencing IDNA2008 (and IDNA2003?)

John C Klensin klensin at jck.com
Fri Oct 22 21:35:40 CEST 2010



--On Friday, October 22, 2010 12:17 -0700 Adam Barth
<ietf at adambarth.com> wrote:

>> Possible trap: is the domain-value provided by the HTTP
>> server in the Set-Cookie header really an A-label (if the
>> domain is an IDN)?
> 
> I believe so.  Or rather, I think they'll need to be to work
> properly today.  We could run some experiments to be sure, but
> I was told that putting non-ASCII characters in HTTP headers
> is bad news bears.

That is certainly true for other reasons but, as I am regularly
reminded, it doesn't mean that no one is doing it and that no
HTTP server is letting them get away with it.   I think the
world would be much better served in the long run --in terms of
stability and predictable behavior-- if you could take the
position that all cookie contents and communication about them
took place using either traditional LDH (ASCII) forms or
A-labels (note that means, e.g., no ASCII punctuation either).
But whether it is plausible to do that at this stage is
presumably something the WG will need to consider, including
examining whether the "resolution other than via the public DNS"
considerations discussed in draft-iab-i18n-encoding are relevant.

>...
> This text seems symmetric.  The user agent needs to know that
> it should not apply the U-label => A-label conversion to the
> domain-attribute but that it should apply the conversions to
> the request-host.

Not sure I understand.  If it is going to make a comparison,
both strings have to be in the same label form.  So, if either
the domain-attribute or the request-host contains non-ASCII
characters, it needs to convert that string to A-labels
(IDNA2008) or run it through ToASCII (IDNA2003).
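
A minimal sketch of that "same form before comparing" step, assuming
Python and its standard-library "idna" codec.  That codec implements
IDNA2003 ToASCII, not the IDNA2008 rules, and the helper names here
are hypothetical, not taken from any cookie specification:

```python
def to_ascii_form(domain: str) -> str:
    """Convert every label of `domain` to ASCII form (IDNA2003 ToASCII).

    Raises UnicodeError if a label cannot be converted.  The result is
    lower-cased so that plain ASCII labels compare case-insensitively.
    """
    return domain.encode("idna").decode("ascii").lower()


def same_domain(domain_attribute: str, request_host: str) -> bool:
    # Canonicalize both sides first, then do a plain string comparison.
    return to_ascii_form(domain_attribute) == to_ascii_form(request_host)


if __name__ == "__main__":
    print(same_domain("bücher.example", "xn--bcher-kva.example"))
```

Once both strings are in the same form, the comparison itself needs no
IDNA-specific folding.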

Or, it can take a look at the strings, discover whether there
are non-ASCII characters present, and, if they are, simply
reject that putative label as bogus.  I have a personal
preference (partially based on the "private encoding" theme of
draft-iab-i18n-encoding and the small risk of a false positive)
but, again, that is a decision the WG should make or leave to
the UA.
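
The reject-on-sight alternative is even simpler.  A sketch, again in
Python, with a hypothetical helper name; it only tests for non-ASCII
characters and says nothing about whether the ASCII string is
otherwise a valid domain:

```python
from typing import Optional


def ascii_only_or_reject(name: str) -> Optional[str]:
    """Return `name` unchanged if it is pure ASCII, else None (rejected)."""
    return name if name.isascii() else None
```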

>...
 
> That approach doesn't meet requirement (5).  In particular,
> this text uses IDNA-folding comparisons instead of first
> canonicalizing and then applying octet-by-octet comparisons.

Yep.  And that is exactly the argument for forcing any
cookie-related string into A-label form as early as possible and
keeping it that way, rather than having IDNs as A-labels,
U-labels, and assorted nonsense that is neither floating around.
If you know everything is either an A-label, an LDH string, or
an error, then life gets a whole lot easier.
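
A rough sketch of that three-way split, in Python.  The function name
and the result strings are hypothetical, and the check is purely
syntactic: a label that starts with "xn--" is only reported as an
A-label candidate, since confirming a real A-label would need a full
IDNA2008 validity check that is not attempted here:

```python
import re

# LDH syntax: letters, digits, hyphen; 1-63 octets; no leading or
# trailing hyphen.
_LDH = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def classify_label(label: str) -> str:
    """Classify one label as 'a-label-candidate', 'ldh', or 'error'."""
    if not _LDH.fullmatch(label):
        # Non-ASCII characters, ASCII punctuation, bad hyphen
        # placement, or a bad length all land here.
        return "error"
    if label.lower().startswith("xn--"):
        return "a-label-candidate"
    return "ldh"
```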

   john



