referencing IDNA2008 (and IDNA2003?)

Adam Barth ietf at adambarth.com
Fri Oct 22 22:05:28 CEST 2010


On Fri, Oct 22, 2010 at 12:35 PM, John C Klensin <klensin at jck.com> wrote:
> --On Friday, October 22, 2010 12:17 -0700 Adam Barth
> <ietf at adambarth.com> wrote:
>>> Possible trap: is the domain-value provided by the HTTP
>>> server in the Set-Cookie header really an A-label (if the
>>> domain is an IDN)?
>>
>> I believe so.  Or rather, I think they'll need to be to work
>> properly today.  We could run some experiments to be sure, but
>> I was told that putting non-ASCII characters in HTTP headers
>> is bad news bears.
>
> That is certainly true for other reasons but, as I am regularly
> reminded, it doesn't mean that no one is doing it and that no
> HTTP server is letting them get away with it.   I think the
> world would be much better served in the long run --in terms of
> stability and predictable behavior-- if you could take the
> position that all cookie contents and communication about them
> took place using either traditional LDH (ASCII) forms or
> A-labels (note that means, e.g., no ASCII punctuation either).
> But whether it is plausible to do that at this stage is
> presumably something the WG will need to consider, including
> examining whether the "resolution other than via the public DNS"
> considerations discussed in draft-iab-i18n-encoding are relevant.

We're not talking about cookie contents.  We're talking about the
domain-attribute.  The document requires servers to emit the
domain-attribute in the xn-- form; otherwise the user agent will
ignore their cookies.
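
As a rough illustration of the server side of that requirement, here
is a minimal Python sketch.  The function name set_cookie_header and
the example domain are hypothetical, not taken from the document.
Python's built-in "idna" codec implements IDNA2003 ToASCII, so
IDNA2008 behavior would need a different library.

```python
def set_cookie_header(name: str, value: str, domain: str) -> str:
    """Build a Set-Cookie header line whose Domain attribute is ASCII-only.

    Hypothetical helper for illustration; it does no validation of the
    cookie name or value.
    """
    # The stdlib "idna" codec applies IDNA2003 ToASCII label by label,
    # so non-ASCII labels come out in the xn-- form.
    ascii_domain = domain.encode("idna").decode("ascii")
    return f"Set-Cookie: {name}={value}; Domain={ascii_domain}"


print(set_cookie_header("sid", "abc", "bücher.example"))
```

With this approach no non-ASCII octets ever reach the header, so the
user agent has nothing to convert on the domain-attribute side.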

>> This text seems symmetric.  The user agent needs to know that
>> it should not apply the U-label => A-label conversion to the
>> domain-attribute but that it should apply the conversions to
>> the request-host.
>
> Not sure I understand.  If it is going to make a comparison,
> both strings have to be in the same label form.  So, if either
> the domain-attribute or the request-host contains non-ASCII
> characters, it needs to convert those strings to A-labels
> (IDNA2008) or via ToASCII (IDNA2003).

Maybe I misread Peter's text, but his text was symmetric w.r.t. X and
Y.  However, the behavior is not symmetric w.r.t. X and Y, so
something needs to break the symmetry.  The current document breaks
the symmetry by applying various IDNA algorithms to Y (the
request-host) but not to X (the domain-attribute).
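
To make the asymmetry concrete, here is a minimal Python sketch, not
the document's actual algorithm.  The function names are hypothetical,
the stdlib "idna" codec (IDNA2003 ToASCII) stands in for whichever
IDNA algorithm the document specifies, and only exact equality is
shown rather than the full matching rule.

```python
def canonicalize_request_host(request_host: str) -> bytes:
    """Canonicalize Y (the request-host) into lowercase ASCII octets."""
    # IDNA2003 ToASCII via the stdlib codec.  ASCII-only labels pass
    # through unchanged, so lowercase the result explicitly.
    return request_host.encode("idna").lower()


def octets_match(domain_attribute: bytes, request_host: str) -> bool:
    """Compare X (the domain-attribute) with Y (the request-host).

    X is taken exactly as received and never IDNA-processed; only Y is
    canonicalized.  The comparison is octet-by-octet.
    """
    try:
        y = canonicalize_request_host(request_host)
    except UnicodeError:
        # Y cannot be canonicalized, so nothing can match it.
        return False
    return domain_attribute == y


# A server that sent the xn-- form matches.
print(octets_match(b"xn--bcher-kva.example", "bücher.example"))
# A server that sent raw UTF-8 does not match; its cookie is ignored.
print(octets_match("bücher.example".encode("utf-8"), "bücher.example"))
```

Note that nothing inspects X for non-ASCII characters: a
domain-attribute that is not already in canonical form simply fails
the comparison.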

> Or, it can take a look at the strings, discover whether there
> are non-ASCII characters present, and, if they are, simply
> reject that putative label as bogus.  I have a personal
> preference (partially based on the "private encoding" theme of
> draft-iab-i18n-encoding and the small risk of a false positive)
> but, again, that is a decision the WG should make or leave to
> the UA.

No such sniffing is required.

>> That approach doesn't meet requirement (5).  In particular,
>> this text uses IDNA-folding comparisons instead of first
>> canonicalizing and then applying octet-by-octet comparisons.
>
> Yep.  And that is exactly the argument for forcing any
> cookie-related string into A-label form as early as possible and
> keeping it that way, rather than having IDNs as A-labels,
> U-labels, and assorted nonsense that is neither, all floating around.
> If you know everything is either an A-label, an LDH string, or
> an error, then life gets a whole lot easier.

I believe that's what the current document does.

On Fri, Oct 22, 2010 at 12:42 PM, John C Klensin <klensin at jck.com> wrote:
> --On Friday, October 22, 2010 12:26 -0700 Adam Barth
> <ietf at adambarth.com> wrote:
>> It doesn't matter if X (described below) is a valid label.  If
>> it's not an octet-by-octet match for something that we know is
>> well-formed, we just ignore the whole cookie.
>
> And that is why there is an explicit loophole in IDNA2008 that
> permits working with A-labels (or things that look like them)
> without the validation that Vint identifies.  This isn't an
> exact statement of the rule but the validation is basically
> required only if one expects to display the string or is doing
> some other processing that actually requires validity, not just,
> e.g., octet-by-octet string comparison.

We do not regard the value of the domain-attribute as a sequence of
A-labels.  We simply regard it as a sequence of octets, which
simplifies our lives.

Adam


More information about the Idna-update mailing list