referencing IDNA2008 (and IDNA2003?)

Adam Barth ietf at adambarth.com
Fri Oct 22 23:00:41 CEST 2010


On Fri, Oct 22, 2010 at 1:46 PM, J-F C. Morfin <jfc at morfin.org> wrote:
> At 20:38 22/10/2010, Adam Barth wrote:
> I'm not sure I quite understood everything that was mentioned during
> this thread.  Concretely, here's what cookies need to do:
>
> To make sure we all understand well I think it would be nice to be sure we
> talk of the same things in documenting:
>
> - who is "we"

When I say "we", I mean the httpstate working group.

> - what does mean "receive from the network"

More precisely, "receive as part of an HTTP response"

> - what is the user agent

The user agent is a concept from HTTP.  It refers to a class of HTTP
clients, namely those with users (as opposed to network
intermediaries).

> - where do we have the URL from ? (in reference for example to the RFC 5895
> special case).

That's inessential.  Sometime in the past, the user agent generated an
HTTP request for a particular URL.

> - X is a lowercase A-label

Not necessarily.  X is a sequence of octets.

> Suggestion: to be sure you cannot be mistaken, do not consider U-Label as
> Unicoded Labels you could match one way or another, but as User-Labels which
> can be many different things ranging from ASCII label to a converted snap of
> user fingers as Portzamparc hinted it. You will _never_ know which UTS 36,
> RFC 5895, etc. etc. process has been used to convert U-labels into lowercase
> A-labels. Just remember that at Project.FRA we do not know yet (we want to
> experiment first with people from many other languages) how we will support
> plain French language "U-labels" as far as majuscules are concerned (same
> for other Latin languages). Being a majuscule has an impact on the word's meaning
> and this is one of the metadata information Unicode does not support.

This paragraph was entirely mysterious to me.

> I understand that you want to "specify the syntax and semantics
> of these headers as they are actually used _on_ the Internet." IMHO
> you follow the same thinking as IDNA2003, while IDNA2008 should make you
> say "_through_ the Internet". You want to document "on the
> rope" something now documented as being "off the rope"
> (still networked but controlled by the user). You meet the same problem
> as the WG/IDNABIS met. A problem of IS/MUST/SHOULD/MIGHT that demands
> that you clarify the scope of the communication process you want to
> stabilize to understand where you can use MUST, SHOULD and MIGHT - and if
> IS/ARE have to be considered that are imposed on you.

This paragraph is also entirely mysterious to me.  Where did the rope come from?

> IMHO to be of universal good value cookies should only be Internet
> documented semi-stable data containers (therefore relying only on inner
> Internet controlled elements such as A-labels).

We're not taking a position about whether something is or is not a
universal good.  We're taking a position as to what is useful for
folks who would like to build HTTP user agents.

> You cannot know all the different ways people will use it. But you must
> tell them that if they want to use it in your way, they MUST follow what
> you say. And, because you write a MUST you MUST be in control of the
> feasibility of what you demand. On an "end to end" basis you
> only control the "inner Internet" (within the Iron Curtain) DNS
> entry/output, i.e. the lowercase A-label. And you can also trust it: they
> are the xn-- labels Registries have actually registered, under IDNA2003
> and IDNA2008.

I have very little power to enforce my will upon others.  I believe
the Iron Curtain was lifted near the end of the cold war, so I'm not
overly concerned with what network infrastructure they might have used
that far in the past.

> By nature a robust architecture is synergetic. This is synergetics.

These statements are mysterious to me.

> Seen that way, your cookies make things simple and the whole net simpler
> and more robust (RFC 3439) and increase the robustness of the whole
> internet.

I'm glad you hold that opinion.  From my perspective, cookies are
not simple, nor do they make the whole Internet more robust.
However, they're what we've got.

Adam
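
For readers who prefer code, the five steps quoted below can be
sketched roughly as follows.  This is only an illustration under
stated assumptions, not text from the cookie spec: the function names
are made up, Python's built-in "idna" codec (which implements
IDNA2003) merely stands in for whichever IDNA variant a user agent
chooses, and "some relation to X" is assumed to mean "equal to X, or
ending in a dot followed by X".

```python
from urllib.parse import urlsplit


def canonicalize_received(x: bytes) -> bytes:
    """Step 1: lower-case the octets received in the HTTP response."""
    # bytes.lower() only changes ASCII A-Z; other octets pass through.
    return x.lower()


def canonicalize_request_host(url: str) -> bytes:
    """Steps 2 and 4: take the request URL's host, convert to xn-- form."""
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError(f"no host component in {url!r}")
    # Assumption: the stdlib "idna" codec (IDNA2003) is a stand-in for
    # the user agent's own choice of IDNA2003 or IDNA2008.
    return host.encode("idna").lower()


def domain_matches(y: bytes, x: bytes) -> bool:
    """Step 5: plain octet-by-octet comparison, with no folding."""
    if not x:
        return False
    # Assumption: "some relation" means Y equals X or ends with "." + X.
    return y == x or y.endswith(b"." + x)


if __name__ == "__main__":
    x = canonicalize_received(b"XN--BCHER-KVA.Example")
    y = canonicalize_request_host("http://www.bücher.example/index.html")
    print(domain_matches(y, x))
```

All canonicalization happens before the comparison, so the comparison
itself stays a simple test on octets, which is the point of step 5.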


> 1) We receive a sequence of octets from the network, which we convert
> to lower case.  Let's call this X.
> 2) We have a URL that the user agent has used to generate an HTTP
> request.  Let's call the host name component of this URL Y.
> 3) X is a sequence of octets that has all the crazy xn-- stuff.
> 4) We need to transform Y to the crazy xn-- form to see if it's in
> some relation to X.
> 5) For our sanity, we'd like to use octet-by-octet comparison, without
> reference to any kind of folding (e.g., case-folding, IDNA-folding,
> etc).
>
> As an added constraint, we don't feel it's our place to mandate to user
> agents whether they ought to use IDNA2003 or IDNA2008.  User agents
> are free to make the decision independently of what the cookie spec
> says.  You should feel free to lobby them one way or another, but
> we're not going to impose that requirement on them in this document.
>
> My understanding is that the text in the spec meets our requirements.
> If that's not the case, please let me know.
>
> Adam
>
>
> On Fri, Oct 22, 2010 at 7:48 AM, Andrew Sullivan <ajs at shinkuro.com> wrote:
>> On Fri, Oct 22, 2010 at 10:29:19AM -0400, Vint Cerf wrote:
>>> andrew,
>>>
>>> we were pretty explicit that the algorithm that produces A-labels
>>> produces only lower case. check with John Klensin.
>>
>> I remember talking about it, and I remember this being an issue
>> because Punycode does not actually require lower case.  But I can't
>> put my fingers on the text where it says this right now.  I haven't
>> looked that hard, however.
>>
>> The reason I haven't looked hard is that it doesn't matter.  There is
>> absolutely no way we can enforce any restriction in the DNS that
>> requires the label to remain lower case.  Though DNS is supposed to be
>> ASCII-case-preserving but ASCII-case-insensitive, the plain fact is
>> that not every implementation does this, or does it correctly.  (I
>> recall quite clearly pointing this out during the WG discussions,
>> because some implementations use compression pointers to the original
>> query string and therefore get whatever was asked by an application.)
>> Applications can put their LDH queries in _in any case at all_ and
>> have them work.  An IDNA2008-unaware stack with an IDNA2008-aware
>> application above might do anything, including converting everything
>> in the label to upper case (try logging into an old-fashioned UNIX
>> console with the caps lock on.  You'll do a lookup for XN--SOMETHING
>> no matter what you intend).  If it does that, and happens to query
>> through a caching name server, the upper case form will persist in the
>> cache.  You still have to treat that label as matching the lower case
>> U-label.  We couldn't do all this above the DNS if you didn't have to.
>>
>> The case-preserving, case-insensitive feature of DNS was, in my
>> opinion, a grave error.  But it's an error we have to live with
>> forever if we're going to continue using DNS.  You simply cannot build
>> a case-sensitive layer atop the DNS if any of the US-ASCII code points
>> in the labels you want to use are themselves to be case sensitive.  If
>> you want to do something clever like Punycode, only for the entire
>> Unicode range (i.e. including that which overlaps with US-ASCII) so
>> that you never have a transparent map between the DNS name and the
>> user-presented name, then you have the possibility of introducing case
>> sensitivity to the naming system (but not the DNS).  Otherwise, you're
>> out of luck.
>>
>> A
>>
>>
>> --
>> Andrew Sullivan
>> ajs at shinkuro.com
>> Shinkuro, Inc.
>>
> _______________________________________________
> Idna-update mailing list
> Idna-update at alvestrand.no
> http://www.alvestrand.no/mailman/listinfo/idna-update

