referencing IDNA2008 (and IDNA2003?)

Sat Oct 23 02:35:29 CEST 2010

At 23:00 22/10/2010, Adam Barth wrote:
>On Fri, Oct 22, 2010 at 1:46 PM, J-F C. Morfin <jfc at morfin.org> wrote:
> > At 20:38 22/10/2010, Adam Barth wrote:
> > I'm not sure I quite understood everything that was mentioned during
> > this thread.  Concretely, here's what cookies need to do:
> >
> > To make sure we all understand well I think it would be nice to be sure we
> > talk of the same things in documenting:

Dear Adam,

As I explained, IDNA2008 has built a separation between the Internet
standards as documented by the IETF (RFC 5890 to 5894) and the
Internet Use standards as initiated by the "unusual" RFC 5895 (*). We
(the small kernel of the IU community) need words to describe the result.

We have become accustomed to calling this separation the "Iron Curtain",
for many reasons, and to calling "rope" the network wire plus its "off
the rope" extensions outside of the "Iron Curtain", through what we call
the IUI (Internet Use Interface). We also dub the IUI the "no-man's
land", because that part is for extended network use processes, not
for users or user agents, and because one proposal is for "smart
mines" protection in that area.

We found this wording better than the "iron wall" initially proposed by
analogy with "firewall", and it has been well received by people who have
to understand the architectural implications of IDNA2008.

(*) This is well expressed in this key part of RFC 5895: "It should be
noted that this document does not specify the behavior of a protocol 
that appears "on the wire". It describes an operation that is to be 
applied to user input in order to prepare that user input for use in 
an "on the network" protocol. As _unusual_ as this may be for a 
document concerning Internet protocols, it is necessary to describe 
this operation for implementors who may have designed around the 
original IDNA protocol (herein referred to as IDNA2003), which 
conflates this user-input operation into the protocol."

> > - who is "we"
>When I say "we", I mean the httpstate working group.

Got it. I thought you were talking about the processes having to deal 
with the protocol.

> > - what does mean "receive from the network"
>More precisely, "receive as part of an HTTP response"

Through an OPES or not (RFC 4236)? One way to implement the IUI is
to use OPES or ONES (networked OPES).

> > - what is the user agent
>The user agent is a concept from HTTP.  It refers to a class of HTTP
>clients, namely those with users (as opposed to network
>intermediaries).

I understand. But there are many ways to implement IDNA2008
on the user side. Users may run their own DNS agents or OPES agents,
either on an individual basis or networked.

> > - where do we have the URL from ? (in reference for example to the RFC 5895
> > special case).
>
>That's inessential.  Sometime in the past, the user agent generated an
>HTTP request for a particular URL.

You implied that this URL was not in lowercase A-label form. Whether now
or in the past, it is essential to know where the system got it, in order
to understand its nature and format.

> > - X is a lowercase A-label
>
>Not necessarily.  X is a sequence of octets.

OK, then you have changed your schema. If X is a sequence of octets
and Y is of unknown format, I have some difficulty understanding how
IDNA2008 can help.

> > Suggestion: to be sure you cannot be mistaken, do not consider U-labels as
> > Unicode labels you could match one way or another, but as User-Labels, which
> > can be many different things, ranging from an ASCII label to a converted snap
> > of the user's fingers, as Portzamparc hinted. You will _never_ know which
> > UTS 36, RFC 5895, etc. process has been used to convert U-labels into
> > lowercase A-labels. Just remember that at Project.FRA we do not yet know (we
> > want to experiment first with people from many other languages) how we will
> > support plain French-language "U-labels" as far as majuscules are concerned
> > (same for other Latin languages). Being a majuscule has an impact on a word's
> > meaning, and this is one of the pieces of metadata that Unicode does not
> > support.
>
>This paragraph was entirely mysterious to me.

It may not be obvious if you do not dissociate the Internet from Internet
Use. Compare with the telephone: it works for people of every culture
and language. Yet if you want to build a phone service for users, you
need to know more than sequences of octets, and probably one of their
25,000 languages. Here it is the same. IDNA2008 has relieved the
Internet side of the Iron Curtain of the worry of languages, scripts,
and languages+scripts (orthotypography).

You will not be interoperable/compatible with the Internet IDNA2008 
if you reintroduce them.

> > I understand that you want to "specify the syntax and semantics
> > of these headers as they are actually used _on_ the Internet." IMHO
> > you follow the same thinking as IDNA2003, while IDNA2008 should make you
> > say "_through_ the Internet". You want to document "on the
> > rope" something now documented as being "off the rope"
> > (still networked but controlled by the user). You meet the same problem
> > as the WG/IDNABIS met. A problem of IS/MUST/SHOULD/MIGHT that demands
> > that you clarify the scope of the communication process you want to
> > stabilize to understand where you can use MUST, SHOULD and MIGHT - and if
> > IS/ARE have to be considered that are imposed on you.
>
>This paragraph is also entirely mysterious to me.  Where did the 
>rope come from?

I explained the origin of the word "rope" above. The difference from the
wire is that it is optional, depending on the user-side architecture
being chosen or planned (at this time, investigated/tested). Some users
may want IDNA implemented in the way the IETF imagined it: a straight
wire between servers and user agents is fine.
Others may want something more secure, with a core DNS agent
supporting user agents for HTTP or other protocols.
Others may want more related services.

> > IMHO to be of universal good value cookies should only be Internet
> > documented semi-stable data containers (therefore relying only on inner
> > Internet controlled elements such as A-labels).
>
>We're not taking a position about whether something is or is not a
>universal good.  We're taking a position as to what is useful for
>folks who would like to build HTTP user agents.

Your cookies are not limited to HTTP. As such, their universal value
is good if they bring an open benefit to the network architecture,
and bad if they limit what could be done.

> > You cannot know all the different ways people will use it. But you must
> > tell them that if they want to use it in your way, they MUST follow what
> > you say. And, because you write a MUST you MUST be in control of the
> > feasibility of what you demand. On an "end to end" basis you
> > only control the "inner Internet" (within the Iron Curtain) DNS
> > entry/output, i.e. the lowercase A-label. And you can also trust it: they
> > are the xn-- labels Registries have actually registered, under IDNA2003
> > and IDNA2008.
>I have very little power to enforce my will upon others.  I believe
>the Iron Curtain was lifted near the end of the cold war, so I'm not
>overly concerned with what network infrastructure they might have used
>that far in the past.

MUST, SHOULD, etc. have a meaning in the Internet standards process. In
the new reading of the Internet architecture that IDNA2008 has
exemplified, it turns out that the IETF area seems to be where the IETF
can say MUST, and that in the Internet Use area the IETF can at most
say SHOULD. This is a good way to determine where the matter you are
considering stands.

> > By nature a robust architecture is synergetic. This is synergetics.
>
>These statements are mysterious to me.

Richard Buckminster Fuller. RFC 3439 establishes that the Internet
has to be simple. Synergetics, as a scientific discipline, has somewhat
helped in understanding simplicity in the way the very large Internet
system needs it.

> > Seen that way, your cookies make things simple and the whole net simpler
> > and more robust (RFC 3439), and increase the robustness of the whole
> > internet.
>
>I'm glad you hold that opinion.  From my perspective, cookies are
>neither simple nor do they make the whole Internet more robust.
>However, they're what we've got.

Don't be confused by what I said. I did not say cookies are
simple, but that your cookies (i.e. the definition of them you are
trying to build) should help make things simple and the net simpler.
They could be compared with the nano buckies (again from Bucky Fuller).
At least this is the way we consider and explore them: as remote
data containers; some of us say "information bubbles".

I have just tried to help, drawing on my own exploration of similar
issues in a multilinguisation context. I think I am done now. I will
certainly spend time on your draft, as it is a construed document
capitalizing on the input of a WG.
jfc

>Adam
>
>
> > 1) We receive a sequence of octets from the network, which we convert
> > to lower case.  Let's call this X.
> > 2) We have a URL that the user agent has used to generate an HTTP
> > request.  Let's call the host name component of this URL Y.
> > 3) X is a sequence of octets that has all the crazy xn-- stuff.
> > 4) We need to transform Y to the crazy xn-- form to see if it's in
> > some relation to X.
> > 5) For our sanity, we'd like to use octet-by-octet comparison, without
> > reference to any kind of folding (e.g., case-folding, IDNA-folding,
> > etc).
> >
> > As an added constraint, we don't feel it's our place to mandate to user
> > agents whether they ought to use IDNA2003 or IDNA2008.  User agents
> > are free to make the decision independently of what the cookie spec
> > says.  You should feel free to lobby them one way or another, but
> > we're not going to impose that requirement on them in this document.
> >
> > My understanding is that the text in the spec meets our requirements.
> > If that's not the case, please let me know.
> >
> > Adam
> >
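The five steps quoted above can be sketched in Python. This is only an
illustration under stated assumptions, not text from the httpstate draft:
the function names are hypothetical, the standard-library "idna" codec
implements IDNA2003 (the steps leave the IDNA2003/IDNA2008 choice to the
user agent), and the "relation" of step 4 is assumed here to be equality
or a dot-separated suffix match.

```python
def to_a_label_form(host: str) -> bytes:
    """Step 4: transform host name Y into lowercase xn-- (A-label) octets.

    Assumption: Python's built-in "idna" codec (IDNA2003) stands in for
    whatever IDNA processing the user agent has chosen. The codec leaves
    pure-ASCII labels untouched, so the result is lowercased explicitly.
    """
    return host.encode("idna").lower()


def domain_match(x: bytes, y_host: str) -> bool:
    """Steps 1-5: compare network octets X against request host Y."""
    x = x.lower()                 # step 1: bytes.lower() folds ASCII letters only
    y = to_a_label_form(y_host)   # steps 2 and 4
    # Step 5: plain octet-by-octet comparison, no further folding.
    # Assumed relation: Y equals X, or Y ends with "." followed by X.
    return y == x or y.endswith(b"." + x)
```

With these definitions, domain_match(b"XN--BCHER-KVA.example",
"www.bücher.example") holds, because both sides are reduced to lowercase
xn-- octets before the plain comparison.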
> >
> > On Fri, Oct 22, 2010 at 7:48 AM, Andrew Sullivan <ajs at shinkuro.com> wrote:
> >> On Fri, Oct 22, 2010 at 10:29:19AM -0400, Vint Cerf wrote:
> >>> andrew,
> >>>
> >>> we were pretty explicit that the algorithm that produces A-labels
> >>> produces only lower case. check with John Klensin.
> >>
> >> I remember talking about it, and I remember this being an issue
> >> because Punycode does not actually require lower case.  But I can't
> >> put my fingers on the text where it says this right now.  I haven't
> >> looked that hard, however.
> >>
> >> The reason I haven't looked hard is that it doesn't matter.  There is
> >> absolutely no way we can enforce any restriction in the DNS that
> >> requires the label to remain lower case.  Though DNS is supposed to be
> >> ASCII-case-preserving but ASCII-case-insensitive, the plain fact is
> >> that not every implementation does this, or does it correctly.  (I
> >> recall quite clearly pointing this out during the WG discussions,
> >> because some implementations use compression pointers to the original
> >> query string and therefore get whatever was asked by an application.)
> >> Applications can put their LDH queries in _in any case at all_ and
> >> have them work.  An IDNA2008-unaware stack with an IDNA2008-aware
> >> application above might do anything, including converting everything
> >> in the label to upper case (try logging into an old-fashioned UNIX
> >> console with the caps lock on.  You'll do a lookup for XN--SOMETHING
> >> no matter what you intend).  If it does that, and happens to query
> >> through a caching name server, the upper case form will persist in the
> >> cache.  You still have to treat that label as matching the lower case
> >> U-label.  We couldn't do all this above the DNS if you didn't have to.
> >>
> >> The case-preserving, case-insensitive feature of DNS was, in my
> >> opinion, a grave error.  But it's an error we have to live with
> >> forever if we're going to continue using DNS.  You simply cannot build
> >> a case-sensitive layer atop the DNS if any of the US-ASCII code points
> >> in the labels you want to use are themselves to be case sensitive.  If
> >> you want to do something clever like Punycode, only for the entire
> >> Unicode range (i.e. including that which overlaps with US-ASCII) so
> >> that you never have a transparent map between the DNS name and the
> >> user-presented name, then you have the possibility of introducing case
> >> sensitivity to the naming system (but not the DNS).  Otherwise, you're
> >> out of luck.
> >>
> >> A
> >>
> >>
> >> --
> >> Andrew Sullivan
> >> ajs at shinkuro.com
> >> Shinkuro, Inc.
> >>
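Andrew's point about case can be illustrated with a small Python sketch.
It assumes DNS-style matching that folds only the ASCII letters A-Z and
leaves every other octet alone; the helper names are hypothetical. Under
that assumption an upper-cased query such as XN--SOMETHING still matches
the lowercase A-label.

```python
def ascii_lower(octets: bytes) -> bytes:
    """Fold only the ASCII letters A-Z (0x41-0x5A) to a-z; keep other octets."""
    return bytes(b + 0x20 if 0x41 <= b <= 0x5A else b for b in octets)


def dns_labels_equal(a: bytes, b: bytes) -> bool:
    """ASCII-case-insensitive label comparison (assumed DNS-style matching)."""
    return ascii_lower(a) == ascii_lower(b)
```

A cache that stored XN--BCHER-KVA would therefore still answer a query
for xn--bcher-kva, which is why a case-sensitive layer cannot be built
on top of labels that overlap with US-ASCII.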
> > _______________________________________________
> > Idna-update mailing list
> > Idna-update at alvestrand.no
> > http://www.alvestrand.no/mailman/listinfo/idna-update