referencing IDNA2008 (and IDNA2003?)

Sat Oct 23 05:22:27 CEST 2010

I'm sad that you've invented your own language for describing things
that other folks call by other names.  That makes it difficult to
understand what you're saying.  However, it appears we are in
agreement.

Adam

On Fri, Oct 22, 2010 at 5:35 PM, J-F C. Morfin <jfc at morfin.org> wrote:
> At 23:00 22/10/2010, Adam Barth wrote:
>>
>> On Fri, Oct 22, 2010 at 1:46 PM, J-F C. Morfin <jfc at morfin.org> wrote:
>> > At 20:38 22/10/2010, Adam Barth wrote:
>> > I'm not sure I quite understood everything that was mentioned during
>> > this thread.  Concretely, here's what cookies need to do:
>> >
>> > To make sure we all understand well I think it would be nice to be sure
>> > we talk of the same things in documenting:
>
> Dear Adam,
>
> As I explained IDNA2008 has built a separation between the Internet
> standards as documented by the IETF (RFC 5890 to 5894) and the Internet Use
> standards as initiated by "unusual" RFC 5895 (*). We (small IU Community
> kernel) need words to describe the result.
>
> We have become accustomed to calling this separation the "Iron Curtain" for
> many reasons, and to calling "rope" the network wire plus its "off the rope"
> extensions outside of the "Iron Curtain" through what we call the IUI
> (Internet Use Interface). We also dub the IUI the "no-man's land" because
> that part is for extended network use processes, not for users or user
> agents, and because one proposal concerns "smart mines" protection in that
> area.
>
> We found the wording better than "iron wall" initially proposed by analogy
> to "fire wall", and well received by people having to understand the
> IDNA2008 architectural implications.
>
> (*) This is well expressed in this key RFC 5895 part: "It should be noted
> that this document does not specify the behavior of a protocol that appears
> "on the wire". It describes an operation that is to be applied to user input
> in order to prepare that user input for use in an "on the network" protocol.
> As _unusual_ as this may be for a document concerning Internet protocols, it
> is necessary to describe this operation for implementors who may have
> designed around the original IDNA protocol (herein referred to as IDNA2003),
> which conflates this user-input operation into the protocol."
>
>> > - who is "we"
>> When I say "we", I mean the httpstate working group.
>
> Got it. I thought you were talking about the processes having to deal with
> the protocol.
>
>> > - what does "receive from the network" mean
>> More precisely, "receive as part of an HTTP response"
>
> Through an OPES or not (RFC 4236)? One way to implement the IUI is to use
> OPES or ONES (networked OPES).
>
>> > - what is the user agent
>> The user agent is a concept from HTTP.  It refers to a class of HTTP
>> clients, namely those with users (as opposed to network
>> intermediaries).
>
> I understand. But there are many possibilities to implement IDNA2008 on the
> user side. They may use their own DNS-agents or OPES-agents, on a per-user
> basis or networked.
>
>> > - where do we have the URL from? (in reference, for example, to the RFC
>> > 5895 special case).
>>
>> That's inessential.  Sometime in the past, the user agent generated an
>> HTTP request for a particular URL.
>
> You implied that this URL was not in lowercase A-label form. Now or in the
> past, it is essential to know where the system got it, to understand its
> nature and format.
>
>> > - X is a lowercase A-label
>>
>> Not necessarily.  X is a sequence of octets.
>
> OK, then you have changed your schema. If X is a sequence of octets and Y is
> of unknown format, I have some difficulty understanding how IDNA2008 can
> help.
>
>> > Suggestion: to be sure you cannot be mistaken, do not consider U-labels
>> > as Unicoded labels you could match one way or another, but as User-Labels,
>> > which can be many different things, ranging from an ASCII label to a
>> > converted snap of the user's fingers, as Portzamparc hinted. You will
>> > _never_ know which UTS 36, RFC 5895, etc. process has been used to convert
>> > U-labels into lowercase A-labels. Just remember that at Project.FRA we do
>> > not know yet (we want to experiment first with people from many other
>> > languages) how we will support plain French language "U-labels" as far as
>> > majuscules are concerned (same for other Latin languages). Being a
>> > majuscule has an impact on the word's meaning, and this is one of the
>> > metadata items Unicode does not support.
>>
>> This paragraph was entirely mysterious to me.
>
> It may not be obvious if you do not dissociate Internet and Internet Use.
> Try to compare with the telephone. The telephone works for people of every
> culture and language. Yet if you want to build a phone service for the users
> you need to know more than sequences of octets, and probably one of their
> 25,000 languages. Here it is the same. IDNA2008 has relieved the Internet
> side of the Iron Curtain from the worry of languages, scripts and
> languages+scripts (orthotypography).
>
> You will not be interoperable/compatible with the Internet IDNA2008 if you
> reintroduce them.
>
>> > I understand that you want to "specify the syntax and semantics
>> > of these headers as they are actually used _on_ the Internet." IMHO
>> > you follow the same thinking as IDNA2003, while IDNA2008 should make you
>> > say "_through_ the Internet". You want to document "on the
>> > rope" something now documented as being "off the rope"
>> > (still networked but controlled by the user). You meet the same problem
>> > as the WG/IDNABIS met. A problem of IS/MUST/SHOULD/MIGHT that demands
>> > that you clarify the scope of the communication process you want to
>> > stabilize to understand where you can use MUST, SHOULD and MIGHT - and if
>> > IS/ARE have to be considered that are imposed on you.
>>
>> This paragraph is also entirely mysterious to me.  Where did the rope come
>> from?
>
> I introduced the origin of the word rope. The difference from the wire is
> that it is optional, depending on the user-side architecture being chosen
> or planned (at this time, investigated/tested). Some users may want to get
> IDNA implemented in the way the IETF imagined it: a straight wire is OK
> between servers/user agents.
> Others may want something more secure with a core DNS agent, supporting HTTP
> or other protocols user agents.
> Others may want more related services.
>
>> > IMHO to be of universal good value cookies should only be Internet
>> > documented semi-stable data containers (therefore relying only on inner
>> > Internet controlled elements such as A-labels).
>>
>> We're not taking a position about whether something is or is not a
>> universal good.  We're taking a position as to what is useful for
>> folks who would like to build HTTP user agents.
>
> Your cookies are not limited to HTTP. As such their universal value can be
> good if they bring an open plus to the network architecture, and bad if they
> limit what could be done.
>
>> > You cannot know all the different ways people will use it. But you must
>> > tell them that if they want to use it in your way, they MUST follow what
>> > you say. And, because you write a MUST you MUST be in control of the
>> > feasibility of what you demand. On an "end to end" basis you
>> > only control the "inner Internet" (within the Iron Curtain) DNS
>> > entry/output, i.e. the lowercase A-label. And you can also trust it:
>> > they are the xn-- labels Registries have actually registered, under
>> > IDNA2003 and IDNA2008.
>> I have very little power to enforce my will upon others.  I believe
>> the Iron Curtain was lifted near the end of the cold war, so I'm not
>> overly concerned with what network infrastructure they might have used
>> that far in the past.
>
> MUST, SHOULD etc. have a meaning in the Internet standard process. In the
> new reading of the Internet architecture IDNA2008 has exemplified, it turns
> out that the IETF area seems to be where IETF can say MUST, and that in the
> Internet Use area the IETF can at most say SHOULD. This is a good way to
> know where what you consider may stand.
>
>> > By nature a robust architecture is synergetic. This is synergetics.
>>
>> These statements are mysterious to me.
>
> Richard Buckminster Fuller. RFC 3439 establishes that the internet has to be
> simple. Synergetics as a scientific discipline has helped somewhat in
> understanding simplicity the way the very large Internet system needs it.
>
>> > Seen that way, your cookies make things simple and the whole net
>> > simpler and more robust (RFC 3439), and increase the robustness of the
>> > whole internet.
>>
>> I'm glad you hold that opinion.  From my perspective, cookies are
>> neither simple nor likely to make the whole Internet more robust.
>> However, they're what we've got.
>
> Don't get confused about what I said. I did not say cookies are simple, but
> that your cookies (i.e. the definition of them you try to build) should help
> make things simple and the net simpler. They could compare with the nano
> buckies (again from Bucky Fuller). At least this is the way we consider them
> and explore them - as remote data containers, some of us say "information
> bubbles".
>
> I just tried to help from my own exploration of similar issues in a
> multilingualisation context. I think now I am done. I will certainly spend
> time on your draft as it is a construed document capitalizing on the input
> of a WG.
> jfc
>
>
>> Adam
>>
>>
>> > 1) We receive a sequence of octets from the network, which we convert
>> > to lower case.  Let's call this X.
>> > 2) We have a URL that the user agent has used to generate an HTTP
>> > request.  Let's call the host name component of this URL Y.
>> > 3) X is a sequence of octets that has all the crazy xn-- stuff.
>> > 4) We need to transform Y to the crazy xn-- form to see if it's in
>> > some relation to X.
>> > 5) For our sanity, we'd like to use octet-by-octet comparison, without
>> > reference to any kind of folding (e.g., case-folding, IDNA-folding,
>> > etc).
>> >
>> > As an added constraint, we don't feel it's our place to mandate to user
>> > agents whether they ought to use IDNA2003 or IDNA2008.  User agents
>> > are free to make the decision independently of what the cookie spec
>> > says.  You should feel free to lobby them one way or another, but
>> > we're not going to impose that requirement on them in this document.
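
[Illustration, not part of the original message.] The five steps above, together with the constraint that each user agent picks its own IDNA flavour, can be sketched roughly as follows. This is only a sketch under stated assumptions, not text from the cookie draft: the function names are invented here, the "relation" in step 4 is assumed to be "identical to X or a subdomain of X", and the default conversion uses Python's built-in "idna" codec, which implements IDNA2003. A user agent preferring IDNA2008 would pass its own converter instead.

```python
from typing import Callable


def idna2003_to_ascii(host: str) -> bytes:
    # Assumption: Python's built-in "idna" codec (IDNA2003, RFC 3490)
    # stands in for whatever conversion the user agent has chosen.
    return host.encode("idna").lower()


def canonicalize_received(raw: bytes) -> bytes:
    # Step 1: take the octets from the HTTP response and lowercase them.
    # bytes.lower() only touches ASCII letters, so other octets pass through.
    return raw.lower()


def host_matches(
    received: bytes,
    request_host: str,
    to_ascii: Callable[[str], bytes] = idna2003_to_ascii,
) -> bool:
    x = canonicalize_received(received)  # steps 1 and 3: X, in xn-- form
    y = to_ascii(request_host)           # steps 2 and 4: Y, converted to xn-- form
    # Step 5: plain octet-by-octet comparison, no folding at this point.
    # Assumed relation: Y equals X, or Y ends with "." followed by X.
    return y == x or y.endswith(b"." + x)
```

For example, host_matches(b"XN--BCHER-KVA.example", "bücher.example") compares the lowercased octets against the converted host name and nothing else; all IDNA-specific work stays inside the to_ascii callable.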
>> >
>> > My understanding is that the text in the spec meets our requirements.
>> > If that's not the case, please let me know.
>> >
>> > Adam
>> >
>> >
>> > On Fri, Oct 22, 2010 at 7:48 AM, Andrew Sullivan <ajs at shinkuro.com> wrote:
>> >> On Fri, Oct 22, 2010 at 10:29:19AM -0400, Vint Cerf wrote:
>> >>> andrew,
>> >>>
>> >>> we were pretty explicit that the algorithm that produces A-labels
>> >>> produces only lower case. check with John Klensin.
>> >>
>> >> I remember talking about it, and I remember this being an issue
>> >> because Punycode does not actually require lower case.  But I can't
>> >> put my fingers on the text where it says this right now.  I haven't
>> >> looked that hard, however.
>> >>
>> >> The reason I haven't looked hard is that it doesn't matter.  There is
>> >> absolutely no way we can enforce any restriction in the DNS that
>> >> requires the label to remain lower case.  Though DNS is supposed to be
>> >> ASCII-case-preserving but ASCII-case-insensitive, the plain fact is
>> >> that not every implementation does this, or does it correctly.  (I
>> >> recall quite clearly pointing this out during the WG discussions,
>> >> because some implementations use compression pointers to the original
>> >> query string and therefore get whatever was asked by an application.)
>> >> Applications can put their LDH queries in _in any case at all_ and
>> >> have them work.  An IDNA2008-unaware stack with an IDNA2008-aware
>> >> application above might do anything, including converting everything
>> >> in the label to upper case (try logging into an old-fashioned UNIX
>> >> console with the caps lock on.  You'll do a lookup for XN--SOMETHING
>> >> no matter what you intend).  If it does that, and happens to query
>> >> through a caching name server, the upper case form will persist in the
>> >> cache.  You still have to treat that label as matching the lower case
>> >> U-label.  We couldn't do all this above the DNS if you didn't have to.
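
[Illustration, not part of the original message.] The point that an upper-case XN--SOMETHING must still match its lower-case form can be shown with a small sketch of ASCII-only case-insensitive label comparison. The function names are hypothetical; the sketch only folds the octets 0x41-0x5A (A-Z) and leaves everything else alone, which is the behaviour the paragraph above attributes to the DNS.

```python
def ascii_lower(label: bytes) -> bytes:
    # Fold only US-ASCII upper-case letters; every other octet is preserved.
    return bytes(b + 32 if 0x41 <= b <= 0x5A else b for b in label)


def dns_labels_equal(a: bytes, b: bytes) -> bool:
    # Case-insensitive for ASCII letters, octet-exact for everything else.
    return ascii_lower(a) == ascii_lower(b)
```

With this comparison, a cached XN--BCHER-KVA and a freshly generated xn--bcher-kva are treated as the same label, while labels that differ in any non-letter octet are not.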
>> >>
>> >> The case-preserving, case-insensitive feature of DNS was, in my
>> >> opinion, a grave error.  But it's an error we have to live with
>> >> forever if we're going to continue using DNS.  You simply cannot build
>> >> a case-sensitive layer atop the DNS if any of the US-ASCII code points
>> >> in the labels you want to use are themselves to be case sensitive.  If
>> >> you want to do something clever like Punycode, only for the entire
>> >> Unicode range (i.e. including that which overlaps with US-ASCII) so
>> >> that you never have a transparent map between the DNS name and the
>> >> user-presented name, then you have the possibility of introducing case
>> >> sensitivity to the naming system (but not the DNS).  Otherwise, you're
>> >> out of luck.
>> >>
>> >> A
>> >>
>> >>
>> >> --
>> >> Andrew Sullivan
>> >> ajs at shinkuro.com
>> >> Shinkuro, Inc.
>> >>
>> > _______________________________________________
>> > Idna-update mailing list
>> > Idna-update at alvestrand.no
>> > http://www.alvestrand.no/mailman/listinfo/idna-update
>
>