mappings-01 and the general procedure

Vint Cerf vint at google.com
Mon Jul 13 19:02:09 CEST 2009


1. I think we need to retain the mapping document in some form, and
I believe there is general consensus on that. Details are still in play,
including the order of mappings and how the document is referenced
in other base documents of IDNA2008.

2. I think we need to be very careful about terms like "over the wire"
since such characters may arrive over the wire in many different
contexts. For purposes of DNS, I think "over the wire" is relevant
primarily to the lookup process. Strings that may eventually be
looked up can arrive in many other contexts, and it isn't clear to
me that whether a string arrived over the wire or resulted from
local manipulation is dispositive.

3. I think mappings can at best be described as part of a process
of preparing strings for eventual lookup. The base documents of
IDNA2008 characterize what characters the registration and
lookup algorithm will allow to be arguments in the lookup or
registration process. As has been pointed out in many ways, a number
of transformations may occur to strings that will eventually be
looked up or registered, some of which involve mapping (for the
lookup case). We are in no position to describe every possible
transformation. We can say something about some (recommended)
mappings that we believe useful to make prior to lookup.
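
As a rough illustration of what such a recommended pre-lookup mapping
could look like, here is a minimal Python sketch using the order Erik
suggests in the quoted thread below (lower-casing, wide/narrow mapping,
then NFC). The function name map_for_lookup is hypothetical, and
str.lower() is only an approximation of whatever case mapping the
mappings draft finally specifies; this is not the draft's algorithm.

```python
import unicodedata


def map_for_lookup(label: str) -> str:
    """Hypothetical pre-lookup mapping: lower-case, wide/narrow, then NFC."""
    # Step 1: lower-casing (an approximation; the draft may define this differently).
    lowered = label.lower()

    # Step 2: map fullwidth/halfwidth forms using only the <wide> and <narrow>
    # compatibility decompositions; other compatibility characters are left alone.
    pieces = []
    for ch in lowered:
        decomp = unicodedata.decomposition(ch)
        if decomp.startswith("<wide>") or decomp.startswith("<narrow>"):
            code_points = decomp.split()[1:]
            pieces.append("".join(chr(int(cp, 16)) for cp in code_points))
        else:
            pieces.append(ch)

    # Step 3: the result must end up in NFC form.
    return unicodedata.normalize("NFC", "".join(pieces))
```

With this sketch, a fullwidth "ＥＸＡＭＰＬＥ" comes out as plain
"example", while a character such as superscript two (U+00B2) passes
through unchanged, because it is neither a case variant nor a
wide/narrow form.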

4. If I have understood the consequences of the proposed
mappings and PVALID test procedures, the superscript case
would NOT be mapped but would be treated as not PVALID.
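
The superscript case can be checked with a short Python sketch. It
relies only on the standard unicodedata module; the helper
is_nfkc_stable is a hypothetical name, and it is a rough stand-in for
one ingredient of a validity test, not the PVALID rule itself. NFKC
(the normalization used by IDNA2003's Nameprep) turns U+00B2 into an
ordinary "2", whereas NFC leaves it alone, so a mapping that stops at
NFC hands the superscript on to the validity test unchanged.

```python
import unicodedata

SUPERSCRIPT_TWO = "\u00b2"

# NFC applies only canonical mappings, so the superscript survives as-is.
nfc_form = unicodedata.normalize("NFC", SUPERSCRIPT_TWO)

# NFKC also applies compatibility mappings, turning it into DIGIT TWO.
nfkc_form = unicodedata.normalize("NFKC", SUPERSCRIPT_TWO)


def is_nfkc_stable(ch: str) -> bool:
    """Hypothetical helper: True if NFKC leaves the character unchanged."""
    return unicodedata.normalize("NFKC", ch) == ch


print(f"NFC:  U+{ord(nfc_form):04X}")
print(f"NFKC: U+{ord(nfkc_form):04X}")
print("stable under NFKC:", is_nfkc_stable(SUPERSCRIPT_TWO))
```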

Vint


On Jul 13, 2009, at 10:49 AM, Erik van der Poel wrote:

> Yes, that makes sense.
>
> I am also waiting for Vint to say something about reaching consensus
> on whether or not to keep the mappings document.
>
> My own opinion is that there still seems to be a significant
> disconnect between those that would prefer to keep the mappings
> confined to "user input", and those that would map even strings that
> have arrived "over the wire". While I have some sympathy for those
> that strive to create "clean" protocols (without leniently mapping
> strings that have arrived "over the wire"), I am also well aware of
> the history of e.g. browsers, which have traditionally been lenient
> and rarely become stricter. In other words, I consider the current
> approach of IDNAbis to be "wishful thinking" and not likely to
> actually happen.
>
> Of course, I'd love to be wrong about this and see the browser
> developers get strict for the currently small percentage of domain
> names that are non-ASCII *and* need to be mapped. (That percentage is
> small -- most non-ASCII domain names on the Web are already U-labels.)
>
> Currently, you can input a superscript two (U+00B2) in HTML and the
> browser will "helpfully" and inexplicably convert that to a normal two
> (U+0032) before looking it up in DNS. Why on Earth do we need to be
> that lenient?
>
> We shall see what the browser developers decide to do. I will be
> watching them carefully and then making my own recommendations within
> Google.
>
> Erik
>
> On Sun, Jul 12, 2009 at 5:02 PM, John C Klensin <klensin at jck.com> wrote:
>>
>>
>> --On Sunday, July 12, 2009 16:47 -0700 Erik van der Poel
>> <erikv at google.com> wrote:
>>
>>> I think it would be a good idea to move the normalization and
>>> NFC text from section 5.2 of the protocol draft to section 5.3.
>>>
>>> The order of the steps in the mappings draft should probably be
>>> lower-casing, wide/narrow mapping, then NFC. These steps are
>>> performed in this order in IDNA2003 too.
>>
>> I have, temporarily, done that move in a different way, but it
>> is clear that, at the end of Section 5.3, the string has to
>> [still] be in NFC form.  If and when we decide that we are
>> keeping the mapping document, I want to restructure 5.2 and 5.3
>> into one section and refer to Mapping for, e.g., the discussion
>> about getting to Unicode.  IMO, that discussion in Mapping is
>> better (and more comprehensive) than what is now in Section 5.2.
>>
>>
>> So, if we keep Mapping, then the combined 5.2 and 5.3 would
>> basically say:
>>
>>        * Get the string from whatever form you find it in to
>>        Unicode, following the advice in [Mapping]
>>
>>        * That string MUST be in NFC form.
>>
>> Does that work for everyone, again, given that we don't discard
>> the Mapping doc entirely?
>>
>> If we do drop the Mapping document (I'm waiting for Vint to say
>> something about when he thinks consensus has been reached even
>> though I thought the San Francisco output was fairly clear), I
>> hope to enlist help from Paul and Pete to get the relevant
>> explanations out of Mapping and put them in 5.2 and 5.3 of
>> Protocol.   Again, does that make sense?
>>
>>    john
>>
>>


