Esszett, Final Sigma, ZWJ and ZWNJ

Vint Cerf vint at google.com
Tue Feb 24 02:14:04 CET 2009


Mark,

thanks - I think what left me in an ambiguous state was the term "bits  
on the wire".  In your example, under the IDNA2003 mapping process,  
the final sigma is mapped into ordinary sigma and THEN the resulting  
string is looked up (after conversion to xn-- format using the  
punycode algorithm). The two forms become identical prior to lookup.  
Under the proposed IDNA2008 rules, the two strings remain distinct in  
both the U-label and A-label format and thus look "different" on the  
wire. Unless other measures are taken (bundling, restricted  
registration, etc.), it is possible for the two domains to yield  
distinct results on lookup.
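
As a rough illustration of the two behaviours (a sketch under stated
assumptions, not anyone's actual resolver code): Python's built-in
"idna" codec implements the IDNA2003 processing, so it shows the
mapping step. The raw "punycode" codec is used only as a stand-in for
"no mapping before encoding"; it is not an IDNA2008 implementation.
The helper names and the "straße.de" sample are made up for the
example.

```python
# Sketch only. The "idna" codec is IDNA2003 (nameprep, then Punycode).
# The "punycode" codec does no mapping at all and merely stands in for
# the "characters stay distinct" behaviour discussed for IDNA2008.

def idna2003_ace(domain: str) -> bytes:
    """Apply the IDNA2003 mapping, then convert labels to xn-- form."""
    return domain.encode("idna")

def unmapped_ace(label: str) -> bytes:
    """Hypothetical helper: Punycode one label with no mapping step."""
    return b"xn--" + label.encode("punycode")

final_sigma = "τιςγλώσσες"   # contains U+03C2 (final sigma)
plain_sigma = "τισγλώσσεσ"   # contains only U+03C3 (ordinary sigma)

# IDNA2003: both spellings collapse to one string before lookup.
same_2003 = idna2003_ace(final_sigma + ".com") == idna2003_ace(plain_sigma + ".com")

# No mapping: the two spellings produce different xn-- labels.
same_unmapped = unmapped_ace(final_sigma) == unmapped_ace(plain_sigma)

# Sharp-S under IDNA2003 is sent as plain "ss" (sample name is an assumption).
sharp_s_2003 = idna2003_ace("straße.de")

print(same_2003, same_unmapped, sharp_s_2003)
```

The first comparison corresponds to "the two forms become identical
prior to lookup"; the second to the case where the two names can
resolve differently unless the registry bundles them.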

Paul - is that the picture you wanted to paint?

sorry to be slow to see which bits you were comparing.

v


Vint Cerf
Google
1818 Library Street, Suite 400
Reston, VA 20190
202-370-5637
vint at google.com




On Feb 23, 2009, at 8:02 PM, Mark Davis wrote:

> I forgot to add:
>
> And I don't think changing the "xn--" helps any with this. If it is  
> already an XN-label, this is not a problem. The problem is for domain  
> names stored in Unicode: they will be interpreted differently.
>
> Mark
>
>
> On Mon, Feb 23, 2009 at 16:59, Mark Davis <mark at macchiato.com> wrote:
> I tend to agree with Andrew that *effectively* this is a change to  
> "bits on the wire".
>
> That is, under IDNA2003 both "τιςγλώσσες.com" and  
> "τισγλώσσεσ.com", for example, go to the same location,  
> while under IDNA2008, they go to different locations (unless special  
> registry actions are taken that are outside the control of this  
> group).
>
> For example, in an HTML page posted on the web:
>
>    href="τιςγλώσσες.com"
>
> gets interpreted differently by an IDNA2003 browser than by an  
> IDNA2008 browser.
>
> Mark
>
>
>
> On Mon, Feb 23, 2009 at 12:17, Vint Cerf <vint at google.com> wrote:
> Paul,
>
> For clarity's sake, can we look at the "bits on the wire" issue for a
> moment?
>
> Would the addition of any new character to the allowed set constitute
> changing bits on the wire? There would be no bits on the wire for,
> e.g. an unassigned character, but bits would flow once the character
> is defined and is PVALID.
>
> In the case of sharp-S, under IDNA2003 the bits sent would have been
> "ss" but under IDNA2008, assuming sharp-S becomes a PVALID character,
> the bits sent would be the xn-- A-label form containing the
> ACE-encoded sharp-S character.
>
> Is it this change that captures your concern for "bits on the wire" or
> have I not understood the point?
>
> thanks
>
> vint
>
>
> On Feb 23, 2009, at 2:45 PM, Paul Hoffman wrote:
>
> > At 1:48 PM -0500 2/23/09, John C Klensin wrote:
> >> I'm having trouble understanding your position (and Paul's).
> >> The charter rather specifically says that we can consider and
> >> make incompatible changes:
> >>
> >>      "Subject to the more general constraints described
> >>      above, the WG is permitted to consider changes that are
> >>      not strictly backwards-compatible.  For any such change
> >>      that is recommended, it is expected to document the
> >>      reasons for the change, the characters affected, and
> >>      possible transition strategies."
> >>
> >> Now, I suppose one could read "- Ensure practical stability of
> >> validity algorithms for IDNs." as one of those "general
> >> constraints" and prohibiting _any_ change that is not strictly
> >> backward-compatible.  But (i) that is explicitly an "additional
> >> goal", not a "general constraint".   And (ii) even if it were a
> >> "general constraint", reading it to prohibit this case would, I
> >> believe, require reading it to prohibit _any_ change that is not
> >> strictly backward-compatible with IDNA2003 and that would
> >> completely contradict the provision quoted above.
> >
> > Note the word "strictly" in "strictly backwards-compatible". Some of
> > us think, I think with justification, that it means there was room
> > for variance as long as other parts of the charter were not messed
> > with. Making some characters that were prohibited in IDNA2003 now
> > allowed clearly meets this requirement; so does making some
> > characters that were allowed in IDNA2003 now prohibited. Changing
> > the bits-on-the-wire representation of labels does not meet that,
> > particularly when combined with the prohibition on changing the
> > Punycode prefix.
> >
> > In any other IETF protocol, if you are going to change the bits on
> > the wire and you have an unambiguous method to flag that change,
> > such flagging would be required. I am having trouble understanding
> > why you think this protocol is special.
> > _______________________________________________
> > Idna-update mailing list
> > Idna-update at alvestrand.no
> > http://www.alvestrand.no/mailman/listinfo/idna-update
