Comments on draft-ietf-idnabis-defs-10

Mon Aug 31 08:15:04 CEST 2009

Paul,

Thanks for this.

Based on a few tests I tried with punycode and inverse punycode, it  
looks to me as if lowercasing might be preferable.

My reasoning is that IDNs containing upper case characters produce
punycode that, when inverted, yields lower case Unicode.

Consequently, it would appear that only lowercased Unicode converts
identically to and from punycode.

Allowing uppercase characters in the punycode and having them
considered valid (and converting back to upper case Unicode) might
create much more confusion, especially considering that re-conversion
of the upper-cased Unicode would yield a lowercased punycode string
again and subsequently lowercased Unicode.
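
For anyone who wants to repeat the kind of test described above, here is a
minimal sketch. It assumes Python's built-in "idna" codec, which implements
IDNA2003 with nameprep (not IDNA2008), and its raw "punycode" codec. The
helper names to_ascii and to_unicode and the sample label are illustrative
only.

```python
# Assumption: Python's built-in "idna" codec follows IDNA2003 (nameprep
# case-folds before punycode encoding). It is not an IDNA2008 implementation.

def to_ascii(label: str) -> str:
    """Unicode label -> A-label, via the IDNA2003 ToASCII operation."""
    return label.encode("idna").decode("ascii")

def to_unicode(a_label: str) -> str:
    """A-label -> Unicode label, via the IDNA2003 ToUnicode operation."""
    return a_label.encode("ascii").decode("idna")

mixed = "B\u00fccher"   # "Bücher", with an upper case B
lower = "b\u00fccher"   # "bücher", all lower case

# Both spellings produce the same A-label, because the upper case B is
# folded to lower case before encoding.
a_from_mixed = to_ascii(mixed)
a_from_lower = to_ascii(lower)

# Inverting that A-label yields the lower case form, so only the lower
# case input survives the round trip unchanged.
round_trip = to_unicode(a_from_mixed)

# The raw punycode codec, by contrast, copies ASCII characters through
# as-is, so an upper case letter in the encoded string comes back as
# upper case Unicode.
raw_upper = b"Bcher-kva".decode("punycode")

print(a_from_mixed, a_from_lower, round_trip, raw_upper)
```

The last step is the case that worries me: an upper case letter in the
punycode string converts back to upper case Unicode, which then re-encodes
to a lowercased string.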

I hope I haven't confused things more.

Vint

On Aug 30, 2009, at 8:24 PM, Paul Hoffman wrote:

> At 2:32 PM -0400 8/30/09, John C Klensin wrote:
>> Suggested fix?  Do I need to put a "force lower case for
>> undecorated Latin (ASCII) characters" into the conversion step
>> from A-labels to U-labels?
>
> Lower-casing the A-label before conversion is one possibility. The  
> other is to do case-preserving comparisons when comparing A-labels  
> for equivalence. I think the latter is a smaller change at this late  
> date and also more robust. If you choose case-preserving  
> comparisons, I think the following changes are sufficient.
>
> idnabis-defs 2.3.2.4:
>
>   In IDNA, equivalence of labels is defined in terms of the A-labels.
>   If the A-labels are equal in a case-independent comparison, then the
> s/case-independent/case-preserving/
>   labels are considered equivalent, no matter how they are represented.
>   Because of the isomorphism of A-labels and U-labels in IDNA2008, it
>   is possible to compare U-labels directly; see [IDNA2008-Protocol] for
>   details.  Traditional LDH labels already have a notion of
>   equivalence: within that list of characters, upper case and lower
>   case are considered equivalent.  The IDNA notion of equivalence is an
> Remove the whole sentence "Traditional LDH ... considered equivalent."
>   extension of that older notion but, because there is no mapping, the
>   only equivalents are:
>
>   o  Exact (bit-string identity) matches between a pair of U-labels.
>
>   o  Matches between a pair of A-labels, using normal DNS matching
>      rules.
> s/normal DNS/exact character/
>
>   o  Equivalence between a U-label and an A-label determined by
>      translating the U-label form into an A-label form and then testing
>      for an exact match between the A-labels.
>
> . . .
>
> idnabis-defs 4.4:
> . . .
>   [IDNA2008-Protocol].  For labels already in ASCII form, the proper
>   comparison reduces to the same case-insensitive ASCII comparison that
>   has always been used for ASCII labels although IDNA-aware
>   applications are expected to look up only A-labels and NR-LDH-labels,
>   i.e., to avoid looking up R-LDH-labels that are not A-labels.
> . . .
> Change "reduces to...ASCII labels although" to "(case-preserving),  
> although...".
>
> idnabis-protocol 3.1:
> . . .
>                                                        A pair of
>       A-labels MUST be compared as case-insensitive ASCII (as with all
>       comparisons of ASCII DNS labels).
> Change to:
>                                                        A pair of
>       A-labels MUST be compared using a case-preserving comparison.
>
> I think the WG owes a hearty thanks to Wil for finding this bug in  
> the spec and for being persistent in pointing it out after we forgot  
> his earlier postings.
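
The difference between the two comparison rules discussed above can be
sketched as follows. This is only an illustration: the function names and
the sample labels are hypothetical, "exact" stands for what the quoted text
calls a case-preserving comparison, and the decoder is Python's raw
"punycode" codec with no IDNA validation applied.

```python
ACE_PREFIX = "xn--"

def equal_case_insensitive(a: str, b: str) -> bool:
    """Traditional DNS-style comparison of two ASCII labels."""
    return a.lower() == b.lower()

def equal_exact(a: str, b: str) -> bool:
    """Exact character comparison (the proposed rule for A-labels)."""
    return a == b

def punycode_to_unicode(a_label: str) -> str:
    """Hypothetical helper: strip the ACE prefix and decode the rest.

    Performs no validity checks, so the result need not be a valid U-label.
    """
    return a_label[len(ACE_PREFIX):].encode("ascii").decode("punycode")

upper = "xn--Bcher-kva"   # upper case B in the encoded label
lower = "xn--bcher-kva"

# Under case-insensitive comparison the two labels are equivalent ...
same_ci = equal_case_insensitive(upper, lower)

# ... yet they decode to different Unicode strings, so the claimed
# one-to-one correspondence between the two forms breaks down.
decoded_upper = punycode_to_unicode(upper)
decoded_lower = punycode_to_unicode(lower)

# Exact comparison keeps the two labels distinct.
same_exact = equal_exact(upper, lower)

print(same_ci, same_exact, decoded_upper, decoded_lower)
```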