Comments on draft-ietf-idnabis-defs-10
Vint Cerf
vint at google.com
Mon Aug 31 08:15:04 CEST 2009
Paul,
thanks for this -
Based on a few tests I tried with punycode and inverse punycode, it
looks to me as if lowercasing might be preferable.
My reasoning is that IDNs containing upper case characters produce
punycode that, when inverted, yields lower case Unicode.
Consequently, it would appear that only lowercased Unicode will
convert identically to/from punycode.
Allowing uppercase characters in the punycode and having them
considered valid (and converting back to upper case Unicode) might
create much more confusion,
especially considering that re-conversion of the upper-cased Unicode
would yield a lowercased punycode string again and subsequently
lowercased Unicode.
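
As an illustration of the round trip described above, here is a minimal
sketch using Python's built-in "idna" and "punycode" codecs. The "idna"
codec follows the older IDNA2003 rules (RFC 3490), whose Nameprep step
case-folds a label before Punycode encoding; that this matches the tests
mentioned above is an assumption. The helper names to_a_label and
to_u_label and the sample label are made up for this example.

```python
# Sketch only: Python's "idna" codec implements IDNA2003 (RFC 3490),
# which case-folds the label (Nameprep) before Punycode encoding.

def to_a_label(label: str) -> str:
    """Hypothetical helper: Unicode label -> ACE ("xn--") form."""
    return label.encode("idna").decode("ascii")

def to_u_label(a_label: str) -> str:
    """Hypothetical helper: ACE ("xn--") form -> Unicode label."""
    return a_label.encode("ascii").decode("idna")

mixed = "Bücher"   # sample label with an upper case character
lower = "bücher"   # the same label, lowercased

a_from_mixed = to_a_label(mixed)
a_from_lower = to_a_label(lower)
round_tripped = to_u_label(a_from_mixed)

# Both spellings produce the same ACE string.
print(a_from_mixed == a_from_lower)
# Inverting it yields the lowercased form, not the original.
print(round_tripped == lower, round_tripped == mixed)

# The raw "punycode" codec skips Nameprep, so the ASCII characters
# keep their case and the string converts back to upper case Unicode.
raw = mixed.encode("punycode").decode("ascii")
raw_back = raw.encode("ascii").decode("punycode")
print(raw, raw_back == mixed)
```

Under these assumptions, only the lowercased label survives the
to/from conversion unchanged, while the raw Punycode step by itself
preserves the case of the ASCII characters.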
I hope I haven't confused things more.
Vint
On Aug 30, 2009, at 8:24 PM, Paul Hoffman wrote:
> At 2:32 PM -0400 8/30/09, John C Klensin wrote:
>> Suggested fix? Do I need to put a "force lower case for
>> undecorated Latin (ASCII) characters" into the conversion step
>> from A-labels to U-labels?
>
> Lower-casing the A-label before conversion is one possibility. The
> other is to do case-preserving comparisons when comparing A-labels
> for equivalence. I think the latter is a smaller change at this late
> date and also more robust. If you choose case-preserving
> comparisons, I think the following changes are sufficient.
>
> idnabis-defs 2.3.2.4:
>
> In IDNA, equivalence of labels is defined in terms of the A-labels.
> If the A-labels are equal in a case-independent comparison, then the
> s/case-independent/case-preserving/
> labels are considered equivalent, no matter how they are
> represented.
> Because of the isomorphism of A-labels and U-labels in IDNA2008, it
> is possible to compare U-labels directly; see [IDNA2008-Protocol]
> for details. Traditional LDH labels already have a notion of
> equivalence: within that list of characters, upper case and lower
> case are considered equivalent. The IDNA notion of equivalence is an
> Remove the whole sentence "Traditional LDH ... considered equivalent."
> extension of that older notion but, because there is no mapping, the
> only equivalents are:
>
> o Exact (bit-string identity) matches between a pair of U-labels.
>
> o Matches between a pair of A-labels, using normal DNS matching
> rules.
> s/normal DNS/exact character/
>
> o Equivalence between a U-label and an A-label determined by
> translating the U-label form into an A-label form and then
> testing for an exact match between the A-labels.
>
> . . .
>
> idnabis-defs 4.4:
> . . .
> [IDNA2008-Protocol]. For labels already in ASCII form, the proper
> comparison reduces to the same case-insensitive ASCII comparison
> that has always been used for ASCII labels although IDNA-aware
> applications are expected to look up only A-labels and
> NR-LDH-labels, i.e., to avoid looking up R-LDH-labels that are not
> A-labels.
> . . .
> Change "reduces to...ASCII labels although" to "(case-preserving),
> although...".
>
> idnabis-protocol 3.1:
> . . .
> A pair of
> A-labels MUST be compared as case-insensitive ASCII (as with all
> comparisons of ASCII DNS labels).
> Change to:
> A pair of
> A-labels MUST be compared using a case-preserving comparison.
>
> I think the WG owes a hearty thanks to Wil for finding this bug in
> the spec and for being persistent in pointing it out after we forgot
> his earlier postings.
> _______________________________________________
> Idna-update mailing list
> Idna-update at alvestrand.no
> http://www.alvestrand.no/mailman/listinfo/idna-update