Comments on draft-ietf-idnabis-defs-10

John C Klensin klensin at jck.com
Mon Aug 31 07:47:36 CEST 2009



--On Sunday, August 30, 2009 17:24 -0700 Paul Hoffman
<phoffman at imc.org> wrote:

> At 2:32 PM -0400 8/30/09, John C Klensin wrote:
>> Suggested fix?  Do I need to put a "force lower case for
>> undecorated Latin (ASCII) characters" into the conversion
>> step from A-labels to U-labels?
> 
> Lower-casing the A-label before conversion is one possibility.
> The other is to do case-insensitive comparisons when comparing
> A-labels for equivalence. I think the latter is a smaller
> change at this late date and also more robust. If you choose
> case-insensitive comparisons, I think the following changes are
> sufficient.
>...

After looking at the text, one more possibility occurs to me.
Because upper-case ASCII characters (and all other unambiguously
upper-case characters) are DISALLOWED, Protocol Steps 4.4 and
5.5 cannot feed a string containing upper-case characters into
Punycode conversion.  As a result, upper-case characters cannot
appear in any ACE produced by either of those two steps.  The
same appears to be true of the ToASCII operation of IDNA2003 --
for characters with case distinctions, only lower-case
characters can go in and, consequently, only lower-case
characters can come out.

As a result, if a Punycode-compatible ACE-style label contains
upper-case characters, it is because someone converted part of
the label to upper case or otherwise fabricated it, not because
it came out of IDNA2008 ACE conversion or IDNA2003 ToASCII.
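
The point can be illustrated with a rough sketch.  It assumes
Python's built-in "punycode" codec as a stand-in for the Punycode
conversion step; the helper names to_ace and has_upper_ascii are
hypothetical, and none of the IDNA2008 validity checks are
performed.

```python
import string

ACE_PREFIX = "xn--"


def to_ace(u_label: str) -> str:
    """Punycode-encode a label and prepend the ACE prefix.

    Illustration only: a real implementation would first reject
    DISALLOWED characters, including upper-case ones.
    """
    return ACE_PREFIX + u_label.encode("punycode").decode("ascii")


def has_upper_ascii(label: str) -> bool:
    """Return True if the label contains any of A-Z."""
    return any(ch in string.ascii_uppercase for ch in label)


ace = to_ace("bücher")
print(ace, has_upper_ascii(ace))
```

A lower-case input such as "bücher" yields an ACE with no
upper-case ASCII in it, so a label such as "xn--Bcher-kva" has to
have been altered after the conversion.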

As has been pointed out multiple times, the DNS does not care
and we cannot get it to perform a case-sensitive comparison no
matter what we do.  

So, I believe, the third option is simply to prohibit the
appearance or use, in comparison operations or otherwise, of
A-labels containing upper-case characters.  As with many other
bits of protocols, such labels will work anyway in some
contexts.  But, if we prohibit them, we avoid adding new
operations to the specifications, avoid a situation in which we
require case-insensitive comparisons but the DNS still compares
case-insensitively, and so on.

The net effect would be that, as far as the Definitions document
is concerned, an ACE string starting with xn--, but containing
upper-case characters, would be just one more type of invalid
R-LDH label.
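
As a minimal sketch of what that check might look like (the
function name is hypothetical, and matching the prefix without
regard to case, so that "XN--..." is also caught, is an
assumption rather than anything the documents say):

```python
import string


def is_invalid_upper_case_ace(label: str) -> bool:
    """Return True for an xn-- label that contains upper-case ASCII.

    Assumption: the prefix is matched case-insensitively, so a
    label starting with "XN--" is treated as an ACE-style label
    and rejected because it contains upper-case characters.
    """
    looks_like_ace = label[:4].lower() == "xn--"
    has_upper = any(ch in string.ascii_uppercase for ch in label)
    return looks_like_ace and has_upper


for candidate in ("xn--bcher-kva", "xn--Bcher-kva", "Example"):
    print(candidate, is_invalid_upper_case_ace(candidate))
```

Labels without the prefix are left alone here; ordinary LDH
labels in mixed case are not affected by this option.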


> I think the WG owes a hearty thanks to Wil for finding this
> bug in the spec and for being persistent in pointing it out
> after we forgot his earlier postings.

Agreed.

   john



