Comments on draft-ietf-idnabis-defs-10

Tue Sep 1 17:41:44 CEST 2009

On Tue, Sep 01, 2009 at 10:36:53AM -0400, Andrew Sullivan wrote:
> 
> Hrm, so actually it is possible that all ASCII characters in an
> A-label are upper case, even though the input from the U-label is not
> allowed to have upper case in them.

Wait, that's not quite right either, I think.  If I understand you
correctly,

    xn--Bcher-kva

is not an A-label because Bücher is not a U-label.  But

    xn--bcher-KVA
and
    xn--bcher-kva

both are A-labels, and both are valid output of permitted Punycode
implementations, resulting from encoding the valid U-label bücher?  Is
that right?  I'm not sure I'm convinced.
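
To make the three forms concrete, here is a rough sketch.  It assumes
Python's built-in "punycode" codec is a fair stand-in for "a Punycode
implementation"; it is not an IDNA2008 validator, and decode_label is
a helper name made up for this illustration.

```python
# Sketch only: Python's stdlib "punycode" codec stands in for a generic
# RFC 3492 implementation.  decode_label is a hypothetical helper.
PREFIX = "xn--"

def decode_label(label: str) -> str:
    """Strip the ACE prefix (case-insensitively) and Punycode-decode the rest."""
    if not label.lower().startswith(PREFIX):
        raise ValueError("no ACE prefix")
    return label[len(PREFIX):].encode("ascii").decode("punycode")

# The decoder accepts the encoded digits in either case...
lower = decode_label("xn--bcher-kva")
upper_digits = decode_label("xn--bcher-KVA")
# ...but the literal (basic) characters are copied through with their case.
upper_basic = decode_label("xn--Bcher-kva")

# Encoding the U-label with this particular codec gives the lowercase form.
encoded = PREFIX + "bücher".encode("punycode").decode("ascii")

print(lower, upper_digits, upper_basic, encoded)
```

With this codec the first two decode to the same string and the third
decodes to a string with an upper case B in it, which is the
distinction I am trying to draw above.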

The passage from 3492 you quoted is part of this:

   A decoder MUST recognize the letters in both uppercase and lowercase
   forms (including mixtures of both forms).  An encoder SHOULD output
   only uppercase forms or only lowercase forms, unless it uses mixed-
   case annotation (see appendix A).

Now, the mixed-case annotation in Appx. A is already ruled out,
because valid U-labels can't have the upper case characters in
question.  Similarly, XN--BCHER-KVA isn't a valid A-label, for the
same reason xn--Bcher-kva isn't valid.

This leaves us only with the case of XN--KGBECHTV.  But an
implementation that produced that would presumably not generate
valid A-labels when there was at least one ASCII character in the
U-label, because of the reasoning above.  So it's already not an
IDNA2008-usable Punycode implementation.  I think.
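
A sketch of that reasoning, under two assumptions of mine: that the
implementation in question upper-cases its whole output (which is what
XN--KGBECHTV suggests, and is more than RFC 3492 strictly asks for),
and that Python's built-in "punycode" codec is an acceptable decoder
for the purpose.  upper_only_encode and round_trips are hypothetical
names, and the ACE prefix is left out to keep it short.

```python
# Sketch only: models a hypothetical encoder that upper-cases everything
# it emits, then checks whether decoding gives back the original U-label.

def upper_only_encode(ulabel: str) -> str:
    """Hypothetical all-uppercase encoder (an assumption, not a real API)."""
    return ulabel.encode("punycode").decode("ascii").upper()

def round_trips(ulabel: str) -> bool:
    """True if decoding the all-uppercase output yields the same U-label."""
    decoded = upper_only_encode(ulabel).encode("ascii").decode("punycode")
    return decoded == ulabel

# A U-label with ASCII characters in it: the literal characters come back
# in upper case, so the result is no longer the U-label we started from.
mixed = round_trips("bücher")

# A U-label with no ASCII characters at all.  The code points are meant to
# be the Arabic test label behind xn--kgbechtv (my assumption).
non_ascii_only = round_trips("\u0625\u062e\u062a\u0628\u0627\u0631")

print(mixed, non_ascii_only)
```

So such an encoder only round-trips when the U-label has no ASCII
characters in it, which is the sense in which I think it is not usable
for IDNA2008.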

I appreciate that sending all this mail to the list is increasing the
traffic, and normally I am loath to send such ill-formed thoughts to
the list, but it seems really important to me that, given the late
stage, we work out exactly the implications of all this.  (I already
think we need to have another WGLC, however the final documents end
up changing, because of this significant issue.)

Best,

A

-- 
Andrew Sullivan
ajs at shinkuro.com
Shinkuro, Inc.