Comments on draft-ietf-idnabis-defs-10

Wed Sep 2 19:21:02 CEST 2009

--On Thursday, September 03, 2009 01:39 +1000 Wil Tan
<wil at cloudregistry.net> wrote:

> Related note: in light of the discovery made by James that
> Punycode can in fact output uppercase characters to represent
> encoded non-ASCII codepoints, we could go back to
> idnabis-defs-10, 2.3.2.1 and further qualify "output of the
> Punycode algorithm". However, since practically all
> implementations output to lowercase, I suppose it is not
> necessary?

I'm reluctant to try to effectively modify Punycode by adjusting
Defs.  If this were really important enough, then it would be
important enough to change the charter (again) and update RFC
3492 to remove the upper-case output and case-sensitive options,
at least for IDNA purposes.  I don't think it is important
enough, but that is obviously just my opinion.  As I indicated in
an earlier note (which I think went to the whole list, but am
not sure), if we get this round finished and then, at the
appropriate time, move toward Draft Standard status for IDNA
(including Punycode), the "is this feature actually supported
and used" rule will take care of those features in a fairly
clean way.

>> Probably Rationale should be extended to discuss this issue
>> and the reasons for the "require lowercase" statement.  I'd
>> welcome text on that subject and advice as to where to put
>> it, but will make something up if I don't hear from people.

> I'm lousy at writing such texts but do the following bullets
> capture what you intend to say?
> 
> 1. Symmetry constraint between U-label and A-label is a
> desirable property and key design goal of IDNA2008
> 2. A-labels, being a subset of LDH-labels, are sometimes stored
> and used without preserving case.
> 3. When that happens, we end up with uppercase characters in
> the Punycode-decoded result, which makes it an invalid U-label.
> 4. This happens because the Punycode algorithm preserves the
> case of the "basic code points" in the decoding process.
> 5. Because the Punycode encoding process (practically) never
> outputs uppercase characters from valid U-labels, we know that
> a valid A-label must not contain any uppercase character after
> the "xn--" ACE prefix.
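
The bullets above can be illustrated with a small sketch.  It assumes
Python's built-in "punycode" codec as the Punycode implementation; the
helper names (decode_a_label, encode_u_label, round_trips) are
hypothetical and are not taken from any of the IDNA drafts.  The sketch
covers only the case-preservation and symmetry points, not the other
validity rules for U-labels.

```python
ACE_PREFIX = "xn--"


def decode_a_label(label: str) -> str:
    """Decode the Punycode part of an ACE label with no case folding."""
    if label[:4].lower() != ACE_PREFIX:
        raise ValueError("label does not start with the ACE prefix")
    # The codec copies basic (ASCII) code points through unchanged,
    # so any uppercase letters in them survive decoding.
    return label[4:].encode("ascii").decode("punycode")


def encode_u_label(text: str) -> str:
    """Encode a string with Punycode and prepend the ACE prefix."""
    return ACE_PREFIX + text.encode("punycode").decode("ascii")


def round_trips(label: str) -> bool:
    """Symmetry check: decoding then re-encoding must give the same label."""
    return encode_u_label(decode_a_label(label)) == label


print(decode_a_label("xn--bcher-kva"))   # bücher
print(decode_a_label("XN--BCHER-KVA"))   # BüCHER (uppercase basic code points kept)
print(round_trips("xn--bcher-kva"))      # True
print(round_trips("XN--BCHER-KVA"))      # False
```

With this codec, the case-folded form of the label decodes to a string
that contains uppercase letters and does not re-encode to the label it
came from, which is the asymmetry described in bullets 3 and 4.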

Yes.  That works for me... unless others have other suggestions,
I'll try to turn it into text and get an interim version of
Rationale posted RSN.

    john