comments on last call drafts
"Martin J. Dürst"
duerst at it.aoyama.ac.jp
Tue Oct 27 10:15:36 CET 2009
On 2009/10/26 7:35, John C Klensin wrote:
> --On Tuesday, October 13, 2009 14:41 -0400 Dan Winship
> <dan.winship at gmail.com> wrote:
>> * 4.4 Punycode Conversion:
>> The failure conditions identified in the Punycode
>> encoding procedure cannot occur if the input is a
>> U-label as determined by the steps above.
>> But "the steps above" require running the Punycode encoding
>> procedure on the putative U-label to determine its length
>> when ACE encoded, so you won't know if it's a real U-label
>> until after running Punycode and possibly overflowing. So
>> if this sentence was meant to imply that you don't need to
>> check for overflow, then it's wrong (and if it's not meant
>> to imply that, then it's misleading.)
> Sigh. Back when we had a 63-octet limit on U-labels, overflow
> was not possible. We removed that limit and no one thought to
> check this.
Sorry, but if you think that a U-label limited to 63 octets cannot
produce an A-label with more than 63 octets, then that's wrong.
It's very simple to construct an example label:
The ü takes 2 octets, so overall, this is 63 octets in UTF-8. It is
definitely longer than 63 octets in Punycode, because the US-ASCII
characters alone (61 characters/octets) and the xn-- prefix (4
characters/octets) already add up to 65 octets, before the "ü" has
even been encoded. Indeed, playing around a bit at
http://josefsson.org/idn.php/, the
longest U-label of this type that works is
(56 characters/57 bytes in UTF-8), producing
Good to see we discovered another false reason for the
63-octets-on-UTF-8 length limit, and therefore another reason for
abolishing said limit.
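
For anyone who wants to redo the length arithmetic locally, here is a
minimal sketch in Python. It is not the label from my example: it assumes
a hypothetical label made of n copies of "a" followed by a single "ü",
and it uses the "punycode" codec from Python's standard library as a
stand-in for the full ToASCII operation (no Nameprep, no validity checks).
The helper names a_label and sample are made up for illustration.

```python
# Sketch only: raw Punycode plus the ACE prefix, no IDNA validation.
ACE_PREFIX = b"xn--"
MAX_LABEL_OCTETS = 63


def a_label(u_label: str) -> bytes:
    """Return the ACE form: prefix followed by the raw Punycode encoding."""
    return ACE_PREFIX + u_label.encode("punycode")


def sample(n_ascii: int) -> str:
    """Hypothetical test label: n_ascii times 'a', then one u-umlaut."""
    return "a" * n_ascii + "\u00fc"


# 61 ASCII octets + 2 octets for the u-umlaut = 63 octets in UTF-8.
over = sample(61)
print("UTF-8 octets:", len(over.encode("utf-8")))
print("A-label octets:", len(a_label(over)))

# Search for the largest ASCII run whose A-label still fits in 63 octets.
longest = max(
    n for n in range(1, 62)
    if len(a_label(sample(n))) <= MAX_LABEL_OCTETS
)
fits = sample(longest)
print("longest fitting label:", len(fits), "characters,",
      len(fits.encode("utf-8")), "UTF-8 octets,",
      len(a_label(fits)), "A-label octets")
```

Under these assumptions the search should land on the same 56
characters/57 UTF-8 octets as above, but a different choice of ASCII
characters or a different position for the "ü" could in principle shift
the result, so treat it as a cross-check rather than a proof.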
> Have restructured the sentence.
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp mailto:duerst at it.aoyama.ac.jp