Unconditional punycode conversion

Simon Josefsson simon at josefsson.org
Wed Mar 9 17:36:10 CET 2011


Andrew Sullivan <ajs at shinkuro.com> writes:

> On Tue, Mar 08, 2011 at 01:02:14AM +0100, Simon Josefsson wrote:
>> The next-to-last part of IDNA2008-lookup is section 5.5 of RFC 5891:
>> 
>>    The string that has now been validated for lookup is converted to ACE
>>    form by applying the Punycode algorithm to the string and then adding
>>    the ACE prefix ("xn--").
>> 
>> Consider an IDNA2008-lookup input label of "foo".  The above appear to
>> say that this string should be punycode encoded, which seems wrong.
>
> Sections 5.2 and 5.3 suggest that the label should be "in Unicode" and
> need to be a putative U-label.  It is unfortunate that the text
> doesn't explicitly here say that the algorithm already doesn't apply
> to NR-LDH labels, but I think that's correct.  So you shouldn't need
> to run "foo" through punycode because you didn't take that branch: you
> can tell before you get to the IDNA2008 lookup rules that it's an
> NR-LDH label, so it won't be processed.

Thanks -- indeed, making it explicit which sub-sections of section 5
apply to all labels and which only to (putative) U-labels would have helped.
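
To make the difference concrete, here is a minimal Python sketch.  It
uses the "punycode" codec from the Python standard library.  The
function names are made up for this example, and the plain ASCII test
is only a stand-in for a real NR-LDH check, which would also have to
verify LDH syntax and the absence of "--" in positions 3 and 4.

```python
ACE_PREFIX = "xn--"


def to_ace_unconditional(label: str) -> str:
    """Literal reading of RFC 5891 section 5.5: always Punycode-encode."""
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def to_ace_lookup(label: str) -> str:
    """Reading suggested above: ASCII-only labels never reach section 5.5.

    Assumption: str.isascii() approximates "this is an NR-LDH label".
    """
    if label.isascii():
        return label  # not a putative U-label, passed through untouched
    return to_ace_unconditional(label)
```

With the literal reading, "foo" would come out as "xn--foo-", which is
the result that seemed wrong; with the branch in place it stays "foo".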

To verify my understanding: the label "ab--cd" is permitted by IDNA2008
despite having "--" in the third and fourth character positions?
That would be because section 5.4 only applies to non-ASCII labels.
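
Expressed as code, the reading I am asking about would look roughly
like the Python sketch below.  This only encodes the interpretation in
question, not a confirmed answer, and both function names are
hypothetical.

```python
def has_hyphens_in_3_and_4(label: str) -> bool:
    """True if the third and fourth characters of the label are "--"."""
    return label[2:4] == "--"


def passes_hyphen_check(label: str) -> bool:
    """Apply the section 5.4 hyphen test only to non-ASCII labels.

    Assumption: all-ASCII labels are not putative U-labels, so under
    this reading the test is never run on them.
    """
    if label.isascii():
        return True
    return not has_hyphens_in_3_and_4(label)
```

Under this reading "ab--cd" passes, while a non-ASCII label with "--"
in the same positions is rejected.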

/Simon

