Unconditional punycode conversion
Simon Josefsson
simon at josefsson.org
Wed Mar 9 17:36:10 CET 2011
Andrew Sullivan <ajs at shinkuro.com> writes:
> On Tue, Mar 08, 2011 at 01:02:14AM +0100, Simon Josefsson wrote:
>> The next-to-last part of IDNA2008-lookup is section 5.5 of RFC 5891:
>>
>> The string that has now been validated for lookup is converted to ACE
>> form by applying the Punycode algorithm to the string and then adding
>> the ACE prefix ("xn--").
>>
>> Consider an IDNA2008-lookup input label of "foo". The above appears to
>> say that this string should be punycode encoded, which seems wrong.
>
> Sections 5.2 and 5.3 suggest that the label should be "in Unicode" and
> need to be a putative U-label. It is unfortunate that the text
> doesn't explicitly here say that the algorithm already doesn't apply
> to NR-LDH labels, but I think that's correct. So you shouldn't need
> to run "foo" through punycode because you didn't take that branch: you
> can tell before you get to the IDNA2008 lookup rules that it's an
> NR-LDH label, so it won't be processed.

Thanks -- indeed, making it explicit which sub-sections of section 5
apply to all labels and which apply only to (putative) U-labels would
have helped.

To verify my understanding: the label "ab--cd" is permitted by IDNA2008
despite having "--" in the third and fourth character positions?
That would be because section 5.4 only applies to non-ASCII labels.
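
To make the branch concrete, here is a minimal Python sketch of how I read
the above. The function name to_lookup_form is made up for illustration,
the validation steps of sections 5.3 and 5.4 are left out entirely, and
treating "all ASCII" as the test for skipping the conversion is my
assumption, not text from the RFC. It relies on the "punycode" codec in
Python's standard library.

```python
def to_lookup_form(label: str) -> str:
    """Hypothetical helper: return the label as it would be sent to the DNS.

    Sketch only: sections 5.3 and 5.4 validation is NOT performed here.
    """
    if label.isascii():
        # Assumption: an all-ASCII label is not a putative U-label, so the
        # Punycode step of section 5.5 is never reached.
        return label
    # Non-ASCII label: treat it as a putative U-label, apply Punycode and
    # add the ACE prefix, as section 5.5 describes.
    return "xn--" + label.encode("punycode").decode("ascii")


def unconditional_ace(label: str) -> str:
    """What a literal, unconditional reading of section 5.5 would produce."""
    return "xn--" + label.encode("punycode").decode("ascii")


print(to_lookup_form("foo"))          # foo
print(to_lookup_form("b\u00fccher"))  # xn--bcher-kva
print(unconditional_ace("foo"))       # xn--foo-
```

The last line shows why the unconditional reading seems wrong: "foo"
would turn into "xn--foo-", which is not the name anyone meant to look up.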
/Simon