Punycode Mixed-case annotation
vint at google.com
Sun Jun 28 15:47:48 CEST 2009
Well, this is tricky, especially if we adopt a practice, for lookup, of
I think we want to preserve the definitional idea that the Punycode
A-label form and the Unicode U-label form must be convertible into each
other.
My understanding is that the Punycode algorithm treats upper- and
lower-case ASCII letters as equivalent for purposes of conversion (they
have the same values in the algorithm).
I hope someone with more facility with the coding algorithms will jump
in at this point.
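A quick way to check this, offered as a sketch rather than anything authoritative: Python's standard library ships a "punycode" codec, and assuming it follows RFC 3492, decoding the same label with different casing shows which letters are treated as equivalent. The variable names are only for illustration.

```python
# Decode one Punycode label three ways, using the standard-library
# "punycode" codec (assumed here to follow RFC 3492).
lower = b"rsum-bpad".decode("punycode")           # everything lower case
upper_digits = b"rsum-BPAD".decode("punycode")    # encoded digits upper-cased
upper_literals = b"RSUM-bpad".decode("punycode")  # literal letters upper-cased

print(lower, upper_digits, upper_literals)

# Case of the encoded digits (after the last hyphen) does not matter.
print(upper_digits == lower)
# Case of the literal letters (before the last hyphen) is carried through.
print(upper_literals == lower, upper_literals.lower() == lower)
```

So the equivalence appears to hold for the letters that encode the deltas, while the literal ASCII letters before the last hyphen keep whatever case they were given.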
On Jun 28, 2009, at 9:13 AM, Wil Tan wrote:
> Yes, that would work. Should we also discourage the use of such
> labels, and explicitly say that XN-labels containing uppercase
> characters are not A-labels?
> On Sun, Jun 28, 2009 at 9:26 PM, Vint Cerf<vint at google.com> wrote:
>> If we adopt a policy of mapping prior to lookup, and if we map upper
>> case to lower case, it may be that xn--RSUM-bpad.com will be changed
>> to xn--rsum-bpad.com prior to lookup and it will work.
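The mapping step suggested above can be sketched as follows. This is an illustration under assumptions, not the procedure from any draft: `map_for_lookup` and `to_unicode_label` are hypothetical helpers, the mapping is plain ASCII lower-casing, and decoding relies on Python's standard-library "punycode" codec.

```python
ACE_PREFIX = "xn--"

def map_for_lookup(name: str) -> str:
    """Hypothetical pre-lookup mapping: fold upper case to lower case."""
    return name.lower()

def to_unicode_label(label: str) -> str:
    """Decode a single label if it carries the ACE prefix, else return it."""
    if label.lower().startswith(ACE_PREFIX):
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return label

def to_unicode_name(name: str) -> str:
    """Decode every label of a dotted name."""
    return ".".join(to_unicode_label(part) for part in name.split("."))

mapped = map_for_lookup("xn--RSUM-bpad.com")
print(mapped)                   # the name that would actually be looked up
print(to_unicode_name(mapped))  # its Unicode form after mapping
```

With the mapping applied first, the mixed-case annotation is discarded before the label is ever decoded.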
>> On Jun 28, 2009, at 7:20 AM, Wil Tan wrote:
>>> Hi folks,
>>> RFC3492 contained a mixed-case annotation feature which, though not
>>> used in IDNA2003, may affect the IDNA2008 specs. In particular,
>>> code points ([a-z]) that are left unencoded in punycode may be
>>> substituted in upper case, and the result of the ToUnicode operation
>>> preserves them. For example,
>>> ToUnicode("xn--RSUM-bpad.com") = "RéSUMé.com"
>>> From reading the rationale and protocol drafts, I'm not entirely
>>> sure if the input is considered an A-label. The output is certainly
>>> not a U-label since "RSUM" are disallowed code points.
>>> I don't know if this is a problem, but it may warrant at least some
>>> discussion in section 5.4 of idnabis-protocol?
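One possible way to frame the question is a round-trip test, sketched below. It is only an approximation: `is_candidate_a_label` is a hypothetical helper, the "no upper case" test stands in for the full code point rules in the drafts, and the exact (case-sensitive) comparison on re-encoding is an assumption about how strict the check should be.

```python
ACE_PREFIX = "xn--"

def is_candidate_a_label(label: str) -> bool:
    """Rough check: decode, reject upper case, re-encode, compare exactly."""
    if not label.lower().startswith(ACE_PREFIX):
        return False
    try:
        decoded = label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    except UnicodeError:
        return False
    # Stand-in for the disallowed code point test: upper case is rejected.
    if decoded != decoded.lower():
        return False
    # Re-encode and require an exact match with the input label.
    reencoded = ACE_PREFIX + decoded.encode("punycode").decode("ascii")
    return reencoded == label

for candidate in ("xn--rsum-bpad", "xn--RSUM-bpad", "xn--rsum-BPAD"):
    print(candidate, is_candidate_a_label(candidate))
```

Under these assumptions only the all-lower-case form passes. The label with upper-case literals fails the code point test, and the label with upper-case encoded digits fails the exact comparison.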