Potential Erratum re. length limits in RFC 5890
"Martin J. Dürst"
duerst at it.aoyama.ac.jp
Tue Sep 28 11:28:23 CEST 2010
In section 4.2, RFC 5890 says:
Because A-labels (the form actually used in the
DNS) are potentially much more compressed than UTF-8 (and UTF-8 is,
in general, more compressed that UTF-16 or UTF-32), U-labels that
obey all of the relevant symmetry (and other) constraints of these
documents may be quite a bit longer, potentially up to 252 characters
(Unicode code points).
I'm not at all sure where the number of 252 characters is coming from.
It does not at all match the justification given here. Punycode is a
very clever encoding scheme, but it never compresses any character to
less than a full ASCII character. So on this ground, it is impossible to
squeeze more than 63 characters into a label.
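As a rough check of this point, here is a minimal sketch using the "punycode" codec from Python's standard library. The helper name punycode_length and the sample labels are only illustrative assumptions, and the "xn--" prefix of a real A-label is left out, so the actual room in a 63-octet label is even smaller.

```python
def punycode_length(label: str) -> int:
    """Count the ASCII characters in the raw Punycode form of `label`.

    The "xn--" ACE prefix is not included (illustrative helper only).
    """
    return len(label.encode("punycode"))


# Hypothetical sample labels, chosen only for illustration.
samples = ["bücher", "日本語", "ü" * 20, "abc"]

for s in samples:
    code_points = len(s)                  # Unicode code points
    ascii_chars = punycode_length(s)      # ASCII characters after encoding
    utf8_octets = len(s.encode("utf-8"))  # octets in UTF-8
    print(f"{code_points} code points, {ascii_chars} Punycode chars, "
          f"{utf8_octets} UTF-8 octets")
```

For each sample, the Punycode form should be at least as long as the number of code points, while the UTF-8 length is counted in octets and can be larger than either.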
The number may have come from my mail entitled
Re: Definitions limit on label length in UTF-8
with date and message id
Date: Sun, 13 Sep 2009 17:05:58 +0900
Message-ID: <4AACA7E6.1070503 at it.aoyama.ac.jp>
where 252 is indeed the largest number appearing, but the calculations
there were done very carefully in *octets* throughout.
So I think we should submit an erratum to fix this to 252 octets.
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp mailto:duerst at it.aoyama.ac.jp