Minimal IDNAbis requirements
John C Klensin
klensin at jck.com
Fri Jan 11 18:44:33 CET 2008
--On Friday, 11 January, 2008 17:02 +0100 Stephane Bortzmeyer
<bortzmeyer at nic.fr> wrote:
> On Tue, Jan 01, 2008 at 12:53:21PM -0800,
> Erik van der Poel <erikv at google.com> wrote
> a message of 115 lines which said:
>> For U-labels, string lengths are numbers of codepoints, I
>> suppose. I wonder if it is necessary to explicitly state
>> that. I.e. as opposed to the number of bytes in the UTF-8
>> encoding of the U-label, or the UTF-16 encoding, etc.
> Since Unicode labels are never put in the DNS itself, do we
> really need to enforce the DNS size limit?
In principle, no. In practice, stating a limit on both forms
may prevent, e.g., a collection of buffer overflow
"opportunities". Remember that the U-label in UTF-8 is
sometimes longer than the A-label (because punycode provides
better compression for sequences of characters that occupy the
same block) and sometimes shorter (because characters that are
sufficiently scattered in terms of codepoint values can be
pathological for punycode encoding).
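To make the length comparison concrete, here is a rough sketch
that measures one label four ways. It assumes Python's built-in
"punycode" codec and a label that already contains at least one
non-ASCII character; it does no IDNA mapping or validation. The
function name label_lengths is made up for illustration.

```python
def label_lengths(u_label: str) -> dict:
    """Measure a single label in codepoints and in octets per encoding.

    Assumption: u_label has at least one non-ASCII character, so
    prefixing "xn--" to the punycode output is meaningful. No IDNA
    mapping or validity checks are performed here.
    """
    a_label = b"xn--" + u_label.encode("punycode")
    return {
        "codepoints": len(u_label),
        "utf8_octets": len(u_label.encode("utf-8")),
        "utf16_octets": len(u_label.encode("utf-16-be")),  # no BOM
        "a_label_octets": len(a_label),
    }


if __name__ == "__main__":
    # One non-ASCII letter in a short Latin label, and a run of
    # characters from a single block (Cyrillic).
    for label in ("b\u00fccher", "\u0430" * 20):
        print(repr(label), label_lengths(label))
```

In the first label the A-label comes out longer than the UTF-8
form; in the second it comes out shorter. Neither form bounds
the other, which is why the unit of measurement matters.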
Although I'm trying to push for being careful and conservative
about putting U-labels on the wire, some applications (such as
email) will certainly require sending them. That means that
someone who is
clever enough to squeeze the last available octet into a label
may have the accidental opportunity to seriously screw up those
who are less careful. That doesn't sound like good protocol
design to me.
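A minimal sketch of what "a limit on both forms" could look
like in practice. It assumes the 63-octet DNS label limit is
applied to the UTF-8 form as well as to the A-label, which is
only one possible reading (counting codepoints is another). The
name fits_both_forms is hypothetical, and Python's built-in
"punycode" codec stands in for a full IDNA implementation.

```python
DNS_LABEL_MAX_OCTETS = 63  # per-label limit in the DNS


def fits_both_forms(u_label: str, limit: int = DNS_LABEL_MAX_OCTETS) -> bool:
    """True only if the UTF-8 form and the A-label form both fit.

    Assumption: u_label contains at least one non-ASCII character
    and has already been through whatever IDNA checks apply; this
    function looks at sizes only.
    """
    utf8_len = len(u_label.encode("utf-8"))
    a_label_len = len(b"xn--" + u_label.encode("punycode"))
    return utf8_len <= limit and a_label_len <= limit


if __name__ == "__main__":
    # 40 Cyrillic letters: the A-label fits in 63 octets, but the
    # UTF-8 form needs 80, so a receiver with a 63-octet buffer for
    # the U-label would be in trouble.
    squeezed = "\u0430" * 40
    print(fits_both_forms(squeezed))
```

Checking only the A-label would accept that label; checking
both forms rejects it before a less careful receiver has to
cope with it.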