Definitions limit on label length in UTF-8

John C Klensin klensin at jck.com
Tue Sep 15 14:49:24 CEST 2009



--On Tuesday, September 15, 2009 10:57 +0900 "Martin J. Dürst"
<duerst at it.aoyama.ac.jp> wrote:

>...
> As for limits in codepoints, that limit is 63 codepoints. But
> in all cases, these limits only apply to valid Unicode, not to
> stuff before mapping.

And, fwiw, it is worth remembering that "length in codepoints"
is not the same as "length in characters" as understood by most
casual users, i.e., "length in print positions" or the
equivalent.  For scripts that, because of the way Unicode is
structured, require the use of a lot of combining characters,
"length in number of print positions" may be significantly shorter
than "length in codepoints" -- one can imagine half as long or
even shorter with carefully-constructed (pathological) strings.
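
To make the distinction concrete, here is a rough Python sketch that
reports three different "lengths" for the same string.  The function
name and the sample strings are purely illustrative.  The "print
positions" figure is only an approximation: it skips codepoints with a
non-zero canonical combining class, which is not real grapheme-cluster
segmentation and will miscount spacing marks, joiners, and the like.

```python
import unicodedata


def length_report(label: str) -> dict:
    """Return three different notions of 'length' for a label (illustrative)."""
    return {
        # Octets in the UTF-8 encoding of the string.
        "utf8_octets": len(label.encode("utf-8")),
        # Python str length is the number of Unicode codepoints.
        "codepoints": len(label),
        # ASSUMPTION: approximate print positions by counting only
        # codepoints whose canonical combining class is 0.
        "approx_print_positions": sum(
            1 for ch in label if unicodedata.combining(ch) == 0
        ),
    }


# "e" followed by U+0301 COMBINING ACUTE ACCENT, three times over.
sample = "e\u0301" * 3
report = length_report(sample)

# A deliberately pathological string: one base letter carrying 62
# combining accents, for 63 codepoints in total.
pathological = "a" + "\u0301" * 62
pathological_report = length_report(pathological)
```

For the first sample the codepoint count is twice the approximate
print-position count; the pathological one fills 63 codepoints while
occupying roughly a single print position.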


There is nothing the WG can (or should try to) do about that,
but we should be careful to avoid saying anything that implies
otherwise.

   john


