Definitions limit on label length in UTF-8

"Martin J. Dürst" duerst at it.aoyama.ac.jp
Wed Sep 16 10:03:43 CEST 2009


On 2009/09/15 21:49, John C Klensin wrote:

> And, fwiw, it is worth remembering that "length in codepoints"
> is not the same as "length in characters" as understood by most
> casual users, i.e., "length in print positions" or the
> equivalent.  For scripts that, because of the way Unicode is
> structured, require the use of a lot of combining characters,
> "length in number of print positions" may be significantly shorter
> than "length in codepoints" -- one can imagine half as long or
> even shorter with carefully-constructed (pathological) strings.

[mostly off-topic]
In addition, "number of print positions" is in itself a rather vague and 
not very useful concept for scripts that don't have much of a tradition 
of using the same width for each character.
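
To make the three measures in this thread concrete, here is a minimal Python sketch. It is an illustration only and not anything defined by the IDNA documents. The function name length_measures is hypothetical, and the "print positions" figure is a deliberately crude assumption: it just skips code points whose general category is Mn or Me (nonspacing and enclosing marks), which is exactly the kind of approximation that breaks down for scripts without a fixed-width tradition.

```python
import unicodedata

def length_measures(label):
    """Return (UTF-8 octets, code points, approximate print positions).

    The third value is an assumption for illustration: it counts every
    code point that is not a nonspacing (Mn) or enclosing (Me) mark.
    """
    octets = len(label.encode("utf-8"))
    code_points = len(label)
    positions = sum(
        1 for ch in label if unicodedata.category(ch) not in ("Mn", "Me")
    )
    return octets, code_points, positions

# Precomposed u-umlaut (U+00FC): one code point, two octets.
print(length_measures("D\u00fcrst"))            # (6, 5, 5)

# A pathological string: one base letter plus three combining marks.
print(length_measures("a\u0300\u0301\u0302"))   # (7, 4, 1)
```

The second example shows the effect John describes: four code points (seven octets in UTF-8) that a casual user would see as a single character.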

Regards,    Martin.

-- 
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:duerst at it.aoyama.ac.jp

