Potential Erratum re. length limits in RFC 5890

Markus Scherer markus.icu at gmail.com
Wed Sep 29 00:12:38 CEST 2010


On Tue, Sep 28, 2010 at 3:02 PM, Kenneth Whistler <kenw at sybase.com> wrote:

> Those are the minimum and maximum cases. For some more
> typical mix of characters from the BMP, the UTF-8 length
> will be >= 63 and <= 252 octets.
>

Correct. (Or actually, with a "mix of characters from the BMP" you would get
at most 189 UTF-8 bytes.)
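
For reference, the arithmetic behind those numbers can be sketched as below.
This assumes the case under discussion is a label of 63 code points; the
sample characters and the helper name utf8_len are only illustrative and
do not come from the RFC.

```python
# Sketch: UTF-8 length of a 63-code-point label, by kind of character.
# Assumption: the label has exactly 63 code points; samples are arbitrary.
LABEL_CODE_POINTS = 63

SAMPLES = {
    "ASCII (U+0061)": "a",                    # 1 byte per code point
    "BMP, 2-byte (U+00E9)": "\u00e9",         # 2 bytes per code point
    "BMP, 3-byte (U+65E5)": "\u65e5",         # 3 bytes, the BMP maximum
    "supplementary (U+10348)": "\U00010348",  # 4 bytes per code point
}

def utf8_len(ch, count=LABEL_CODE_POINTS):
    """Return the UTF-8 byte length of `ch` repeated `count` times."""
    return len((ch * count).encode("utf-8"))

for name, ch in SAMPLES.items():
    print(name, utf8_len(ch))
```

So 252 bytes needs every one of the 63 code points to be supplementary,
while BMP-only text tops out at 63 * 3 = 189.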

However, Martin quoted the RFC as saying

  [...] U-labels that
  obey all of the relevant symmetry (and other) constraints of these
  documents may be quite a bit longer, potentially up to 252 characters
  (Unicode code points).

How do 252 *Unicode code points* relate to the A-label length limit of 63
*octets* (where each is an ASCII letter or digit)?
(Aside from the xn-- prefix, which reduces the label contents to 59 octets.)
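
One way to poke at this is the rough sketch below, which uses Python's
built-in "punycode" codec. The helper to_a_label is hypothetical and does
none of the IDNA2008 validity or mapping checks; it only prepends the ACE
prefix to the raw Punycode output so the octet count can be compared with
the limit.

```python
# Sketch: compare A-label length against the 63-octet label limit.
# Assumption: to_a_label is an illustrative helper, not a full IDNA
# conversion (no validity checks, no mapping, no case handling).
MAX_LABEL_OCTETS = 63
ACE_PREFIX = "xn--"

def to_a_label(u_label):
    """Prepend the ACE prefix to the Punycode encoding of `u_label`."""
    return ACE_PREFIX + u_label.encode("punycode").decode("ascii")

def fits_in_label(u_label):
    """True if the resulting A-label is at most 63 octets long."""
    return len(to_a_label(u_label)) <= MAX_LABEL_OCTETS

print(to_a_label("b\u00fccher"))       # a short U-label
print(fits_in_label("\u00fc" * 60))    # 60 non-ASCII code points
```

Punycode emits at least one ASCII character for every input code point, so
60 code points already give an A-label longer than 63 octets.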

markus