Definitions limit on label length in UTF-8

Shawn Steele Shawn.Steele at microsoft.com
Mon Sep 14 21:58:52 CEST 2009


You're suggesting some smaller number of Unicode characters?  Are you counting in UTF-16 code units?

-Shawn

-----Original Message-----
From: Vint Cerf [mailto:vint at google.com] 
Sent: Monday, September 14, 2009 12:42
To: Shawn Steele; idna-update at alvestrand.no
Subject: Re: Definitions limit on label length in UTF-8

Expressing limits in Unicode characters (code points), regardless of
encoding scheme, seems like a useful way to offer insight for implementers. V

----- Original Message -----
From: idna-update-bounces at alvestrand.no <idna-update-bounces at alvestrand.no>
To: idna-update at alvestrand.no <idna-update at alvestrand.no>
Sent: Mon Sep 14 12:59:08 2009
Subject: Definitions limit on label length in UTF-8

FWIW: I think that a UTF-8 length should NOT be a limit for Punycode labels.
How an app (or OS) encodes a decoded Punycode string internally is up to
them.  I doubt we'd express such limits in GB-18030 or EUC-JP.
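
As an illustration of why the unit of measurement matters, here is a minimal Python sketch that measures one decoded label in code points, UTF-8 octets, UTF-16 code units, and octets of the ASCII (ACE) form. The function name `label_lengths` and the sample labels are hypothetical, chosen only for this example. It uses Python's built-in `punycode` codec directly and assumes the label has already been through whatever mapping and validation the protocol requires.

```python
def label_lengths(label: str) -> dict:
    """Measure one decoded (Unicode) label in several units."""
    if label.isascii():
        # A pure-ASCII label is carried as-is; no "xn--" prefix is added.
        ace = label.encode("ascii")
    else:
        # Raw Punycode only: no Nameprep/mapping step is applied here.
        ace = b"xn--" + label.encode("punycode")
    return {
        "code_points": len(label),
        "utf8_octets": len(label.encode("utf-8")),
        # Each UTF-16 code unit is two bytes; no BOM with "utf-16-le".
        "utf16_units": len(label.encode("utf-16-le")) // 2,
        "ace_octets": len(ace),
    }


if __name__ == "__main__":
    # "bücher" and a single non-BMP character (U+10348).
    for sample in ("b\u00fccher", "\U00010348"):
        print(repr(sample), label_lengths(sample))
```

The same label yields four different lengths, so a limit stated in one unit does not translate directly into the others.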

The only case I can think of for a UTF-8 limit is in the event someone made
a UTF-8-clean DNS in the future.  However, an ASCII Punycode label is clearly
not the same thing, even if it represents a similar string.

In practice, all that imposing a UTF-8 length limit does is break IDNA2003
further and make implementations harder to code and a little more error-prone.
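
A hedged sketch of the kind of case at stake: a label whose ASCII (ACE) form fits the 63-octet DNS label limit but whose UTF-8 form does not. The helper names `fits_ace` and `fits_utf8` and the sample label are hypothetical and not taken from any IDNA draft; the sketch again uses Python's raw `punycode` codec and skips mapping and validation.

```python
MAX_LABEL_OCTETS = 63  # DNS limit on the length of a single label


def fits_ace(label: str) -> bool:
    """True if the "xn--" + Punycode form fits in a DNS label."""
    ace = b"xn--" + label.encode("punycode")
    return len(ace) <= MAX_LABEL_OCTETS


def fits_utf8(label: str) -> bool:
    """True if the UTF-8 form would fit under the same octet limit."""
    return len(label.encode("utf-8")) <= MAX_LABEL_OCTETS


if __name__ == "__main__":
    # Assumed sample: 30 copies of U+4E2D, each 3 octets in UTF-8.
    label = "\u4e2d" * 30
    print("fits as ACE:  ", fits_ace(label))
    print("fits as UTF-8:", fits_utf8(label))
```

A label like this is acceptable when only the ACE form is length-checked, and an additional UTF-8 limit would reject it.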

-Shawn