<html><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; ">better to count two things:<div><br></div><div>1. length in Unicode code points</div><div>2. length in Punycoded ASCII octets.</div><div><br></div><div>v</div><div><br><div><div>On Sep 16, 2009, at 7:24 AM, jean-michel bernier de portzamparc wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite">why not count in "punycode points", i.e. the number of bytes of the transcoded text + four for the header?<br>Potzamparc<br><br><br><div class="gmail_quote">2009/9/15 Shawn Steele <span dir="ltr">&lt;<a href="mailto:Shawn.Steele@microsoft.com">Shawn.Steele@microsoft.com</a>&gt;</span><br> <blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">That's fine, I was just curious :)<br> <div class="im"><br> -Shawn<br> <br> -----Original Message-----<br> From: Vint Cerf [mailto:<a href="mailto:vint@google.com">vint@google.com</a>]<br> </div><div><div></div><div class="h5">Sent: Monday, September 14, 2009 14:58<br> To: Shawn Steele<br> Cc: <a href="mailto:idna-update@alvestrand.no">idna-update@alvestrand.no</a><br> Subject: Re: Definitions limit on label length in UTF-8<br> <br> no i am counting in code points regardless of how expressed - is that<br> a reasonable metric?<br> <br> v<br> <br> On Sep 14, 2009, at 3:58 PM, Shawn Steele wrote:<br> <br> > You're suggesting some smaller # of Unicode characters? 
Are you<br> > counting in UTF-16?<br> ><br> > -Shawn<br> ><br> > -----Original Message-----<br> > From: Vint Cerf [mailto:<a href="mailto:vint@google.com">vint@google.com</a>]<br> > Sent: Monday, September 14, 2009 12:42<br> > To: Shawn Steele; <a href="mailto:idna-update@alvestrand.no">idna-update@alvestrand.no</a><br> > Subject: Re: Definitions limit on label length in UTF-8<br> ><br> > Expressing limits in Unicode characters (code points) regardless of<br> > coding<br> > scheme seems like a useful way to offer insight for implementers. V<br> ><br> > ----- Original Message -----<br> > From: <a href="mailto:idna-update-bounces@alvestrand.no">idna-update-bounces@alvestrand.no</a> &lt;<a href="mailto:idna-update-bounces@alvestrand.no">idna-update-bounces@alvestrand.no</a><br> > &gt;<br> > To: <a href="mailto:idna-update@alvestrand.no">idna-update@alvestrand.no</a> &lt;<a href="mailto:idna-update@alvestrand.no">idna-update@alvestrand.no</a>&gt;<br> > Sent: Mon Sep 14 12:59:08 2009<br> > Subject: Definitions limit on label length in UTF-8<br> ><br> > FWIW: I think that UTF-8 should NOT be a limit for Punycode. How an<br> > app (or<br> > OS) encodes a decoded Punycode string internally is up to them. I<br> > doubt<br> > we'd express such limits in GB-18030 or EUC-JP?<br> ><br> > The only case I can think of for a UTF-8 limit is in the event<br> > someone made<br> > a UTF-8 clean DNS in the future. 
However an ASCII punycode label is<br> > clearly<br> > not the same thing, even if it represents a similar string.<br> ><br> > In practice, all imposing a UTF-8 length limit does is to break<br> > IDNA2003<br> > further, make it harder to code, and a little more error-prone.<br> ><br> > -Shawn<br> > _______________________________________________<br> > Idna-update mailing list<br> > <a href="mailto:Idna-update@alvestrand.no">Idna-update@alvestrand.no</a><br> > <a href="http://www.alvestrand.no/mailman/listinfo/idna-update" target="_blank">http://www.alvestrand.no/mailman/listinfo/idna-update</a><br> ><br> <br> <br> _______________________________________________<br> Idna-update mailing list<br> <a href="mailto:Idna-update@alvestrand.no">Idna-update@alvestrand.no</a><br> <a href="http://www.alvestrand.no/mailman/listinfo/idna-update" target="_blank">http://www.alvestrand.no/mailman/listinfo/idna-update</a><br> </div></div></blockquote></div><br></blockquote></div><br></div></body></html>