<br><div class="gmail_quote">On Wed, Sep 2, 2009 at 2:50 AM, Andrew Sullivan <span dir="ltr"><<a href="mailto:ajs@shinkuro.com">ajs@shinkuro.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<div class="im">On Wed, Sep 02, 2009 at 02:30:08AM +1000, Wil Tan wrote:<br>
> ><br>
> Agreed. I'm not actually advocating special rules for A-label matching, just<br>
> pointing out an inconsistency where label1 is a valid A-label, and is equivalent<br>
> to label2 which is an invalid A-label.<br>
<br>
</div>Yes, this is a strange property of A-labels, but it's ok: A-labels are<br>
a subset of LDH-labels. Until you drew attention to this, it wasn't<br>
obvious to me that even if different LDH-labels matched, one of them<br>
being an A-label was not enough to make the rest of them A-labels.<br>
It's still ok, but it is a subtle point, and the sort of sharp corner<br>
that can snag people when they implement, for sure.<br>
<div class="im"><br>
</div></blockquote><div><br></div><div>I think I'm ok with this. Again, not too concerned about definitions, except how they are interpreted in the protocol.</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<div class="im">> If the A-label qualification is just a definition, it wouldn't matter much.<br>
> But it's how we define registration and lookup behavior where A-label is<br>
> concerned that I'm afraid this could cause unintended consequences in<br>
> software implementations.<br>
><br>
> As it is currently defined, the IDNA2008 protocol allows conforming<br>
> applications to behave in different ways (even without mapping).<br>
<br>
</div>This is exactly what I am denying. I think that the definitions are<br>
complete enough that truly conforming applications will never have<br>
this situation. </blockquote><div><br></div><div>I'm not that familiar with the practical aspects of rfc2119, so I could well be wrong. In idnabis-protocol-14, section 5.3 A-label input:</div><div><br></div><div> If the input to this procedure appears to be an A-label (i.e., it</div>
<div> starts in "xn--"), the lookup application MAY attempt to convert it</div><div> to a U-label and apply the tests of Section 5.4 and the conversion of</div><div> Section 5.5 to that form. </div><div><br>
</div><div>So an application doesn't _have_ to convert it to a U-label, and can go ahead and look up the domain name. This would be true of non-IDNA-aware applications, as well as conforming IDNA2008 applications that choose the easy way out (they just cannot display the label in native form).</div>
<div><br></div><div> If the label is converted to Unicode</div><div> (i.e., to U-label form) using the Punycode decoding algorithm, then</div><div> the processing specified in those two sections MUST be performed, and</div>
<div>    the label MUST be rejected if the resulting label is not identical to</div><div>    the original.</div><div><br></div><div>If an application gets to this stage, it then MUST validate the U-label and, if there are uppercase characters in there, MUST refuse the lookup.</div>
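<div><br></div><div>To make the two paths concrete, here is a rough Python sketch of the optional check as I read the quoted text. It is not taken from the draft: the function name lookup_accepts_a_label is hypothetical, and the uppercase / non-ASCII test is only a stand-in for the much longer list of tests in Section 5.4. It relies on Python's built-in punycode codec.</div>

```python
ACE_PREFIX = "xn--"

def lookup_accepts_a_label(label):
    """Return True if `label` survives the decode / test / re-encode round trip.

    Hypothetical helper: a simplified reading of the optional check in
    section 5.3, not a full IDNA2008 implementation.
    """
    # Only labels that appear to be A-labels are in scope. The prefix is
    # matched case-insensitively so that "XN--..." reaches the round trip.
    if not label.lower().startswith(ACE_PREFIX):
        return False
    try:
        # Punycode-decode everything after the prefix to get the candidate U-label.
        u_label = label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
        # Stand-in for the Section 5.4 tests (assumption): a U-label needs at
        # least one non-ASCII character and must not contain uppercase ones.
        if u_label.isascii() or any(ch.isupper() for ch in u_label):
            return False
        # Section 5.5 stand-in: convert back to the ASCII-compatible form.
        round_trip = ACE_PREFIX + u_label.encode("punycode").decode("ascii")
    except UnicodeError:
        # Non-ASCII input or malformed Punycode.
        return False
    # The label MUST be rejected if the result is not identical to the original.
    return round_trip == label
```

<div>With this sketch, "xn--bcher-kva" is accepted, while "XN--bcher-kva" and "xn--bcher-KVA" fail the identity comparison and "xn--Bcher-kva" fails the uppercase test. An application that skips the conversion entirely would send all four to the DNS, which is the difference in behavior I am asking about.</div>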
<div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">(There is, however, a nasty corner case as a result<br>
of the 0x20 strategy: as currently defined, no IDNA2008 domain is<br>
compatible with the 0x20 strategy, which is an important thing for<br>
DNSEXT to hear.) Of course, it is quite likely that, working from the<br>
current text, an implementation might not be "truly conforming",<br>
because this subtle point (matching LDH-labels, one of which is an<br>
A-label, need not all be A-labels) could get overlooked.</blockquote><div><br></div><div>I'm not too worried about that, though. IDNA2008 is supposed to work at a higher level than DNS, leaving the underlying case-insensitive property of the DNS unchanged. As long as the resolver library keeps the original case permutation and restores it from the answer as part of the demux step, all that case mutilation happens only on the wire, no?</div>
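<div><br></div><div>Roughly what I have in mind, as a Python sketch. The names randomize_case and demux are made up for illustration, no real resolver API is used, and the details of the 0x20 proposal itself may differ from this simplification.</div>

```python
import random

def randomize_case(name, rng):
    """Pick upper or lower case at random for each ASCII letter (the 0x20 bit)."""
    return "".join(
        (ch.upper() if rng.random() > 0.5 else ch.lower())
        if ch.isascii() and ch.isalpha() else ch
        for ch in name
    )

def demux(original, sent, echoed):
    """Hand the caller's original spelling back if the response echoes the
    query name exactly as sent; return None otherwise (possible spoof)."""
    if echoed != sent:
        return None
    return original

# Hypothetical usage: the application only ever sees `original`.
rng = random.Random(2009)
original = "xn--bcher-kva.example"
on_wire = randomize_case(original, rng)
answer_name = demux(original, on_wire, on_wire)
```

<div>Here the mixed-case form exists only in on_wire. The application above the resolver gets back the lowercase name it asked for, so it never has to decide whether the mixed-case spelling is a valid A-label.</div>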
<div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"> I claim that<br>
conforming applications won't behave in different ways because upper<br>
case ASCII characters turn out not to be allowed in A-labels. That's<br>
pretty amazing, but it appears to be true.<br>
<br></blockquote><div><br></div><div>It may just be a case of my not seeing it through the same lenses as you. Please see my earlier comments.</div><div><br></div><div>Thanks.</div><div><br></div><div><br></div><div>=wil</div>
</div>