I'm a bit puzzled. If I take a "raw" IDN, like<br><br><a href="http://Bücher.com">http://Bücher.com</a><br><br>and paste it into an IDNA-unaware browser, it won't work. We should expect that of browsers that don't handle IDN. We'd need to paste in a punycode version for it to work:
<a href="http://xn--bcher-kva.com/">xn--bcher-kva.com</a><br><br>If I take a "raw" IDN, like<br><br><a href="http://Buecher．com">http://Buecher．com</a> // that dot is a full-width dot (U+FF0E)<br><br>and paste it into an IDNA-unaware browser, it also won't work. We should also expect that of browsers that don't handle IDN. We'd need to paste in a normalized version for it to work:
<a href="http://Buecher.com">http://Buecher.com</a><br><br>That is, it doesn't appear that the dot conversion is much different from the punycode conversion (and case/normalization folding): each is something that has to be done before passing the name off to DNS for it to work correctly.
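<br><br>As a rough sketch of what I mean, here are the two conversions side by side in Python. The helper names map_dots and to_ace are made up for illustration, and the per-label step leans on Python's built-in "idna" codec (an IDNA2003-style ToASCII), so treat it as one possible arrangement rather than a reference implementation:

```python
# Hypothetical helpers: a sketch of "map the dots, then convert each label".
# The three non-ASCII label separators named in the thread:
DOT_VARIANTS = ("\u3002", "\uff0e", "\uff61")


def map_dots(host):
    """Replace ideographic/fullwidth/halfwidth full stops with ASCII '.'."""
    for dot in DOT_VARIANTS:
        host = host.replace(dot, ".")
    return host


def to_ace(host):
    """Map the dots, split into labels, and convert each label to ASCII form."""
    labels = map_dots(host).split(".")
    # The "idna" codec nameprep-folds and punycode-encodes non-ASCII labels;
    # labels that are already ASCII pass through unchanged.
    return ".".join(label.encode("idna").decode("ascii") for label in labels)


print(to_ace("B\u00fccher.com"))       # xn--bcher-kva.com
print(to_ace("Buecher\uff0ecom"))      # fullwidth dot mapped to an ASCII dot
```

Both inputs go through the same preprocessing pass before anything is handed to DNS; the dot mapping is simply the first step of it.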
<br><br>Mark<br><br><div class="gmail_quote">On Dec 8, 2007 5:15 AM, John C Klensin <<a href="mailto:klensin@jck.com">klensin@jck.com</a>> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<br><br>--On Saturday, 08 December, 2007 12:06 +0800 YAO Jiankang<br><div class="Ih2E3d"><<a href="mailto:yaojk@cnnic.cn">yaojk@cnnic.cn</a>> wrote:<br><br>>> Without that mapping, the string cannot be parsed into labels
<br>>> since conventional (legacy) FQDN parsers separate labels<br>>> _only_ on ASCII period, 0x2E, aka U+002E.<br>><br>> true. non IDNA-aware software can not parse IDN.<br>><br>>><br>>> Not being able to parse the string into labels would result in
<br>>> rather serious lookup failures, but the problem is even worse<br>>> because:<br>><br>> if I understand it correctly,<br>> it seems that you have the following assumption:<br>> The domain name with the dot of U+3002 (ideographic full
<br>> stop), U+FF0E (fullwidth full stop), or U+FF61 (halfwidth<br>> ideographic full stop) is not IDN. so this domain will be<br>> sent to DNS lookup server without IDNA process. actually,<br>> according to RFC3490, it is IDN.
<br>> Since it is IDN, it must be dealt with IDNA before being sent<br>> to DNS lookup. if that happens, there have not the problem as<br>> you said.<br><br></div>That is not my assumption. Perhaps I can explain this better by
<br>means of an example. I can't do this exactly, so suppose that<br>the character "?" is actually U+3002 (ideographic full stop).<br><br>Someone sends me a URL in email. The URL consists of<br><br> <a href="http://www.xn--0xaat.example.com/" target="_blank">
http://www.xn--0xaat.example.com/</a><br><br>where the A-label corresponds to the U-label φοο.<br><br>That example uses standard dots. Suppose I do not have an<br>IDNA-aware browser. But I can take the string from your mail,
<br>paste it in, parse it into<br> "www", "xn--0xaat", "example", and "com",<br>look things up, and obtain the page. That is how IDNA is<br>supposed to work. As long as the user sticks to passing the
<br>ACE form around, applications do not need to be IDNA-aware.<br><br>However, assume that you send me a URL, that looks (substituting<br>"?" as above) like:<br><br> <a href="http://www.xn--0xaat?example.com/" target="_blank">
http://www.xn--0xaat?example.com/</a><br><br>I copy that out and paste it into my browser, which we are still<br>assuming is not IDNA-aware. Because the browser is not<br>IDNA-aware, the domain name is parsed into<br><br>
"www", "xn--0xaat?example" and "com"<br><br>This is obviously wrong and will obviously result in a failure<br>to find the name in a query. Worse, that parsing is performed<br>in places and with software other than DNS resolvers. For
<br>example, there are several security-related protocols that use<br>DNS names as identifiers but keep them in internal DNS form (a<br>list of labels stored with lengths and values, not separated by<br>dots). Depending on how they are designed, even modern
<br>implementations are not required to be IDNA-aware (because IDNA<br>is transparent). But the dot-mappings cannot be transparent:<br>every system, module, or application that has to parse an FQDN<br>into components must know what is, and is not, a
<br>label-separation character.<br><font color="#888888"><br> john<br></font><div><div></div><div class="Wj3C7c"></div></div></blockquote></div><br><br clear="all">
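<br>For reference, the label-splitting failure John describes can be reproduced with a minimal Python sketch. It assumes a legacy parser that separates labels only on the ASCII period (U+002E) and uses the URL host from his example with U+3002 in place of "?"; the function name legacy_labels is hypothetical:

```python
def legacy_labels(host):
    """Split a host name the way a non-IDNA-aware parser would:
    only the ASCII period (U+002E) counts as a label separator."""
    return host.split(".")


IDEOGRAPHIC_FULL_STOP = "\u3002"

# All-ASCII dots: the A-label travels transparently.
good = "www.xn--0xaat.example.com"
# Same host, but one separator is U+3002 instead of U+002E.
bad = "www.xn--0xaat" + IDEOGRAPHIC_FULL_STOP + "example.com"

print(legacy_labels(good))  # ['www', 'xn--0xaat', 'example', 'com']
print(legacy_labels(bad))   # only three labels; the middle one contains U+3002
```

The second result has three labels instead of four, so a lookup on it cannot find the intended name.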
<br>-- <br>Mark