Standards and localization (was Dot-mapping)
YAO Jiankang
yaojk at cnnic.cn
Sat Dec 8 20:48:35 CET 2007
A very good example.
Yes, there is no big difference between "http://Bücher.com" and "http://Buecher．com" (where the dot is a full-width dot).
Both must be processed by IDNA before being sent to the DNS for lookup.
Thanks a lot for your nice example and practice.
YAO Jiankang
----- Original Message -----
From: Mark Davis
To: John C Klensin
Cc: YAO Jiankang ; Yangwoo Ko ; idna-update at alvestrand.no ; fujiwara at jprs.co.jp
Sent: Sunday, December 09, 2007 3:28 AM
Subject: Re: Standards and localization (was Dot-mapping)
I'm a bit puzzled. If I take a "raw" IDN, like
http://Bücher.com
and paste it into an IDNA-unaware browser, it won't work. We should expect that of browsers that don't handle IDN. We'd need to paste in a punycode version for it to work: xn--bcher-kva.com
If I take a "raw" IDN, like
http://Buecher．com // that dot is a full-width dot
and paste it into an IDNA-unaware browser, it also won't work. We should also expect that of browsers that don't handle IDN. We'd need to paste in a normalized version for it to work: http://Buecher.com
That is, it doesn't appear that the dot conversion is much different than the punycode conversion (and case/normalization folding) -- something that has to be done before passing off to DNS for it to work correctly.
Mark
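The preprocessing described above (map the dots, then convert each label) can be sketched in Python. This is only an illustration under stated assumptions: it relies on the interpreter's built-in "idna" codec, which follows the IDNA2003 rules of RFC 3490, and the helper name to_dns_form is hypothetical, not something from this thread.

```python
import re

# The four characters RFC 3490 (section 3.1) says must be recognized as dots.
IDNA_DOTS = re.compile("[\u002E\u3002\uFF0E\uFF61]")


def to_dns_form(host: str) -> str:
    """Hypothetical helper: split on any IDNA dot, convert each label to
    its ASCII (ACE) form, and rejoin with the ASCII period U+002E."""
    labels = IDNA_DOTS.split(host)
    # Encoding one label at a time keeps the dot handling explicit here
    # instead of leaving it to the codec.
    ascii_labels = [label.encode("idna").decode("ascii") for label in labels]
    return ".".join(ascii_labels)


if __name__ == "__main__":
    print(to_dns_form("Bücher.com"))        # xn--bcher-kva.com
    print(to_dns_form("Buecher\uFF0Ecom"))  # full-width dot mapped to U+002E
```

Both conversions happen in the same place, before anything is handed to the resolver, which is the point being made: neither input works unless some IDNA-aware step runs first.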
On Dec 8, 2007 5:15 AM, John C Klensin <klensin at jck.com> wrote:
--On Saturday, 08 December, 2007 12:06 +0800 YAO Jiankang
<yaojk at cnnic.cn> wrote:
>> Without that mapping, the string cannot be parsed into labels
>> since conventional (legacy) FQDN parsers separate labels
>> _only_ on ASCII period, 0x2E, aka U+002E.
>
> True. Non-IDNA-aware software cannot parse an IDN.
>
>>
>> Not being able to parse the string into labels would result in
>> rather serious lookup failures, but the problem is even worse
>> because:
>
> If I understand it correctly,
> it seems that you have the following assumption:
> a domain name that uses U+3002 (ideographic full
> stop), U+FF0E (fullwidth full stop), or U+FF61 (halfwidth
> ideographic full stop) as its dot is not an IDN, so it will be
> sent to the DNS lookup server without IDNA processing. Actually,
> according to RFC 3490, it is an IDN.
> Since it is an IDN, it must be processed by IDNA before being sent
> to DNS lookup. If that happens, the problem you describe does
> not arise.
That is not my assumption. Perhaps I can explain this better by
means of an example. I can't do this exactly, so suppose that
the character "?" is actually U+3002 (ideographic full stop).
Someone sends me a URL in email. The URL consists of
http://www.xn--0xaat.example.com/
where the A-label corresponds to the U-label φοο.
That example uses standard dots. Suppose I do not have an
IDNA-aware browser. But I can take the string from your mail,
paste it in, parse it into
"www", "xn--0xaat", "example", and "com",
look things up, and obtain the page. That is how IDNA is
supposed to work. As long as the user sticks to passing the
ACE form around, applications do not need to be IDNA-aware.
However, assume that you send me a URL that looks (substituting
"?" as above) like:
http://www.xn--0xaat?example.com/
I copy that out and paste it into my browser, which we are still
assuming is not IDNA-aware. Because the browser is not
IDNA-aware, the domain name is parsed into
"www", "xn--0xaat?example", and "com"
This is obviously wrong and will obviously result in a failure
to find the name in a query. Worse, that parsing is performed
in places and with software other than DNS resolvers. For
example, there are several security-related protocols that use
DNS names as identifiers but keep them in internal DNS form (a
list of labels stored with lengths and values, not separated by
dots). Depending on how they are designed, even modern
implementations are not required to be IDNA-aware (because IDNA
is transparent). But the dot-mappings cannot be transparent:
every system, module, or application that has to parse an FQDN
into components must know what is, and is not, a
label-separation character.
john
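The parsing difference in the example above can be reproduced with a short Python sketch. It is illustrative only: both function names are hypothetical, the "?" placeholder is replaced by the real U+3002, and the separator set is the one listed in RFC 3490, section 3.1.

```python
import re


def legacy_split(fqdn: str) -> list:
    """Conventional parser: labels are separated only on U+002E."""
    return fqdn.split(".")


# Separator set an IDNA-aware parser would have to know about.
IDNA_DOTS = re.compile("[\u002E\u3002\uFF0E\uFF61]")


def idna_aware_split(fqdn: str) -> list:
    """Parser that also treats U+3002, U+FF0E and U+FF61 as dots."""
    return IDNA_DOTS.split(fqdn)


ace_only = "www.xn--0xaat.example.com"         # standard dots throughout
mixed = "www.xn--0xaat\u3002example.com"       # one ideographic full stop

if __name__ == "__main__":
    print(legacy_split(ace_only))      # four labels, as intended
    print(legacy_split(mixed))         # three labels, one containing U+3002
    print(idna_aware_split(mixed))     # four labels again
```

The ACE-only name parses correctly with the legacy splitter, so software handling it needs no IDNA knowledge. The mixed name parses correctly only when the splitter itself knows the extra separators, which is why the dot mapping is not transparent in the way the A-label is.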
--
Mark