I'm a bit lost.<br><br>> Unicode say ö and ø are different, but that is definitely not what<br>> people in Norway or Sweden (or Denmark for that matter) think.<br><br>It seems to me that most people are in rough consensus on the following points.
<br><ol><li>People often consider words with different spellings to be the same word, or at least equivalent.<br></li><li>There
are many, many examples of this:</li><ol><li>telephone and telefon; Duerst and
Dürst; Torbjørn and Torbjörn; Mark and Marc; Teri, Terry, Terri; 中國 "China" (traditional), 中国 "China" (simplified), and so
on.</li><li>[For those without UTF-8 mailers]<br>telephone and telefon; Duerst and D\u00FCrst; Torbj\u00F8rn and Torbj\u00F6rn; Mark and Marc; Teri, Terry, Terri; \u4E2D\u570B "China" (traditional), \u4E2D\u56FD "China" (simplified), and so on.
<br></li></ol><li>These equivalences are very language-dependent: two words
considered equivalent in one language may not be considered equivalent
in other languages, or even in two different orthographies for the same language.</li><li>Normalizing or matching these kinds of spelling differences is outside the scope of IDNA, although country-specific registries might want to take them into account when considering issues such as bundling of domain names.
</li></ol>If this is not the case, could someone say where they
disagree with one or more of the above points?<br><br>On the other hand, case and width folding are a very different matter. For <i>lowercasing</i> (case folding)
there is very little variation. The chief exception is the Turkish i: Turkish lowercases I to dotless ı rather than to i.
Even so, you really don't want different processes lowercasing differently.
We don't want <a href="http://%c3%87i%c3%87ek.com/" target="_blank">http://ÇIÇEK.com</a> to be interpreted as two different strings on two different browsers:<br>
<p>
</p><ul><li>çiçek = xn--iek-1lab on one system, and</li><li>çıçek = xn--ek-3iaa38a on another system</li></ul>I think it's perfectly reasonable for a standardized folding for IDN to be defined in a separate RFC, but I would be concerned if it were missing altogether.
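<br><br>To make the divergence concrete, here is a minimal Python sketch. It is only an illustration, not anything taken from the IDNA documents. The helper names are hypothetical, and <code>lower_turkish</code> hard-codes the two Turkish-specific mappings as an assumption about what a Turkish-locale system might do. Python's built-in <code>punycode</code> codec does the encoding; no nameprep or other validation is applied.

```python
# Sketch: one label, lowercased two ways, gives two different ACE labels.

def lower_default(label):
    # Locale-independent lowercasing: capital I becomes i.
    return label.lower()

def lower_turkish(label):
    # Assumed Turkish-style rule (hypothetical helper):
    # dotted capital I (U+0130) becomes i, plain capital I becomes
    # dotless i (U+0131); everything else is lowercased as usual.
    return label.replace("\u0130", "i").replace("I", "\u0131").lower()

def to_ace(label):
    # Punycode-encode a single label and prepend the ACE prefix.
    return "xn--" + label.encode("punycode").decode("ascii")

label = "\u00c7I\u00c7EK"  # the label from the example URL above

print(to_ace(lower_default(label)))  # xn--iek-1lab
print(to_ace(lower_turkish(label)))  # xn--ek-3iaa38a
```

The two results are distinct ASCII labels, so as far as the DNS is concerned they are two unrelated names.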
<br><br>Mark<br><br><div class="gmail_quote">On Dec 16, 2007 3:50 PM, Patrik Fältström <<a href="mailto:patrik@frobbit.se">patrik@frobbit.se</a>> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div class="Ih2E3d"><br>On 17 dec 2007, at 00.09, Harald Tveit Alvestrand wrote:<br><br>> --On 16. desember 2007 15:07 -0800 Erik van der Poel<br>> <<a href="mailto:erikv@google.com">erikv@google.com</a>> wrote:
<br>><br>>> To me, this sounds as though one should not be mapped to the other at<br>>> registration time, so I don't understand why people would be<br>>> interested in treating them as the same codepoint at registration
<br>>> time.<br>><br>> We must take care with terms here.... I believe both Torbjørn and<br>> Torbjörn would be interested in a regime where registering<br>> "<a href="http://torbj%C3%B8rn.se" target="_blank">
torbjørn.se</a>" would not be allowed if "<a href="http://torbj%C3%B6rn.se" target="_blank">torbjörn.se</a>" was already<br>> registered by someone else. But that's bundling, not mapping. And<br>> the "default member" of the bundle (the one that actually goes into
<br>> the zonefile) would be different in Norway and Sweden.<br>><br>> (btw, .no doesn't support bundling at this time. I don't believe .se<br>> does either.)<br><br></div>Correct, .SE does not.<br><br>
What everyone wants is that <a href="http://torbj%C3%B8rn.no" target="_blank">torbjørn.no</a> and <a href="http://torbj%C3%B6rn.se" target="_blank">torbjörn.se</a> and possibly<br><a href="http://torbj%C3%B6rn.no" target="_blank">
torbjörn.no</a> and <a href="http://torbj%C3%B8rn.se" target="_blank">torbjørn.se</a> end up at the same resource in as few<br>hoops to jump through as possible. Specifically it would be<br>"interesting" if <a href="http://torbj%C3%B6rn.se" target="_blank">
torbjörn.se</a> and <a href="http://torbj%C3%B8rn.se" target="_blank">torbjørn.se</a> end up having different<br>domain name holders, because I have no idea what the dispute<br>resolution process would say about it.<br><br>
Unicode say ö and ø are different, but that is definitely not what<br>people in Norway or Sweden (or Denmark for that matter) think.<br><br>On the other hand, I think people in Germany might think o and ö is<br>the same (correct me if I am wrong here), something definitely not the
<br>case in Sweden. Here o and ö are different characters. ö is not o with<br>diaeresis.<br><font color="#888888"><br> Patrik<br><br></font></blockquote></div><br><br clear="all"><br>-- <br>Mark