Thanks for the thoughtful explanation, Andrew. I agree with every aspect of it.<div><br></div><div>=wil<br><br><div class="gmail_quote">On Tue, Jun 30, 2009 at 4:03 AM, Andrew Sullivan <span dir="ltr">&lt;<a href="mailto:ajs@shinkuro.com">ajs@shinkuro.com</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">On Mon, Jun 29, 2009 at 07:21:22PM +0200, Marie-France Berny wrote:<br>

&gt; 2009/6/29 Andrew Sullivan &lt;<a href="mailto:ajs@shinkuro.com">ajs@shinkuro.com</a>&gt;<br>

&gt; &gt;<br>

&gt; &gt; Please don&#39;t hijack this thread.<br>

&gt;<br>

&gt;<br>

&gt; ????<br>

<br>

I mean that the thread was talking about one thing, and you have<br>

introduced a different topic.  It appears you&#39;re doing so unwittingly,<br>

but I want not to conflate these two topics.<br>

<br>

&gt; &gt; The mapping of lower-case non-ASCII characters with respect to upper-case<br>

&gt; &gt; apparently-ASCII characters is _not_ the same question as the effects of<br>

&gt; &gt; lower- and upper-case ASCII across the U-label/A-label boundary.<br>

&gt;<br>

&gt;<br>

&gt; I am sorry. I have not the slightest idea of what you are talking about. I<br>

&gt; read an attempt to come to a quick conclusion regarding punycode and where<br>

&gt; to carry mapping. Or am I wrong?<br>

<br>

Wrong, I&#39;m afraid.  The specific question was about ASCII characters<br>

that _remain ASCII_ when using Punycode to transform the label.  So<br>

for instance, in<br>

<br>

    abcdé<br>

<br>

and<br>

<br>

    ABCDé<br>

<br>

the &#39;abcd&#39; and &#39;ABCD&#39; parts are not, strictly speaking, touched by<br>

Punycode.  Under IDNA2003 there&#39;s a simple answer for this, because of<br>

the way it works.  Under IDNA2008, the earliest proposals did no<br>

mapping at all, and we haven&#39;t settled what mapping if any will<br>

happen.  Therefore, there is a question about what to do with these<br>

particular cases.<br>
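<br>
A minimal sketch of the point about ASCII characters that remain ASCII. It assumes Python's built-in "punycode" codec as a stand-in for the Punycode step, and it leaves out the xn-- prefix and any IDNA mapping, so it is an illustration and not a full IDNA implementation.<br>

```python
# Illustration only: the "punycode" codec performs the bare Punycode
# transformation, with no case mapping and no "xn--" prefix added.
labels = ["abcd\u00e9", "ABCD\u00e9"]  # abcdé and ABCDé

encoded = [label.encode("punycode") for label in labels]

for enc in encoded:
    # The ASCII run is emitted first, exactly as typed, followed by "-"
    # and the encoded position of the non-ASCII character.
    print(enc.decode("ascii"))

# Split each result at the last "-" into the ASCII part and the encoded tail.
ascii_parts = [enc.rpartition(b"-")[0] for enc in encoded]
tails = [enc.rpartition(b"-")[2] for enc in encoded]

# The tails are the same; only the case of the untouched ASCII part differs.
print(tails[0] == tails[1])
```

The two encoded labels differ only in the case of the leading ASCII run, which is why something other than Punycode has to decide how that case is treated.<br>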

<br>

&gt; As far as I understand, there is one clarification missing. It is what do<br>

&gt; you define as &quot;global&quot; in here. Are French (and possibly Persian, and<br>

&gt; probably many others...) included?<br>

<br>

Yes, in the sense that there is one giant domain name system under<br>

which everything has to fit, because the whole system is a tree<br>

structure with one root.  (I&#39;ll leave aside for the moment the<br>

possibility of &quot;alternate roots&quot;, since every actual example of that<br>

is in fact just a change of the servers holding the &quot;unique root&quot;, and<br>

not a change to the principle that there is a spot where the namespace<br>

starts.)<br>

<br>

If you mean, &quot;Will it support French, Persian, English, Chinese,<br>

Arabic, and any other language Unicode supports in ways that are<br>

completely natural to the readers and writers of those languages?&quot; the<br>

answer is, &quot;No, and that was never the goal.&quot;  As several people have<br>

said several times, the goal is not to be able to write literature in<br>

the DNS.  The goal is just to internationalize the DNS, subject to the<br>

limitations of the existing DNS.<br>

<br>

One of those limitations turns out to be the (in my opinion<br>

unfortunate) DNS property that it is case-preserving but<br>

case-insensitive.  As a historical fact, ExAmPlE.org, <a href="http://example.org" target="_blank">example.org</a>,<br>

EXAMPLE.org, <a href="http://EXAMPLE.ORG" target="_blank">EXAMPLE.ORG</a>, and example.ORG are all &quot;equivalent&quot; for the<br>

matching rules.  On my interpretation, the DNS server ought to return<br>

an answer to any of those queries with the name as it appears in the<br>

zone file, but some do other things (such as return a pointer to the<br>

question section, which means you get back the form as you asked it).<br>
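<br>
The matching rule for these names can be sketched as follows. This assumes matching is modelled as folding only the letters A to Z; dns_names_match is a hypothetical helper for illustration, not part of any resolver library.<br>

```python
import string

# Assumption: only the 26 ASCII letters are folded; every other
# character, including non-ASCII letters, is compared as-is.
_ASCII_FOLD = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def dns_names_match(a, b):
    """Compare two names, ignoring case for ASCII letters only."""
    return a.translate(_ASCII_FOLD) == b.translate(_ASCII_FOLD)


variants = [
    "ExAmPlE.org",
    "example.org",
    "EXAMPLE.org",
    "EXAMPLE.ORG",
    "example.ORG",
]

# Every spelling matches every other one under ASCII-only folding.
all_match = all(dns_names_match(variants[0], v) for v in variants)
print(all_match)
```

The stored form is kept as written (case-preserving), while the comparison ignores ASCII case (case-insensitive).<br>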

<br>

What you are asking is, I&#39;m sure, a completely natural extension of<br>

that principle in your view: you want école.fra to match ECOLE.FRA.<br>

The problem is that this doesn&#39;t work the same way, because ecole.fra<br>

and ECOLE.FRA also match each other, so now we have an ambiguous<br>

combination.  And that&#39;s only in the case where you actually know the<br>

label is &quot;in French&quot; -- already an extremely complicated problem,<br>

since we don&#39;t have a universally agreed-upon authority as to what<br>

language any given word is in.  (You can&#39;t learn it from the DNS<br>

without either an additional query or special processing on the server<br>

side, both of which are, as far as I understand, antirequisites<br>

for the current work.)<br>
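<br>
The ambiguity can be illustrated with a short sketch. The strip_accents function below is a hypothetical mapping invented for this example (decompose, then drop combining marks); it is not a mapping proposed in any IDNA document.<br>

```python
import unicodedata


def strip_accents(label):
    """Hypothetical mapping: decompose, then drop combining marks."""
    decomposed = unicodedata.normalize("NFD", label)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


accented = "\u00e9cole.fra"  # école.fra
plain = "ecole.fra"

# Plain Unicode upper-casing keeps the accent, so it never yields ECOLE.FRA.
upper_accented = accented.upper()
print(upper_accented == "ECOLE.FRA")

# Forcing a match with ECOLE.FRA needs an accent-dropping step, and that
# same step makes école.fra and ecole.fra indistinguishable.
folded_accented = strip_accents(accented).upper()
folded_plain = strip_accents(plain).upper()
print(folded_accented == folded_plain)
```

Any rule that makes the accented name match ECOLE.FRA therefore also merges it with the unaccented name, which is the ambiguous combination described above.<br>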

<br>

Note that, in some contexts in English, it would be very surprising<br>

that case didn&#39;t matter.  If case were not important in English, then<br>

we would have lost it some time ago (also, a significant body of<br>

poetic work would be affected).  This is not a battle between people<br>

who speak English and whose every natural impulse is accommodated<br>

vs. everyone else.  It&#39;s just a matter of finding the set of<br>

compromises that will fit within the compromises that were already set<br>

when the DNS became successful.<br>

<br>

All of the above said, as far as I know the mapping document is still<br>

open for comment.  If you know some way by which these mappings are<br>

achievable, I&#39;m sure everyone would love to hear them.<br>

<br>

Best regards,<br>

<br>

A<br>

<font color="#888888"><br>

--<br>

Andrew Sullivan<br>

<a href="mailto:ajs@shinkuro.com">ajs@shinkuro.com</a><br>

Shinkuro, Inc.<br>


</font></blockquote></div><br></div>