ASCII- vs non-ASCII mappings (was: Punycode Mixed-case annotation)
dready at gmail.com
Tue Jun 30 12:24:23 CEST 2009
Thanks for the thoughtful explanation Andrew, I agree with every aspect of
On Tue, Jun 30, 2009 at 4:03 AM, Andrew Sullivan <ajs at shinkuro.com> wrote:
> On Mon, Jun 29, 2009 at 07:21:22PM +0200, Marie-France Berny wrote:
> > 2009/6/29 Andrew Sullivan <ajs at shinkuro.com>
> > >
> > > Please don't hijack this thread.
> > ????
> I mean that the thread was talking about one thing, and you have
> introduced a different topic. It appears you're doing so unwittingly,
> but I want not to conflate these two topics.
> > > The mapping of lower-case non-ASCII characters with respect to upper-case
> > > apparently-ASCII characters is _not_ the same question as the effects
> > > of lower- and upper-case ASCII across the U-label/A-label boundary.
> > I am sorry. I have not the slightest idea of what you are talking about.
> > I read an attempt to come to a quick conclusion regarding punycode and
> > to carry mapping. Or am I wrong?
> Wrong, I'm afraid. The specific question was about ASCII characters
> that _remain ASCII_ when using Punycode to transform the label. So
> for instance, in
> the 'abcd' and 'ABCD' parts are not, strictly speaking, touched by
> Punycode. Under IDNA2003 there's a simple answer for this, because of
> the way it works. Under IDNA2008, the earliest proposals did no
> mapping at all, and we haven't settled what mapping if any will
> happen. Therefore, there is a question about what to do with these
> particular cases.
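
To make the point about untouched ASCII concrete, here is a minimal Python sketch. The labels are made up for illustration (they are not the example from the original message), and Python's built-in "punycode" and "idna" codecs are used only as a convenient stand-in; the "idna" codec implements IDNA2003, not IDNA2008.

```python
# Hypothetical labels: an ASCII prefix plus one non-ASCII character.
lower = "abcd\u00e9"   # abcdé
upper = "ABCD\u00e9"   # ABCDé

# Raw Punycode copies the ASCII code points through unchanged,
# so the case of 'abcd' / 'ABCD' survives the transformation.
puny_lower = lower.encode("punycode")
puny_upper = upper.encode("punycode")
print(puny_lower, puny_upper)

# The IDNA2003 codec runs Nameprep first, which case-folds the label,
# so both spellings end up as the same A-label.
idna_lower = lower.encode("idna")
idna_upper = upper.encode("idna")
print(idna_lower == idna_upper)
```

Without a mapping step in front of Punycode, the two spellings produce different byte strings, which is the open question for IDNA2008.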
> > As far as I understand, there is one clarification missing. It is what do
> > you define as "global" in here. Are French (and possibly Persian, and
> > probably many others...) included?
> Yes, in the sense that there is one giant domain name system under
> which everything has to fit, because the whole system is a tree
> structure with one root. (I'll leave aside for the moment the
> possibility of "alternate roots", since every actual example of that
> is in fact just a change of the servers holding the "unique root", and
> not a change to the principle that there is a spot where the namespace
> If you mean, "Will it support French, Persian, English, Chinese,
> Arabic, and any other language Unicode supports in ways that are
> completely natural to the readers and writers of those languages?" the
> answer is, "No, and that was never the goal." As several people have
> said several times, the goal is not to be able to write literature in
> the DNS. The goal is just to internationalize the DNS, subject to the
> limitations of the existing DNS.
> One of those limitations turns out to be the (in my opinion
> unfortunate) DNS property that it is case-preserving but
> case-insensitive. As a historical fact, ExAmPlE.org, example.org,
> EXAMPLE.org, EXAMPLE.ORG, and example.ORG are all "equivalent" for the
> matching rules. On my interpretation, the DNS server ought to return
> an answer to any of those queries with the name as it appears in the
> zone file, but some do other things (such as return a pointer to the
> question section, which means you get back the form as you asked it).
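
The "case-preserving but case-insensitive" behaviour can be sketched roughly as follows. This is an illustration only, not how any particular DNS server is implemented; the helper names dns_fold and dns_equal are invented here.

```python
def dns_fold(name: str) -> str:
    """Fold only ASCII A-Z to a-z; every other character is left alone."""
    return "".join(
        chr(ord(c) + 32) if "A" <= c <= "Z" else c
        for c in name
    )


def dns_equal(a: str, b: str) -> bool:
    """Compare two names the way ASCII-only DNS matching would."""
    return dns_fold(a) == dns_fold(b)


# The five spellings from the message above.
spellings = ["ExAmPlE.org", "example.org", "EXAMPLE.org",
             "EXAMPLE.ORG", "example.ORG"]

# All spellings match each other ...
all_match = all(dns_equal(spellings[0], s) for s in spellings)
print(all_match)

# ... but the comparison never rewrites the stored or queried name.
stored = "ExAmPlE.org"
dns_equal(stored, "EXAMPLE.ORG")
print(stored)
```

The folding is applied only for comparison, which is why a server is free to answer with either the zone-file spelling or the spelling from the query.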
> What you are asking is, I'm sure, a completely natural extension of
> that principle in your view: you want école.fra to match ECOLE.FRA.
> The problem is that this doesn't work the same way, because ecole.fra
> and ECOLE.FRA also match each other, so now we have an ambiguous
> combination. And that's only in the case where you actually know the
> label is "in French" -- already an extremely complicated problem,
> since we don't have a universally agreed-upon authority as to what
> language any given word is in. (You can't learn it from the DNS
> without either an additional query or special processing on the server
> side, both of which are, as far as I understand, antirequisites
> for the current work.)
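
A small Python sketch of the ambiguity, under the assumption that "matching" would mean upper-casing and dropping accents. The strip_accents helper is hypothetical and is not part of any IDNA proposal.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Decompose to NFD and drop combining marks (hypothetical mapping)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


accented = "\u00e9cole.fra"   # école.fra
plain = "ecole.fra"

# Ordinary Unicode case mapping keeps the two names distinct.
print(accented.upper(), plain.upper())

# Only an accent-dropping rule makes école.fra reach ECOLE.FRA ...
key_accented = strip_accents(accented).upper()
key_plain = strip_accents(plain).upper()
print(key_accented == key_plain)

# ... and at that point two different registrable names share one key,
# so ECOLE.FRA no longer identifies a single name.
print(accented != plain)
```

Two names that are distinct in the zone collapse onto one key, so a query for ECOLE.FRA has no single right answer.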
> Note that, in some contexts in English, it would be very surprising
> that case didn't matter. If case were not important in English, then
> we would have lost it some time ago (also, a significant body of
> poetic work would be affected). This is not a battle between people
> who speak English and whose every natural impulse is accommodated
> vs. everyone else. It's just a matter of finding the set of
> compromises that will fit within the compromises that were already set
> when the DNS became successful.
> All of the above said, as far as I know the mapping document is still
> open for comment. If you know some way by which these mappings are
> achievable, I'm sure everyone would love to hear them.
> Best regards,
> Andrew Sullivan
> ajs at shinkuro.com
> Shinkuro, Inc.
> Idna-update mailing list
> Idna-update at alvestrand.no