Tonus (was: Re: Casefolding Sigma (was: Re: IDNAbis Preprocessing Draft))

Patrik Fältström patrik at frobbit.se
Thu Jan 31 13:43:29 CET 2008


On 31 jan 2008, at 13.32, Vint Cerf wrote:

> Patrik, in the LDH world, the upper and lower case forms are kept in  
> the DNS database and are casefolded at matching time. In the IDN  
> world, in part because of the complexity of the normalization  
> process, is it correct that the design does a lot of the normalizing  
> at registration, storing the normalized form in the database rather  
> than the unnormalized form?

It is a deployment issue. In the IDNA2003 version of IDNA, one could
of course have forced the normalization at any of three different
"locations" in the application/protocol:

(a) In the application before the string touches the DNS at all  
(including before actual storage of the domain name in the DNS server  
at time of registration)

(b) In the server before the actual search is done in the database (so
that the database still stores only normalized strings)

(c) As part of the matching algorithm, so that the different
non-normalized strings are stored in the database
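
To make the three placements concrete, here is a minimal Python sketch.
It is an illustration under stated assumptions, not how any real DNS
software is written: the function names, the dictionary standing in for
the DNS database, and the use of case folding plus NFKC as a stand-in
for the IDNA2003 mapping are all hypothetical.

```python
import unicodedata

def normalize(label: str) -> str:
    # Hypothetical stand-in for the IDNA2003 mapping (nameprep):
    # case folding followed by NFKC is only an approximation of it.
    return unicodedata.normalize("NFKC", label.casefold())

def server_exact(db: dict, name: str):
    # An unmodified server: exact match on whatever string it receives.
    return db.get(name)

def resolve_a(db: dict, user_input: str):
    # (a) The application normalizes before the string touches the DNS.
    return server_exact(db, normalize(user_input))

def server_b(db: dict, name: str):
    # (b) A changed server normalizes the query, then searches a
    # database that holds only normalized strings.
    return db.get(normalize(name))

def server_c(db: dict, name: str):
    # (c) A changed matching algorithm: the database holds
    # non-normalized strings and each comparison normalizes both sides.
    wanted = normalize(name)
    for stored, value in db.items():
        if normalize(stored) == wanted:
            return value
    return None

# Toy data. "B\u00fccher" is "Bucher" with u-umlaut; 192.0.2.1 is a
# documentation address, not a real host.
registered = "B\u00fccher"
db_normalized = {normalize(registered): "192.0.2.1"}  # for (a) and (b)
db_raw = {registered: "192.0.2.1"}                    # for (c)
query = "B\u00dcCHER"
```

In this sketch only resolve_a() works against server_exact(), the
server that was never upgraded. server_b() and server_c() both stand
for server software that had to be modified.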

For either (b) or (c) to work, the actual software that is used for
DNS around the world, at ISPs, at enterprises, etc., has to change. And
not only the software used by, for example, a DNS hosting company that
the owner of a domain name has chosen because of its new features, but
also the software in all caching servers that act as intermediaries
between the one querying for data and the one serving the data.

Because of historical empirical data on how often software is updated
on the Internet (which includes how long it takes to get features
implemented), a decision was made that it was VERY IMPORTANT that the
end user who chooses to start using an internationalized domain name
MUST be able to do so without waiting for his domain hosting company
to support it, and without waiting for the ISPs that his customers
use to support IDN.

An example of slow deployment is this recent "incident" (sorry again
for that) with me starting to use a standard from 1995 that confuses
the email clients of Michael Everson (and probably many more on this
list). That was 12 years ago.

Another example, a positive one, of slow deployment of new features in
a similar way is the MIME standard in email, where "client side can
upgrade before servers" was a good path. The alternative would be to
either have a flag day (yeah, right) or force people to wait until
their ISPs support the new standards. And the latter is exactly what
the Internet, historically, is NOT.

Because of all of this, we needed an IDN standard where the matching
algorithm in DNS, which is byte by byte, is not changed. That forces
the application to do the work, and that work is a destructive mapping
of the different code points that are supposed to match.
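
As a rough illustration of what "destructive mapping" means in
practice, the following Python sketch maps a label in the application
and then compares only the resulting bytes. The names destructive_map,
to_wire and dns_match are hypothetical, and case folding plus NFKC is
again only an approximation of the real IDNA2003 mapping. The
"punycode" codec is the one shipped with Python's standard library.

```python
import unicodedata

ACE_PREFIX = b"xn--"

def destructive_map(label: str) -> str:
    # Approximation of the mapping step (assumption, not nameprep
    # itself). It is destructive: the original spelling cannot be
    # recovered from the result.
    return unicodedata.normalize("NFKC", label.casefold())

def to_wire(label: str) -> bytes:
    # What the application hands to the DNS: plain ASCII labels pass
    # through, everything else becomes an ASCII-compatible encoding.
    mapped = destructive_map(label)
    if mapped.isascii():
        return mapped.encode("ascii")
    return ACE_PREFIX + mapped.encode("punycode")

def dns_match(a: bytes, b: bytes) -> bool:
    # Stand-in for the unchanged DNS comparison, modelled here as plain
    # byte equality as described above (real DNS additionally ignores
    # ASCII case).
    return a == b

# Three spellings of one Greek label: all capitals, lower case with
# final sigma (U+03C2), lower case with medial sigma (U+03C3).
upper = "\u039f\u0394\u039f\u03a3"
final = "\u03bf\u03b4\u03bf\u03c2"
medial = "\u03bf\u03b4\u03bf\u03c3"
same = dns_match(to_wire(upper), to_wire(final)) and \
       dns_match(to_wire(final), to_wire(medial))
```

Here `same` ends up true because case folding sends both the capital
sigma and the final sigma to the medial sigma before anything reaches
the DNS. That is exactly the information loss the word "destructive"
refers to: once mapped, the three spellings are one name.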

      Patrik


