IDNA 2008 Question Re: "Confusable" Characters in Domain Names

JFC Morfin jefsey at jefsey.com
Sat Nov 6 00:17:37 CET 2010


At 15:17 05/11/2010, john daw wrote:
>JFC Morfin wrote:
>
>2.3. The IDNA concept of Internationalizing Domain Names in
>Applications (IDNA), as shown by the example above, is an
>architectural error on the user side. However, this error is in
>operation, so we needed to continue supporting it, to decide and
>document an alternative, to protect the Internet from it, and to
>transition from it.

Dear John,

Andrew Sullivan answered you clearly, and so did John Klensin. There
is an Internet proposition for which the IETF is responsible, and
which is protected by DNS zone coherence, as Shawn Steele reminded
us. Yet "€.net" resolves, and .net is not openroots.

Now, there also are digital ecosystem users' needs and rights. The 
Internet is part of the world digital ecosystem network.

There is therefore a border. We usually place it between the
Internet, which carries content by using typographic signs, and the
"intersem", which is to carry thoughts (by metareference). The
difficulty lies in defining that border precisely when much depends
on a third-party tool (Unicode) proposed by a consortium of the
largest commercial Internet contributors, while the intersem is just
an emerging area.

Unicode is a computing industry standard for the consistent encoding,
representation and handling of text expressed in most of the world's
writing systems. This is why Unicode supports text syntax (from
Ancient Greek "arrangement", from syn, "together", and táxis, "an
ordering"), that is, the principles and rules for constructing
sentences in natural languages. Globalization means
internationalizing the media (Unicode support), localizing the ends
(Unicode CLDR projects), and filtering the content (Unicode-initiated
langtags) in order to support texts in every script and language.
Unicode therefore seemed to be the proper tool to "internationalize"
the Internet as part of a globalized, text-oriented support.

IDNA2003 was one of the experiments that showed that the world
digital ecosystem (WDE) needs more than globalization: the WDE needs
"multilinguisation", which you can define as the way the global
system will survive having every language (including idiolects)
supported equally.

The IAB identified that IDNA2003 was inadequate for several practical
reasons. The underlying reason is that domain names are not text, but
terms. The IETF response was to free IDNA from Unicode: IDNA2003 is
built directly over Unicode; IDNA2008 is not. It is based upon RFC
5892. This fixed the issues on the Internet side (Internet endotem).
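
As a rough illustration of how RFC 5892 treats symbols differently
from letters, the sketch below applies only its LetterDigits test
(General_Category in Ll, Lu, Lo, Nd, Lm, Mn, Mc) using Python's
unicodedata module. It is not the full derived-property algorithm
(exceptions, unstable characters and contextual rules are skipped),
and the helper name is made up for this example.

```python
import unicodedata

# General categories listed in RFC 5892, section 2.1 (LetterDigits).
LETTER_DIGITS = {"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"}

def in_letter_digits(ch):
    """Hypothetical helper: True if ch passes the LetterDigits
    category test only (not the complete RFC 5892 algorithm)."""
    return unicodedata.category(ch) in LETTER_DIGITS

# Print each character, its general category, and the test result.
for ch in "é€$£a":
    print(repr(ch), unicodedata.category(ch), in_letter_digits(ch))
```

Currency signs fall in category Sc, so they fail this test, while
ordinary letters pass it.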

Now, why is a domain name a term and not a text? Because the meaning
of a domain name does not depend on the arrangement of its labels but
on the arrangement of the characters within its labels, on their
typographical syntax (as with a TM). To transport that through the
IDNA punycode process, there is a need to support semantics-related
metadesign information. The Unicode system does not support it.

Actually, you quote the support of "€" or "£". Andrew's objection
about "$" does not stand. He says "$.com" was never supported, and
that this is enough. Well, "€.net" is, and that is also enough. The
true point is that there are many more signs, and at some time
Unicode or someone else (why not you and me) will start registering
commercial logo signs (this should be defused before it mars the
Internet economy).
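
For readers who want to see what "€.net" looks like on the wire, here
is a minimal sketch using Python's built-in "idna" codec, which
follows the IDNA2003 rules (RFC 3490). It only converts the name to
its ASCII-compatible form and back; it does not query the DNS, and it
says nothing about how an IDNA2008 implementation would treat the
same label.

```python
# Python's built-in "idna" codec implements IDNA2003 (RFC 3490).
name = "€.net"

# Convert to the ASCII-compatible encoding (Punycode with "xn--" prefix).
ace = name.encode("idna")
print(ace)

# Round trip back to the Unicode form.
print(ace.decode("idna") == name)
```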

These are, however, "special" cases. We have a much more common case
which could also have been a casus belli: the majuscules of French
and of Latin languages. "etat" and "Etat" by no means have the same
meaning.
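
The loss of majuscules can be observed with Python's built-in "idna"
codec (IDNA2003), whose Nameprep step case-folds non-ASCII labels
before the Punycode conversion. The accented spelling and the
".example" suffix below are assumptions chosen for illustration, not
names taken from this thread.

```python
# IDNA2003 Nameprep (as implemented by Python's "idna" codec)
# case-folds non-ASCII labels before Punycode conversion.
upper = "État.example".encode("idna")
lower = "état.example".encode("idna")

# Both spellings end up as the same ASCII-compatible name,
# so the majuscule cannot be recovered from the encoded form.
print(upper == lower)
```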

This is why, once IDNA2008 was approved, as a promoter of the
"Projet.FRA" (a French language namespace), I reported to the IESG
that I could not be satisfied by IDNA and that I would support .FRA
by using my own interplus solution, which takes advantage of the
ML-DNS. I also urged the IESG to adopt IDNA2008 now, without waiting
for a complete solution on the user side. The reason is that
"Orthotypographic rules vary broadly from language to language, from
country to country, and even from publisher to publisher (in our case
from TLD to TLD). As such, they are more often described as
"conventions"". Conventions need a common initial convention, carved
in stone, before they can be deployed. The IESG approved IDNA2008 and
now we are to digest it.

IMHO RFC 5892 is the current IDNA Rosetta stone, but we will need to
go a layer deeper and get a NUCS (network universal character set)
defined together with other network-related technologies and needs.
And then we will use the resulting CLASS NUCS domain names to add
meta-information that will go through punycode.

You will then have "$.com" supported by IDNA, transparently to the
DNS and to IDNA2008. The existing €.net owner will therefore survive
the transition from IDNA2003 to IDNA2008. But it is likely that its
market value will plunge, as ML-DNS will legitimately support
millions of them and millions of TLDs :-)

Cheers.
