Draft on IDN Tables in XML

J-F C. Morfin jfc at morfin.org
Wed Mar 7 14:39:13 CET 2012

At 08:23 07/03/2012, James Mitchell wrote:
>Mixing these concepts is potentially dangerous as one registry may treat
>variants as bundles and another as first-class domain names. I believe
>there is great value in the IDN table describing what characters are
>equivalent. This will allow consumers of the table, when given two names,
>to determine whether or not they are actually part of the same bundle or
>potentially separate names with potentially separate registrants. Anything
>else that represents rulesets for valid names or activate-able variants
>should be avoided.


You are right, but this is the basic problem with using Unicode.
Unicode documents character sets as typesetters handle them
(typography), not visual character sets as humans read them in
context (orthotypography). There are two solutions to this:

1) Get rid of Unicode. Unlikely.
2) Use a Man/Unicode conversion algorithm, taking the living Man
(who, in addition, may evolve) as the reference instead of the
machine's history (which is also updated).

IDN tables are a limited, temporary patch in place of such an
algorithm. We come back to the RFC 4647 issue: "what is a language?"
As long as this 20th-century cross-disciplinary question is not
addressed, we will never know for sure that a Cyrillic o in an .au
domain name and in an Australian gTLD are treated the same.
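
To make the equivalence check James describes a little more concrete,
here is a minimal Python sketch of how a consumer of a table might
decide whether two labels fall into the same bundle. Everything in it
is an assumption for illustration only: the two-entry table and the
names VARIANTS, canonical_label and same_bundle are hypothetical and
come neither from the draft nor from any registry. Which characters
count as equivalent is a matter of registry policy.

```python
# Hypothetical variant table: each key is a code point, each value is the
# code point chosen as its representative. A real table would be defined
# by the registry and would be much larger.
VARIANTS = {
    "\u043e": "o",  # CYRILLIC SMALL LETTER O -> LATIN SMALL LETTER O
    "\u0430": "a",  # CYRILLIC SMALL LETTER A -> LATIN SMALL LETTER A
}


def canonical_label(label, variants=VARIANTS):
    """Replace every character by its representative, if it has one."""
    return "".join(variants.get(ch, ch) for ch in label)


def same_bundle(label_a, label_b, variants=VARIANTS):
    """Two labels are in the same bundle if their canonical forms match."""
    return canonical_label(label_a, variants) == canonical_label(label_b, variants)


# "foo" written with Latin o versus with two Cyrillic o characters.
print(same_bundle("foo", "f\u043e\u043e"))  # True
print(same_bundle("foo", "fob"))            # False
```

Note that the answer depends entirely on the table passed in: the same
two labels checked against two registries' tables may give different
results, which is the uncertainty described above.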

