Draft on IDN Tables in XML
J-F C. Morfin
jfc at morfin.org
Wed Mar 7 14:39:13 CET 2012
At 08:23 07/03/2012, James Mitchell wrote:
>Mixing these concepts is potentially dangerous as one registry may treat
>variants as bundles and another as first-class domain names. I believe
>there is great value in the IDN table describing what characters are
>equivalent. This will allow consumers of the table, when given two names,
>to determine whether or not they are actually part of the same bundle or
>potentially separate names with potentially separate registrants. Anything
>else that represents rulesets for valid names or activate-able variants
>should be avoided.
You are right. But this is the basic problem with using Unicode.
Unicode documents the character sets of typesetters (typography), not
the visual character sets of human contexts (orthotypography). There are
two solutions to this.
1) to get rid of Unicode. Unlikely.
2) to use a Man/Unicode conversion algorithm and to use the living
Man (who in addition may evolve) as a reference instead of the
machine's history (which is also updated).
IDN tables are a limited, temporary patch for such an algorithm. We
come back to the RFC 4647 issues: "what is a language". As long as
this 20th-century cross-discipline question is not addressed, we will
never know for sure that a Cyrillic o in an .au domain name and in an
Australian gTLD are treated the same.
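
To make the equivalence-only use of a table concrete, here is a rough
sketch in Python of what a consumer could do when given two names. It is
only an illustration under assumptions: the EQUIVALENTS mapping and the
names bundle_key and same_bundle are hypothetical, and the two entries
are examples, not taken from any registry's actual IDN table.

```python
# Hypothetical equivalence table: each code point maps to a chosen
# representative. The entries are illustrative only and do not come
# from any real registry's IDN table.
EQUIVALENTS = {
    "\u043e": "o",  # CYRILLIC SMALL LETTER O -> LATIN SMALL LETTER O
    "\u0430": "a",  # CYRILLIC SMALL LETTER A -> LATIN SMALL LETTER A
}


def bundle_key(label: str) -> str:
    """Lowercase the label and replace each character by its representative."""
    return "".join(EQUIVALENTS.get(ch, ch) for ch in label.lower())


def same_bundle(a: str, b: str) -> bool:
    """Two labels fall in the same bundle when their keys are identical."""
    return bundle_key(a) == bundle_key(b)


# Usage: the second label spells "foo" with a Cyrillic o in the middle.
print(same_bundle("foo", "f\u043eo"))
print(same_bundle("foo", "fob"))
```

The sketch says nothing about which names are valid or which variants
may be activated. It only answers whether two names belong together
under one given table, and two registries with different tables may
well give different answers for the same pair.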