Unicode 7.0.0, (combining) Hamza Above, and normalization for comparison
jefsey at jefsey.com
Wed Aug 6 14:01:29 CEST 2014
At 07:03 06/08/2014, Patrik Fältström wrote:
>To be honest, I do not think it matters where it is discussed.
I suggest we keep discussing it here. The reason is the ICANN
response to the plaintiffs in the .ir (etc.) case: "the DNS provides a
human interface to the internet protocol addressing system". This
seems to be a good definition to sustain in common, as it is
technically true, easy to understand, and draws a clear distinction
between the human and the non-human issues.
The most complex issue, the human confusability of ISO 10646
code points, calls for a visual-to-binary anti-phishing algorithm.
Such an algorithm should be added to the IDNA tables, allowing
registries to accept or reject xn-- registrations based upon the
domain names already registered.
To start the debate on this issue, I would suggest one possibility for
such an algorithm: a mathematical proximity/confusability
discrimination between 32x32 character rasterizations (i.e. 1024-bit
structured strings). I note that this also implies a common reference
font. I do not think this is a problem, as it is on the human
side and conflicts will be subject to the courts: what counts is the
font the local law will consider. It is up to each ccTLD to provide that
information and to have it added to ISO 3166, which already includes
the administrative languages; these should anyway be renamed
standardization languages, coupled with the accepted script(s).
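To make the idea above concrete, here is a minimal sketch, in Python, of the proposed comparison step: each character is pre-rendered (in some common reference font) to a 32x32 bitmap, packed into a 1024-bit integer, and proximity is measured as the Hamming distance between two such integers. The bitmaps, the threshold value, and the helper names below are all illustrative assumptions, not part of the proposal.

```python
def pack_bitmap(rows):
    """Pack a 32x32 bitmap (32 strings of '0'/'1') into a 1024-bit integer."""
    assert len(rows) == 32 and all(len(r) == 32 for r in rows)
    return int("".join(rows), 2)

def hamming(a, b):
    """Number of differing pixels between two 1024-bit rasterizations."""
    return bin(a ^ b).count("1")

def confusable(a, b, threshold=64):
    """Deem two rasterizations confusable when fewer than `threshold`
    of the 1024 pixels differ (the threshold is an arbitrary assumption)."""
    return hamming(a, b) < threshold

# Toy rasterizations: two glyphs identical except for one pixel.
glyph_a = pack_bitmap(["1" * 32] + ["0" * 32] * 31)
glyph_b = pack_bitmap(["1" * 31 + "0"] + ["0" * 32] * 31)
print(hamming(glyph_a, glyph_b))      # 1
print(confusable(glyph_a, glyph_b))   # True
```

A registry could then reject a candidate label whenever any of its characters is confusable, in this sense, with a character of an already registered label; tuning the threshold against real fonts is of course the open question.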
1. What is the URL of the complete Unicode code point table (value/description)?
2. I found rasterizations made for some scripts, but not for all.