Mapping and Variants

Mark Davis mark at macchiato.com
Mon Mar 9 23:43:15 CET 2009


I think ultimately it is going to be the client software that has the
greatest role in warning people about suspicious usage. ICANN can only
affect some of the top level registries, not the many levels of
subregistries. And the protocol can't forbid confusable characters without
also forbidding completely reasonable characters. The rules and code
required to distinguish them are far too complicated, and far too fluid, to
try to nail into a protocol. It'd really require some kind of dynamic
updating process, like what browsers and virus-detection programs have.

Take the following examples, each under a second-level registry.

1. Uses é, an ordinary European character:
http://café.blogspot.com/ <http://xn--caf-dma.blogspot.com/>

2. Uses a variant of b used in Hausa (also an IPA character). Confusable
with b in some fonts/sizes.
http://ɓlog.blogspot.com/ <http://xn--log-nsb.blogspot.com/>

3. Uses a Cyrillic o (U+043E) in place of the Latin o:
http://blоg.blogspot.com/ <http://xn--blg-ted.blogspot.com/>
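
For anyone who wants to check the xn-- forms above, here is a rough sketch
using the raw Punycode codec from Python's standard library. The function name
to_ace is just for illustration, and the code points are written as escapes
(U+00E9, U+0253, U+043E, as decoded from the xn-- forms) so the look-alikes
stay unambiguous. It deliberately skips the IDNA mapping and validity steps,
so it is not a full ToASCII.

```python
# Sketch only: raw Punycode (RFC 3492) per label, with no IDNA mapping or checks.

def to_ace(label: str) -> str:
    """Return the xn-- form of one label; pure-ASCII labels pass through."""
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")

examples = [
    "caf\u00e9",   # e with acute, U+00E9
    "\u0253log",   # b with hook, U+0253
    "bl\u043eg",   # Cyrillic small o, U+043E
]

for label in examples:
    # Show the code points next to the encoded label.
    points = " ".join(f"U+{ord(ch):04X}" for ch in label)
    print(f"{points}  ->  {to_ace(label)}.blogspot.com")
```

The three results match the xn-- hosts listed above.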

In Safari, for example, the first two are displayed normally (Unicode),
while the last is displayed as Punycode, indicating that something is fishy
(mixed scripts). Number 2 is difficult, because it could be a perfectly good
word in Hausa.
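
To make the mixed-script idea concrete, here is a minimal sketch of that kind
of display policy. It is not Safari's actual rule, which I am not reproducing
here. Python's standard library has no Script property, so as a stated
assumption the script is approximated by the first word of each letter's
Unicode character name (LATIN, CYRILLIC, GREEK, ...). The names scripts_in and
display_form are hypothetical.

```python
import unicodedata

def scripts_in(label: str) -> set:
    """Approximate the scripts used by the letters of a label.

    Assumption: the first word of the Unicode character name stands in
    for the real Script property, which the standard library lacks.
    """
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in label
        if ch.isalpha()
    }

def display_form(label: str) -> str:
    """Show single-script labels in Unicode, mixed-script ones as Punycode."""
    if len(scripts_in(label)) > 1:
        return "xn--" + label.encode("punycode").decode("ascii")
    return label

for label in ("caf\u00e9", "\u0253log", "bl\u043eg"):
    print(sorted(scripts_in(label)), display_form(label))
```

Under this check number 2 passes untouched, since the hooked b is a Latin
letter. That is exactly why it is the hard case: a script test alone cannot
separate a Hausa word from a spoof.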

Mark