IDNA and U+08A1 and related cases (was: Re: Barry Leiba's Discuss on draft-ietf-json-i-json-05: (with DISCUSS and COMMENT))

Tue Jan 27 19:59:27 CET 2015

At 18:57 27/01/2015, Shawn Steele wrote:
>Has anyone looked at confusability of Chinese characters?  My 
>expectation would be that many clearly different things would be 
>easily mistaken because of a slight difference in a stroke and the 
>context and font size.  Eg: If I'm expecting "Microsoft" in an email 
>or something, then rnicrosoft.com might trick me (not in this font 
>apparently).  I would expect that Chinese has lots of characters 
>that are confusable.  Worse, I'd expect that some are probably only 
>confusable in certain contexts.

The only solution to this difficulty is a non-confusability algorithm 
based upon the UNISIGN base that the Catenet Cooperative Corporation (CCC) 
project proposes to work on. If Microsoft wants to cooperate in 
financing it, it is a Libre project. There have been 16x16 raster tables 
published by the Chinese Government. Then, you are right: pragmatics 
(i.e. semiotics in context) considerations should apply in the 
algorithm itself. One context could be domain names.
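
To make the raster idea concrete, here is a minimal sketch of comparing two 16x16 glyph rasters by counting differing pixels. It is only an illustration under stated assumptions, not the UNISIGN algorithm: the two glyphs are made up (they are not taken from any published table), the function names are hypothetical, the threshold is arbitrary, and context (pragmatics) is ignored entirely.

```python
SIZE = 16  # rasters are assumed to be 16x16, as in the tables mentioned above


def parse_raster(rows):
    """Turn 16 strings of '#' (ink) and '.' (blank) into 16 row integers."""
    if len(rows) != SIZE or any(len(r) != SIZE for r in rows):
        raise ValueError("expected a 16x16 raster")
    return [int("".join("1" if c == "#" else "0" for c in r), 2) for r in rows]


def pixel_distance(a, b):
    """Hamming distance: number of pixels that differ between two rasters."""
    return sum(bin(ra ^ rb).count("1") for ra, rb in zip(a, b))


def maybe_confusable(a, b, threshold=4):
    """Flag a pair whose rasters differ by at most `threshold` pixels.

    The threshold is an arbitrary assumption; a real algorithm would
    also have to weigh font, size, and context.
    """
    return pixel_distance(a, b) <= threshold


# Two invented glyphs: an empty box, and the same box with one short stroke.
BOX_ROWS = (
    ["." * 16] * 2
    + ["..############.."]
    + ["..#..........#.."] * 10
    + ["..############.."]
    + ["." * 16] * 2
)
BOX_STROKE_ROWS = list(BOX_ROWS)
BOX_STROKE_ROWS[7] = "..#....##....#.."  # the only difference: a 2-pixel stroke

BOX = parse_raster(BOX_ROWS)
BOX_STROKE = parse_raster(BOX_STROKE_ROWS)
FILLED = parse_raster(["#" * 16] * 16)  # a solid block, clearly different

if __name__ == "__main__":
    print(pixel_distance(BOX, BOX_STROKE), maybe_confusable(BOX, BOX_STROKE))
    print(pixel_distance(BOX, FILLED), maybe_confusable(BOX, FILLED))
```

A raw pixel count like this treats every pixel equally, which is precisely why a slight difference in a stroke goes unnoticed; it says nothing about which differences a reader in a given context would actually perceive.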

jfc