IDNA and U+08A1 and related cases (was: Re: Barry Leiba's Discuss on draft-ietf-json-i-json-05: (with DISCUSS and COMMENT))
Shawn.Steele at microsoft.com
Tue Jan 27 20:28:06 CET 2015
I hesitate to reply, but that seems like overkill. At some point I think it becomes more valuable to provide other anti-phishing/bad guy technologies than to keep focusing on confusables. Focusing on the similarity of one string to another eventually reaches the point of diminishing returns, since people are apt to follow egregiously evil links that are clearly unconfusable (at that level at least) anyway.
From: Jefsey [mailto:jefsey at jefsey.com]
Sent: Tuesday, January 27, 2015 10:59 AM
To: Shawn Steele; Mark Davis ☕️; Asmus Freytag
Cc: Barry Leiba; Eliot Lear; Pete Resnick; linuxwolf at outer-planes.net; Andrew Sullivan; IDNA update work; The IESG; draft-ietf-json-i-json.all at tools.ietf.org; John C Klensin; Tim Bray; Martin J. Dürst; Nico Williams; json-chairs at tools.ietf.org
Subject: RE: IDNA and U+08A1 and related cases (was: Re: Barry Leiba's Discuss on draft-ietf-json-i-json-05: (with DISCUSS and COMMENT))
At 18:57 27/01/2015, Shawn Steele wrote:
>Has anyone looked at confusability of Chinese characters? My
>expectation would be that many clearly different things would be easily
>mistaken because of a slight difference in a stroke and the context and
>font size. Eg: If I'm expecting "Microsoft" in an email or something,
>then rnicrosoft.com might trick me (not in this font apparently). I
>would expect that Chinese has lots of characters that are confusable.
>Worse, I'd expect that some are probably only confusable in certain
The only solution to this difficulty is a non-confusability algorithm based upon the UNISIGN base that the Catenet Cooperative Corporation (CCC) project proposes to work on. If Microsoft wants to contribute to its financing, it is a Libre project. There have been 16x16 raster tables published by the Chinese Government. Then you are right: pragmatics (i.e. semiotics in context) considerations should apply in the algorithm itself. One context could be domain names.