IDNA and U+08A1 and related cases (was: Re: Barry Leiba's Discuss on draft-ietf-json-i-json-05: (with DISCUSS and COMMENT))

Wed Jan 28 18:35:53 CET 2015

cc trimmed

At 20:28 27/01/2015, Shawn Steele wrote:
>I hesitate to reply, but that seems 
>like overkill.  At some point I think we get to the point where it's 
>more valuable to provide other anti-phishing/bad guy technologies 
>than focusing on confusables.  Focusing on the similarity of a 
>string to another eventually reaches the point of diminishing 
>returns since people are apt to follow egregiously evil links that 
>are clearly unconfusable (at that level at least) anyway.

I understand. We are just initiating the CCC corporate process and 
considering alliances on our priorities. We consider several 
referential projects (names, numbers, parameters and documentation) 
as such. However the main one we have is UNISIGN for our own use:
* we need non-confusable database entries to identify our 
shareholders across every language, culture and orthotypography.
* as a Libre project, we do not have money to spend on 
verifications due to UNICODE.

UNICODE is a typographic system: it must support every character in 
the world, in the way people will want to use them. Our need 
is to support people from the whole world and make sure all of them 
can find their name in the database and get non-confusable domain names 
in the "FL" CLASS. These are two different needs. We have at this 
stage no final design in mind. We are certainly not dogmatic but 
operationalist: there is an effective situation we have to best 
address.  Our first need is not for IDNs but for IIDs - and the way 
we consider it for NDNs (Named Data Networks). I only mention it 
because Vint raised the topic and I think that what we could do with 
UNISIGN (when? how? I do not know) may be useful for IDNs and in line 
with ML-DNS.

Best
jfc

>-Shawn
>
>-----Original Message-----
>From: Jefsey [mailto:jefsey at jefsey.com]
>Sent: Tuesday, January 27, 2015 10:59 AM
>To: Shawn Steele; Mark Davis ☕️; Asmus Freytag
>Cc: Barry Leiba; Eliot Lear; Pete Resnick; 
>linuxwolf at outer-planes.net; Andrew Sullivan; IDNA update work; The 
>IESG; draft-ietf-json-i-json.all at tools.ietf.org; John C Klensin; Tim 
>Bray; Martin J. Dürst; Nico Williams; json-chairs at tools.ietf.org
>Subject: RE: IDNA and U+08A1 and related cases (was: Re: Barry 
>Leiba's Discuss on draft-ietf-json-i-json-05: (with DISCUSS and COMMENT))
>
>At 18:57 27/01/2015, Shawn Steele wrote:
> >Has anyone looked at confusability of Chinese characters?  My
> >expectation would be that many clearly different things would be easily
> >mistaken because of a slight difference in a stroke and the context and
> >font size.  Eg: If I'm expecting "Microsoft" in an email or something,
> >then rnicrosoft.com might trick me (not in this font apparently).  I
> >would expect that Chinese has lots of characters that are confusable.
> >[...] some are probably only confusable in certain
> >contexts.
>
>The only solution to this difficulty is a non-confusability 
>algorithm based upon the UNISIGN base that the Catenet Cooperative 
>Corporation (CCC) project proposes to work on. If Microsoft wants to 
>cooperate in its financing, it is a Libre project. There have been 
>16x16 raster tables published by the Chinese Government. And you 
>are right: pragmatics (i.e. semiotics in context) considerations 
>should apply in the algorithm itself. One context could be domain names.
>
>jfc