Visually confusable characters

John C Klensin klensin at jck.com
Thu Aug 7 20:41:15 CEST 2014

--On Thursday, August 07, 2014 10:59 -0700 Eric Brunner-Williams
<ebw at abenaki.wabanaki.net> wrote:

> On 8/7/14 10:37 AM, John C Klensin wrote:
>> ...
>> (3) The question of what one does once one identifies a pair
>> (or set) of characters as "visually confusable" is quite
>> separate from how those characters are identified (something
>> that both the JET effort and ICANN got right (by the time of
>> the VIP activity if not earlier)).  There are lots of choices
>> including blocking all of them (which Mark's note seems to
>> suggest), letting one of the group be registered and then
>> blocking the others, making sure that all of them are
>> allocated to or controlled by the same party, trying to link
>> them together at either the DNS or application layers, and
>> other, often more complex, strategies.  I have trouble
>> imagining any basis on which the IETF or an IETF-derived WG
>> list would be the right place for deciding on those
>> strategies... even if we might help identify some of the
>> possibilities.
> 
> John, colleagues,
> 
> I simply wish to point out that equivalence classes include
> characters which are "visually confusable", and characters
> which have equivalent meaning, e.g., the characters contained
> in Deng, et alia's original intermediate table draft.

Eric,

I think I know what you are referring to (and have no time or
desire to go back and reread at present), but specific
references to something retrievable might help others if you can
find and supply them without too much trouble.

Experience with the evolution of the JET work into ICANN and
other discussions of "variants" has caused me to get anxious
when someone starts talking about the "meaning" of characters,
but that is mostly a separate problem.  I also think I know what
an equivalence class is, but I have no idea how one could define
more than one of them over subsets of Unicode (even by
enumeration) and have them be both useful for character
confusion and disjoint.
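
To make the disjointness concern concrete, here is a minimal
sketch in Python.  It is an illustration only: the labels "A"
through "D" are placeholders rather than real code points, the
pair lists are invented for the example, and build_classes and
overlaps are hypothetical helper names, not part of any IDNA or
Unicode tooling.  It builds classes as the transitive closure of
pairwise relations and shows that a "visual" class and a
"meaning" class can share a member, at which point they are
either not disjoint or have to be merged into one larger class.

```python
from collections import defaultdict


def build_classes(pairs):
    """Group labels into classes by transitive closure of the pairs (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    classes = defaultdict(set)
    for label in list(parent):
        classes[find(label)].add(label)
    return {frozenset(members) for members in classes.values()}


def overlaps(classes_a, classes_b):
    """Return pairs of classes that share a member without being identical."""
    return [(a, b) for a in classes_a for b in classes_b if a & b and a != b]


# Placeholder data (assumption): A~B and B~C look alike; C and D "mean" the same.
visual_pairs = [("A", "B"), ("B", "C")]
meaning_pairs = [("C", "D")]

visual = build_classes(visual_pairs)
meaning = build_classes(meaning_pairs)

print(overlaps(visual, meaning))                    # the two classes share "C"
print(build_classes(visual_pairs + meaning_pairs))  # merging yields one class
```

With these placeholder pairs the visual class is {A, B, C} and
the meaning class is {C, D}.  They intersect at C, so keeping
both relations yields classes that are not disjoint, while
taking the closure over both collapses everything into a single
class {A, B, C, D} that no longer says anything specific about
visual confusion.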

    john

More information about the Idna-update mailing list