Visually confusable characters
asmusf at ix.netcom.com
Sat Aug 9 04:20:49 CEST 2014
On 8/8/2014 11:32 AM, Jefsey wrote:
>> *At 02:27 08/08/2014, Andrew Sullivan wrote:*
>> I don't think that's a fair characterization. Nobody is
>> "second-guessing" anything. It's rather that we -- John, actually --
>> discovered that there's a consequence of this case that we did not
>> previously understand, and it has uncomfortable consequences for the
>> way we had previously relied on Unicode, because it didn't work the
>> way we thought.
> Dear Andrew,
> May be time to reconsider the idea of an IETF Unicode including our
> exception management through an additional protocol rather than only
> by Patrik's tables?
"Additional protocol" sounds like it's headed in the right direction.
There are already several levels to this:
* Unicode (repertoire and basic normalization)
* IDNA (including repertoire and context rules)
* Label Generation Rulesets (including repertoire, context rules and
* String Review (case by case)
Of these, the formulation of Label Generation Rulesets allows a solution
to issues like the current one without the need to pick an arbitrary
preferred encoding. LGRs provide ways to specify a first-come,
first-served, but mutually exclusive selection among alternatives, which
is much less "linguistically damaging" than blunt restrictions on
repertoire alone.
What is missing, but what keeps surfacing in the discussions around
creating the LGR for the Root Zone, is the need for enforceable "best
practices" on LGRs.
If there were an "additional protocol" where problematic cases could be
identified and translated into a binding requirement on LGRs (and
therefore registration policies) to either disallow all but one of the
alternatives, or to have a robust way of mutually excluding labels from
registration (using the blocked variant mechanism), then it would seem
that you get the effect of robust lookup without having to arbitrarily
play linguistic favorites.
The same protocol could be applied to handle any new registrations for
the many similar cases of homoglyphs and homographs, whether across
scripts or within scripts.
Being less "linguistically damaging", this approach is amenable to being
employed in a wider selection of cases as well.