Tables and contextual rule for Katakana middle dot
Eric Brunner-Williams
ebw at abenaki.wabanaki.net
Wed Apr 8 00:50:43 CEST 2009
Mark Davis wrote:
> ...
>
> There are other dot-like characters that are far more visually similar
> to dot, like Arabic zero.
>
> عربي٠عربي.com <http://xn--ngbazb1bc2jd8q.com>
> vs
> عربي.عربي.com <http://xn--ngbrx4e.xn--ngbrx4e.com>
>
> But more importantly, there is a real lack of data presented for these
> kinds of positions. When excluding characters that are in common use
> on the basis of visual confusability, such as Katakana middle dot,
> let's see some real data on what a difference this would make in
> overall visual confusability of characters. Of all of the visually
> confusable characters in PVALID, what would be the percentage
> difference by adding or removing Katakana middle dot? And why do
> people think this can't be handled by exactly the same mechanisms that
> programs have to handle the visually confusable characters that *are*
> PVALID?
I applied the no-punctuation principle when looking at U+166E. However,
it (a very small baseline-aligned "x") really doesn't look like a label
separator, and there really is no harm in a Cree full stop appearing
within a Cree character string, creating labels of the form
"whatever<dot>cree-sentence-1<mini-x>cree-sentence-2<dot>else".
So, I have some second thoughts about DISALLOWED for U+166E. The case
for PVALID is ... well ... inventive, not compelling.
For U+166D, a symbol, I'm willing to keep it DISALLOWED, for several
reasons.
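For anyone who wants to check which characters are under discussion here, a
minimal Python sketch follows. It assumes only the standard unicodedata
module; the helper name describe is hypothetical and used just for
illustration. It prints the code point, name, and general category of each
character. The general category can differ between Unicode versions, so it is
printed rather than assumed, and it says nothing by itself about PVALID or
DISALLOWED status.

```python
import unicodedata

# Code points mentioned in this thread, plus U+002E (the ASCII label separator)
# for comparison.
CODE_POINTS = [0x166D, 0x166E, 0x30FB, 0x0660, 0x002E]


def describe(cp):
    """Return (U+XXXX, character name, general category) for a code point."""
    ch = chr(cp)
    return (f"U+{cp:04X}", unicodedata.name(ch), unicodedata.category(ch))


for cp in CODE_POINTS:
    # The category comes from the Unicode data bundled with the interpreter.
    print(*describe(cp), sep="  ")
```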
I wrote this because I think you're correctly asking what's really
confusable, and the question is larger than just Katakana middle dot.
Eric