UCAS and related scripts (was: Re: A comment on
John C Klensin
klensin at jck.com
Mon Apr 7 18:29:42 CEST 2008
--On Monday, 07 April, 2008 08:56 -0700 Eric Brunner-Williams
<ebw at abenaki.wabanaki.net> wrote:
> Thank you for writing back.
> I'm more interested in the full stop (166e in your table) than
> in the symbol for a foreign cult (166d in the same text).
> I'm guessing that the rationale for the full stop exclusion is
> not that it's "confusingly similar to" some other character,
> but because it is functionally equivalent to "dot" (and is
> always translated into roman script as ".").
> Which rationale, or something I've not guessed, is present?
There may be some exceptions, and for those it is better that you
check the I-Ds than that I try to summarize and get it wrong. In
general, though, we have not disallowed characters because of
confusing similarity (aka "phishing risk") alone. In the case of
U+166E, it is excluded because it has the Unicode General Category
"Po" (other punctuation) and all such characters are Disallowed.
Correctly or incorrectly, U+166D is also in "Po". If I
correctly understand your comments and those of Michael Everson,
it might better have been classified as a symbol of some
flavor, but that would make no difference for our purposes,
since symbols are also disallowed.
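For anyone who wants to check the General Category of these two code
points locally, the following is a minimal sketch using Python's
standard unicodedata module. The helper name general_category is
hypothetical, and the values reported depend on the Unicode version
bundled with the Python build, so they may differ from the data the
drafts were written against.

```python
import unicodedata


def general_category(code_point: int) -> str:
    """Return the Unicode General Category of a code point.

    Hypothetical helper for illustration; it simply wraps
    unicodedata.category(), which uses whatever Unicode version
    ships with the running Python interpreter.
    """
    return unicodedata.category(chr(code_point))


if __name__ == "__main__":
    # U+166D and U+166E are the two code points discussed in this thread.
    for cp in (0x166D, 0x166E):
        name = unicodedata.name(chr(cp))
        print(f"U+{cp:04X} {name}: {general_category(cp)}")
```

Whether U+166D comes back as punctuation or as a symbol, the outcome
for our purposes is the same, since both classes are disallowed.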
If you haven't had time to look at the Rationale document
(draft-klensin-idnabis-issues) yet, the general theme of this
part of the new approach is to return to the "LDH" rule
associated with traditional hostnames and then try to generalize
the "letter" and "digit" parts to corresponding characters in
Unicode, rather than, e.g., trying to see how much can be
included without causing problems. If that formulation is
working out well for indigenous North American writing systems,
that suggests it is because the rules we are using work in general,
rather than because we conducted a character-by-character
examination and then invented new classifications.
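As a rough illustration of what generalizing the "letter" and "digit"
parts of LDH can look like, here is a sketch in Python. It is not the
rule set in the I-Ds: the function name passes_generalized_ldh and the
particular choice of categories (all letter categories, the combining
marks Mn and Mc, and decimal digits Nd) are assumptions made for this
example only, and the real tables include exceptions and contextual
rules that are not modelled here.

```python
import unicodedata

# Assumption for illustration only: combining marks and decimal digits
# accepted alongside letters. This is NOT the normative IDNAbis table.
EXTRA_ALLOWED_CATEGORIES = {"Mn", "Mc", "Nd"}


def passes_generalized_ldh(ch: str) -> bool:
    """Return True if a single character fits the simplified rule.

    Hypothetical check: hyphen, any letter category (L*), combining
    marks (Mn, Mc), and decimal digits (Nd) pass; everything else,
    including punctuation (P*) and symbols (S*), fails.
    """
    if ch == "-":
        return True
    category = unicodedata.category(ch)
    return category.startswith("L") or category in EXTRA_ALLOWED_CATEGORIES


if __name__ == "__main__":
    # U+1401 is a Canadian Syllabics letter; U+166E is the full stop.
    for ch in ("a", "7", "-", ".", "\u1401", "\u166E"):
        print(f"U+{ord(ch):04X}: {passes_generalized_ldh(ch)}")
```

Under this simplified rule the syllabics letters pass and the
syllabics full stop fails purely on category, with no per-character
decision.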