Leaving out scripts (Re: Unicode versions (Re: Criteria for exceptional characters))

Mark Davis mark.davis at icu-project.org
Wed Dec 20 02:19:51 CET 2006


We must recognize that we are not starting at ground zero. There are already
many Arabic domain names. What your proposal would mean is removing existing
ones. The same goes for Devanagari (I think you meant that instead of
Sanskrit) which is used for Hindi and a number of other languages. Removing
these would affect many hundreds of millions of people, and be, I have no
doubt whatsoever, not greeted with acclaim in the developing world. If you
want to make the newspapers -- and not in a positive way -- this would really
do it!

All of the issues you cite are well understood, and ZWJ/ZWNJ/trailing BIDI
Mn are narrow extensions of what exists already. Moreover, I think we have
to be clear about the impact of these changes. For example, the absence of
ZWNJ is actually not an issue for most of the languages using the Arabic
script. It is, however, an issue for some, like Iranian. Removing the
ability to use the Arabic script for ALL users, because of ZWNJ, is complete
overkill. It's hard to draw a precise analogy with Latin, but suppose that
IDNA didn't support 'x', and someone said that because of that, we need to
retract all IDNs using the Latin script.

The current absence of ZWNJ is a handicap for Iranian for expressing certain
words, but I can't imagine that any Arabic-script users would be happier if
you retracted all ability to use the Arabic script by all users because of
this issue. It'd be like closing all of Interstate 5 until you can fix a
blocked exit in Sacramento. Better to let people use the freeway, and work
to unblock that exit.

Mark

On 12/19/06, Harald Alvestrand <harald at alvestrand.no> wrote:
>
>
>
> --On 19. desember 2006 14:25 -0800 Mark Davis <mark.davis at icu-project.org>
> wrote:
>
> >> Many, including Arabic, Sanskrit and Dhivehi. Possibly Hebrew too. But
> >> "leaving out" may be an underspecified term here - see next comment.
> >
> > Your statement pretty much floored me. Before we remove the ability to
> > use domain names from billions of people, it'd be good to have solid,
> > defensible reasons for doing so.
> >
>
> Let me complete my sentence....
>
> Until we have a decision and an algorithm that we are sure makes sense for
> the use of Arabic modified forms, Arabic vowel marks and Arabic shaping
> modifiers, I think it makes more sense not to register any Arabic domain
> names.
>
> Until we have a decision and an algorithm for determining when ZWJ/ZWNJ
> should be allowed in Sanskrit domain names, I think it makes more sense not
> to register any Sanskrit domain names.
>
> Until we have a rule that allows us to use the vowel marks in Dhivehi
> without causing damage to any other part of the registration set, I think
> it makes more sense not to register any Dhivehi domain names.
>
> As soon as we know that we have a rational decision on the known issues
> with a certain script (ISO 15924 meaning of the word), and a reasonable
> confidence that there are no more issues about to bite us, I'm all for
> allowing registries to say "our policy is to allow registrations that use
> characters from this script (Unicode meaning of the word)".
>
> In *all* these cases, I think it makes sense to tell IDNA implementors on
> the *lookup* side that they should allow those characters to be encoded; if
> a user types them, they should go across the wire. The worst that will
> happen (in the absence of client-side "mappings") is a "no such domain
> name".
>
>                         Harald
>
>
>


More information about the Idna-update mailing list