Leaving out scripts (Re: Unicode versions (Re: Criteria for exceptional characters))

Harald Alvestrand harald at alvestrand.no
Wed Dec 20 00:57:41 CET 2006



--On 19 December 2006 14:25 -0800 Mark Davis <mark.davis at icu-project.org> 
wrote:

>> Many, including Arabic, Sanskrit and Dhivehi. Possibly Hebrew too. But
>> "leaving out" may be an underspecified term here - see next comment.
>
> Your statement pretty much floored me. Before we remove the ability to
> use domain names from billions of people, it'd be good to have solid,
> defensible reasons for doing so.
>

Let me complete my sentence....

Until we have a decision and an algorithm that we are sure makes sense for 
the use of Arabic modified forms, Arabic vowel marks and Arabic shaping 
modifiers, I think it makes more sense not to register any Arabic domain 
names.

Until we have a decision and an algorithm for determining when ZWJ/ZWNJ 
should be allowed in Sanskrit domain names, I think it makes more sense not 
to register any Sanskrit domain names.

Until we have a rule that allows us to use the vowel marks in Dhivehi 
without causing damage to any other part of the registration set, I think 
it makes more sense not to register any Dhivehi domain names.

As soon as we know that we have a rational decision on the known issues 
with a certain script (ISO 15924 meaning of the word), and a reasonable 
confidence that there are no more issues about to bite us, I'm all for 
allowing registries to say "our policy is to allow registrations that use 
characters from this script (Unicode meaning of the word)".

In *all* these cases, I think it makes sense to tell IDNA implementors on 
the *lookup* side that they should allow those characters to be encoded; if 
a user types them, they should go across the wire. The worst that will 
happen (in the absence of client-side "mappings") is a "no such domain 
name".
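
To make the split between the two sides concrete, here is a rough sketch in 
Python. It is an illustration under stated assumptions, not a proposed 
algorithm: the function names and the allowed-script set are hypothetical, 
the script test is a crude approximation based on Unicode character names 
(the standard library has no script property), and the lookup side calls 
the raw Punycode codec with no nameprep or other mapping step, which 
matches the "absence of client-side mappings" case.

```python
import unicodedata

# Hypothetical registry policy: first words of Unicode character names that
# the registry accepts. Illustrative assumption only, not a real policy.
ALLOWED_NAME_PREFIXES = frozenset({"LATIN", "DIGIT", "HYPHEN-MINUS"})


def lookup_label(label):
    """Lookup side: encode whatever the user typed, with no script filter."""
    if label.isascii():
        # ASCII labels go across the wire unchanged.
        return label
    # Raw Punycode plus the ACE prefix; no mapping is applied first.
    return "xn--" + label.encode("punycode").decode("ascii")


def registry_accepts(label, allowed=ALLOWED_NAME_PREFIXES):
    """Registration side: accept only labels from explicitly allowed scripts."""
    for ch in label:
        # Crude script guess: the first word of the character's Unicode name.
        prefix = unicodedata.name(ch, "").split(" ", 1)[0]
        if prefix not in allowed:
            return False
    return True


if __name__ == "__main__":
    for label in ["example", "b\u00fccher", "\u0643\u062a\u0627\u0628"]:
        print(lookup_label(label), registry_accepts(label))
```

With this split, an Arabic label still encodes to an "xn--" form on the 
lookup side, while the (hypothetical) registry policy declines to register 
it until the script is added to the allowed set; the lookup simply fails to 
find the name.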

                        Harald




More information about the Idna-update mailing list