Tatweel

Ebw ebw at abenaki.wabanaki.net
Fri Mar 20 16:55:37 CET 2009


At the risk of drawing the usual flak ...

"legislate" and some other word choices ("regulate" is another) are  
misleading, as the relationship between gTLD registries and ICANN is
contractual, the relationship between ccTLD registries and ICANN is
non-contractual (though in some cases letters or memoranda exist), and
the relationship between ICANN and second-level and subordinate
registries is undefined.

In general, registries may elect an IETF IDN protocol, and registries
with a contractual relation with ICANN must, but where a registry comes
to the position that there is reason to "do something else", whether
for reasons of availability or correctness, and no other constraint is
controlling, the registry may decline to adopt, or adopt only in part,
or modify, any IETF IDN protocol.

Restated, the mechanism(s) to compel are very limited, but the choice  
not to adopt is less limited.

Broadly, our challenge is to achieve sufficient technical correctness
to limit the need for registries to not-adopt, or
adopt-with-modification.

I was confused by Ken's notes. At first I understood his rationale to
be visual similarity; his second note offered non-semantic. A
code-point by code-point tour of all current glyphs/characters is not
attractive. Doing just the Cree characters for dot-before and
dot-after nuances took me days.
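
For concreteness, Mark's tatweel example (quoted below) can be sketched
in a few lines of Python. This is only an illustration under stated
assumptions: the helper names contains_tatweel and strip_tatweel are
hypothetical, they are not part of any IDNA draft, and whether a
registry would reject or fold such labels is a policy choice this
sketch does not settle.

```python
import unicodedata

TATWEEL = "\u0640"  # ARABIC TATWEEL (kashida)


def contains_tatweel(label: str) -> bool:
    """Return True if the label contains U+0640 anywhere."""
    return TATWEEL in label


def strip_tatweel(label: str) -> str:
    """Remove every U+0640 (hypothetical registry-side folding)."""
    return label.replace(TATWEEL, "")


# The base label from Mark's example: JEEM, WAW, JEEM, LAM.
base = "\u062C\u0648\u062C\u0644"

# Insert a kashida after the first and after the third character.
variants = [
    base[:1] + TATWEEL + base[1:],
    base[:3] + TATWEEL + base[3:],
]

for label in [base] + variants:
    code_points = " ".join(f"U+{ord(ch):04X}" for ch in label)
    print(code_points, contains_tatweel(label), strip_tatweel(label) == base)

# Name and general category of U+0640 from Python's bundled Unicode data.
print(unicodedata.name(TATWEEL), unicodedata.category(TATWEEL))
```

The variants differ from the base label only by U+0640, so a check of
this kind is cheap at either the protocol or the registry level; where
it belongs is the question under discussion.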

Eric

Sent from my iPhone, painfully.

On Mar 20, 2009, at 9:5 AM, Alireza Saleh <saleh at nic.ir> wrote:

> Vint,
>
> I'm really in substantive agreement with you. Something I wish to add
> is that IETF should assume that registries are competent enough to make
> wise decisions on their own; it should not legislate on the basis that
> some registries are not competent enough to know what is good and safe
> for them. As you say, there should be compelling reasons for making
> exclusions in IDNA2008. Perhaps we should delineate what constitutes
> 'compelling reason'.
>
> Best
> Alireza
>
>
> Vint Cerf wrote:
>> Alireza,
>>
>> your note makes me think a bit about what I believe to be the
>> difference in philosophy between IDNA2003 and IDNA2008. Under
>> IDNA2008, effort has been made to be fairly cautious about what is
>> included by using Unicode's characterizations of the role of
>> characters. Appearance has less to do with this than function in
>> expression. Generally, punctuation is excluded except for special
>> cases such as ZWJ/ZWNJ for example. I am not a speaker nor a reader of
>> Arabic script so I have to be guided by others who are expert but it
>> sounds on the surface as if the proposal is related to the function of
>> U+0640. Ideally, inclusion or exclusion should be the product of the
>> Rules that generate the tables of the Tables document (editor Patrik
>> Faltstrom). If it is not ruled out (literally) but there is a
>> compelling argument for exclusion, it would need to become an
>> exception I believe.
>>
>> Mark,
>>
>> One of the many concerns I have heard raised on this list relates to
>> character-by-character assessment of Unicode as it applies to IDNs. I
>> think few people wish to produce IDNA tables that way. I don't dispute
>> your reasoning to exclude (I don't know enough about Arabic to do so)
>> but I am wondering whether there is a way to do this that is
>> rule-based or context-based or something that exercises the mechanisms
>> of IDNA2008?
>>
>> vint
>>
>>
>>
>> Vint Cerf
>> Google
>> 1818 Library Street, Suite 400
>> Reston, VA 20190
>> 202-370-5637
>> vint at google.com
>>
>>
>>
>>
>> On Mar 20, 2009, at 7:00 AM, Alireza Saleh wrote:
>>
>>> I don't see why we should not just let the registry have the authority
>>> to do this. If you want to disallow this at the protocol level, you
>>> should also consider disallowing the Low Line 'U+005F' and
>>> Hyphen-minus U+002D because these also have the same shape as Tatweel,
>>> especially when they come between non-joining characters. My
>>> opinion is to limit protocol prohibitions to absolutely necessary cases.
>>>
>>> Alireza
>>>
>>> Mark Davis wrote:
>>>> I propose that we make U+0640 ( ‎ـ‎ ) ARABIC TATWEEL (aka kashida) be
>>>> DISALLOWED, adding it to
>>>> http://tools.ietf.org/html/draft-ietf-idnabis-tables-05#section-2.6.
>>>> Currently it is PVALID, but it does not carry semantics by any
>>>> Arabic-Script orthography, and its only value is for spoofing.
>>>>
>>>> For example: جوجل can be written with extra kashidas as جـوجل or as
>>>> جوجـل by inserting a kashida after the first or third character. This
>>>> is very hard for users to detect. We added it to Unicode for use in
>>>> manual justification, but it has no place in IDNA.
>>>>
>>>> (http://en.wikipedia.org/wiki/Kashida,
>>>> http://unicode.org/cldr/utility/character.jsp?a=0640)
>>>>
>>>> Mark
>>>> _______________________________________________
>>>> Idna-update mailing list
>>>> Idna-update at alvestrand.no
>>>> http://www.alvestrand.no/mailman/listinfo/idna-update
>>>>
>>>
>>
>

