New version of strawman for IDNAv2

Patrik Fältström patrik at frobbit.se
Fri Feb 27 12:04:08 CET 2009


On 27 feb 2009, at 11.31, Alireza Saleh wrote:

> John C Klensin wrote:
>> --On Thursday, February 26, 2009 19:46 -0800 Paul Hoffman
>> <phoffman at imc.org> wrote:
>>
>>
>>> This tees into John's recent thread on parsing the issues and
>>> finding a middle ground. I have included many of the
>>> suggestions from the mailing list and off-line responses. Most
>>> significantly, I have changed ZWNJ and ZWJ from "mapped to
>>> nothing" to being allowed so that Arabic labels will be more
>>> realistic.
>>>
>>
>> Paul,
>>
>> With the understanding that I still don't believe this is the
>> right way to go, one technical correction and one issue:
>>
>> (1) ZWJ and ZWNJ are not needed for Arabic language orthography.
>> ZWNJ is needed for Persian languages and what are sometimes
>> called Indo-Arabic ones (e.g., Urdu, but there are _many_
>> others).  Both ZWJ and ZWNJ are needed for several of the Indic
>> scripts and associated languages (although slightly fewer with
>> Unicode 5.1 than with Unicode 3.2).
>>
> Having ZWNJ and ZWJ is mandatory, but allowing them without any
> condition will lead to the creation of many confusing names. However, I
> think they should be allowed without any condition within the protocol,
> and each registry that wants to support Persian or Indo-Arabic languages
> should take care of handling the confusion.

Can you please look at (2) below, and say which one of the  
alternatives (i) to (iv) you prefer?

>> (2) When one considers the number of registries/zones on the
>> Internet or even those that exist only at the second level
>> (i.e., maintaining registrations for third-level names), it is
>> certain that some of them will be operated by people with bad
>> intentions.  Given that, are you confident that ZWJ/ZWNJ can
>> simply be treated as ordinary characters, relying on the
>> registries to prevent those characters where they would be fully
>> invisible?
>>
>> When faced with that question very early in the IDNA2008 design
>> process, we concluded that there were four possible answers:
>>
>> 	(i) Yes, we trust the registries and are willing to live
>> 	with labels like "ábc" failing to compare equal to
>> 	"áb<ZWJ>c" despite looking exactly the same when
>> 	displayed by normal rendering software.
>> 	
>> 	(ii) We don't quite trust the registries but are
>> 	confident that all rendering software, on all operating
>> 	systems, that encounter strings like "áb<ZWJ>c" will
>> 	get upset in sufficiently vivid ways to warn the user off.
>> 	We didn't think rendering ZWJ as a little box or
>> 	question mark would be adequate for that case because it
>> 	might be a legitimate character for which no font was
>> 	available even though it would at least not be confused
>> 	with "ábc".
>> 	
>> 	(iii) We either leave things as they are in IDNA2003
>> 	(map to nothing) or simply ban the character.  Either
>> 	one puts the scripts that need one or both of these
>> 	characters at an intolerable disadvantage.
>> 	
>> 	(iv) We adopt some sort of "contextual rule" model,
>> 	despite the complexity it adds.
>>
>> Obviously, we chose the fourth.  We did so because we didn't
>> believe the assumptions that (i) or (ii) implied and did not
>> consider (iii) to be acceptable given the number of people who
>> use the relevant scripts.   As I read your document, you are
>> proposing (i).   Is that correct and, if so, could you explain a
>> bit better how you see the tradeoffs?
>>
>> Please also note that, if you permit ZWJ and/or ZWNJ as
>> characters, we end up in exactly the same situation that you and
>> others have objected to with Eszett and Final Sigma, i.e., an
>> input string that converts to a different A-label in IDNA2003
>> and IDNA2008.  I'm prepared to live with that but, to the degree
>> to which you consider it a problem so serious as to require
>> rechartering and a completely different document strategy, I'd
>> like to better understand the exception and its implications.
>> In particular, I don't see the section of your outline document
>> that discussed the transition strategy that many people (I think
>> including you, but could be wrong about that) have argued is
>> absolutely essential if there are going to be any
>> incompatibilities of that sort.
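
To make the alternatives quoted above concrete, here is a minimal Python
sketch. It shows why "ábc" and "áb<ZWJ>c" fail to compare equal under (i),
how the IDNA2003 "map to nothing" behaviour in (iii) hides the difference,
and what a contextual check in the spirit of (iv) might look like. The
function names are hypothetical, and the "joiner must directly follow a
virama" test is an assumption chosen only for illustration. It is not the
rule set from the IDNA2008 documents, and it does not cover the Persian
ZWNJ case, which would need a joining-type check.

```python
import unicodedata

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200d"   # ZERO WIDTH JOINER
VIRAMA_CCC = 9   # canonical combining class shared by virama characters


def map_to_nothing(label):
    """Alternative (iii), IDNA2003 style: drop ZWJ/ZWNJ entirely."""
    return label.replace(ZWJ, "").replace(ZWNJ, "")


def joiners_follow_virama(label):
    """Illustrative contextual check for alternative (iv).

    Assumption: a joiner is acceptable only directly after a virama.
    This is a simplification, not the actual contextual rule set.
    """
    for i, ch in enumerate(label):
        if ch in (ZWJ, ZWNJ):
            if i == 0 or unicodedata.combining(label[i - 1]) != VIRAMA_CCC:
                return False
    return True


plain = "\u00e1bc"                        # "ábc"
with_zwj = "\u00e1b" + ZWJ + "c"          # "áb<ZWJ>c", renders like "ábc"
indic = "\u0915\u094d" + ZWJ + "\u0937"   # KA, VIRAMA, ZWJ, SSA

# Alternative (i): the joiner is an ordinary character, so the labels differ.
print(plain == with_zwj)

# Alternative (iii): mapping to nothing makes them compare equal again.
print(map_to_nothing(with_zwj) == plain)

# Alternative (iv): reject the Latin case, accept the virama context.
print(joiners_follow_virama(with_zwj), joiners_follow_virama(indic))
```

Under these assumptions the Latin label with an embedded ZWJ is rejected,
while the Devanagari sequence, where the joiner follows a virama, is
accepted. The extra complexity mentioned under (iv) lies in agreeing on
such context rules for every script that needs them.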

Patrik

>>
>>
>> best,
>>   john
>>
>> _______________________________________________
>> Idna-update mailing list
>> Idna-update at alvestrand.no
>> http://www.alvestrand.no/mailman/listinfo/idna-update
>>


