New version of strawman for IDNAv2

Alireza Saleh saleh at nic.ir
Fri Feb 27 11:31:27 CET 2009


John C Klensin wrote:
> --On Thursday, February 26, 2009 19:46 -0800 Paul Hoffman
> <phoffman at imc.org> wrote:
>
>   
>> This tees into John's recent thread on parsing the issues and
>> finding a middle ground. I have included many of the
>> suggestions from the mailing list and off-line responses. Most
>> significantly, I have changed ZWNJ and ZWJ from "mapped to
>> nothing" to being allowed so that Arabic labels will be more
>> realistic.
>>     
>
> Paul,
>
> With the understanding that I still don't believe this is the
> right way to go, one technical correction and one issue:
>
> (1) ZWJ and ZWNJ are not needed for Arabic language orthography.
> ZWNJ is needed for Persian languages and what are sometimes
> called Indo-Arabic ones (e.g., Urdu, but there are _many_
> others).  Both ZWJ and ZWNJ are needed for several of the Indic
> scripts and associated languages (although slightly fewer with
> Unicode 5.1 than with Unicode 3.2).
>   
Having ZWNJ and ZWJ is mandatory, but allowing them unconditionally 
will lead to the creation of many confusing names. However, I think they 
should be allowed unconditionally within the protocol, and each registry 
that wants to support Persian or Indo-Arabic languages should take care 
of handling the confusion.
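
As a rough illustration of what "handling the confusion" at the registry could mean, here is a minimal sketch in Python. It assumes the registry's policy is to refuse a new label that differs from an already registered one only by ZWJ/ZWNJ. The names skeleton and conflicts are hypothetical, and a real registry policy would probably need more than this (for example, looking at the context in which the joiner appears).

```python
from typing import Iterable

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200d"   # ZERO WIDTH JOINER


def skeleton(label: str) -> str:
    """Drop both joiners so labels that differ only in ZWJ/ZWNJ collide."""
    return label.replace(ZWNJ, "").replace(ZWJ, "")


def conflicts(candidate: str, registered: Iterable[str]) -> bool:
    """Return True if candidate differs from a registered label only by joiners.

    Hypothetical policy check: an exact duplicate is not reported here,
    since the registry would reject it anyway.
    """
    existing = set(registered)
    taken = {skeleton(r) for r in existing}
    return candidate not in existing and skeleton(candidate) in taken


# "ábc" is registered; "áb<ZWJ>c" is requested later.
print(conflicts("\u00e1b\u200dc", ["\u00e1bc"]))
```

With such a check the protocol itself can stay unconditional, and the decision to block or bundle the two labels remains with each registry.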
> (2) When one considers the number of registries/zones on the
> Internet or even those that exist only at the second level
> (i.e., maintaining registrations for third-level names), it is
> certain that some of them will be operated by people with bad
> intentions.  Given that, are you confident that ZWJ/ZWNJ can
> simply be treated as ordinary characters, relying on the
> registries to prevent those characters where they would be fully
> invisible?
>
> When faced with that question very early in the IDNA2008 design
> process, we concluded that there were four possible answers:
>
> 	(i) Yes, we trust the registries and are willing to live
> 	with labels like "ábc" failing to compare equal to
> 	"áb<ZWJ>c" despite looking exactly the same when
> 	displayed by normal rendering software.
> 	
> 	(ii) We don't quite trust the registries but are
> 	confident that all rendering software, on all operating
> 	systems, that encounter strings like "áb<ZWJ>c" will
> 	get upset in sufficiently vivid ways to warn the user off.
> 	We didn't think rendering ZWJ as a little box or
> 	question mark would be adequate for that case because it
> 	might be a legitimate character for which no font was
> 	available even though it would at least not be confused
> 	with "ábc".
> 	
> 	(iii) We either leave things as they are in IDNA2003
> 	(map to nothing) or simply ban the character.  Either
> 	one puts the scripts that need one or both of these
> 	characters at an intolerable disadvantage.
> 	
> 	(iv) We adopt some sort of "contextual rule" model,
> 	despite the complexity it adds.
>
> Obviously, we chose the fourth.  We did so because we didn't
> believe the assumptions that (i) or (ii) implied and did not
> consider (iii) to be acceptable given the number of people who
> use the relevant scripts.   As I read your document, you are
> proposing (i).   Is that correct and, if so, could you explain a
> bit better how you see the tradeoffs?
>
> Please also note that, if you permit ZWJ and/or ZWNJ as
> characters, we end up in exactly the same situation that you and
> others have objected to with Eszett and Final Sigma, i.e., an
> input string that converts to a different A-label in IDNA2003
> and IDNA2008.  I'm prepared to live with that but, to the degree
> to which you consider it a problem so serious as to require
> rechartering and a completely different document strategy, I'd
> like to better understand the exception and its implications.
> In particular, I don't see the section of your outline document
> that discussed the transition strategy that many people (I think
> including you, but could be wrong about that) have argued is
> absolutely essential if there are going to be any
> incompatibilities of that sort.
>
> best,
>    john
>   
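
To make the A-label point above concrete, here is a small Python sketch. It relies on Python's built-in "idna" codec, which implements IDNA2003 and therefore maps ZWJ to nothing. For the "joiner is an ordinary character" case it uses raw Punycode with the "xn--" prefix as a stand-in; that is an assumption for illustration only, not an implementation of IDNA2008 or of the strawman. The helper raw_a_label is hypothetical.

```python
plain = "\u00e1bc"          # "ábc"
joined = "\u00e1b\u200dc"   # "áb<ZWJ>c"

# IDNA2003 (Python's built-in codec): Nameprep maps ZWJ to nothing,
# so both inputs are converted to the same A-label.
same_under_2003 = plain.encode("idna") == joined.encode("idna")


def raw_a_label(label: str) -> bytes:
    """Punycode-encode the label as-is, keeping every code point.

    Hypothetical stand-in for a protocol that treats ZWJ/ZWNJ as
    ordinary characters; no mapping or validation is performed.
    """
    return b"xn--" + label.encode("punycode")


# With the joiner kept, the two inputs no longer produce the same A-label.
differs_when_kept = raw_a_label(plain) != raw_a_label(joined)

print(same_under_2003, differs_when_kept)
print(raw_a_label(plain), raw_a_label(joined))
```

The same input string "áb<ZWJ>c" thus resolves to one A-label when the joiner is mapped away and to a different one when it is kept, which is the incompatibility described in the quoted message.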


