my comments on draft-ietf-idnabis-bidi-05

Harald Alvestrand harald at alvestrand.no
Tue Sep 8 21:32:03 CEST 2009


Omitting BN from LTR labels was a mistake that crept in between -03 and 
-04. In order to allow the use of ZWNJ/ZWJ with Indic scripts, BN should 
definitely be allowed in LTR labels.
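
As an illustration only (not text from the draft), the bidi classes in question can be inspected with Python's standard unicodedata module. The sample string below is a hypothetical Devanagari sequence with a ZWJ right after a virama, chosen just to show where BN turns up inside an otherwise LTR label:

```python
import unicodedata

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200d"   # ZERO WIDTH JOINER

def bidi_classes(label):
    """Return the Unicode bidi class of every character in the label."""
    return [unicodedata.bidirectional(ch) for ch in label]

# Hypothetical sample: DEVANAGARI KA + VIRAMA + ZWJ + SSA
indic_label = "\u0915\u094d" + ZWJ + "\u0937"

print(bidi_classes(ZWNJ + ZWJ))   # both joiners report class BN
print(bidi_classes(indic_label))  # an LTR label with a BN in the middle
```

If BN is not in the allowed set for LTR labels, a label like this one fails the bidi check even when the contextual rule in Tables accepts the ZWJ.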

I'll fix it in -05.

Hm.... interestingly, my tests never exercised what happens to a BN in the
BIDI algorithm; since rule X9 of the BIDI algorithm specification says to
ignore BN, I simply omitted it from my test strings. So I have no idea
whether a BN at the end of a label will jump over delimiters or not - or
even if the question is meaningful in the context of the Unicode BIDI
algorithm. (I am writing this on a plane, so I can't check.)

The present formulation has the interesting effect that BN is now forbidden
at the beginning and end of strings, which was not true in -03. I think 
that is an improvement (it outlaws strings like "BN EN", which seems to 
have been permitted by the -03 rule), but is one that I didn't make 
consciously.
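
A minimal sketch of the effect described above, for illustration only. The allowed-class set and the start/end conditions are assumptions about how the LTR rule reads once BN is added back; they are not quoted from any draft revision, and the function name is made up:

```python
import unicodedata

# Assumed set of bidi classes permitted anywhere in an LTR label (BN included).
LTR_ALLOWED = {"L", "EN", "ES", "CS", "ET", "ON", "BN", "NSM"}

def ltr_label_ok(label):
    """Hypothetical check: starts with L, uses only allowed classes,
    and ends with L or EN optionally followed by NSM characters."""
    classes = [unicodedata.bidirectional(ch) for ch in label]
    if not classes or classes[0] != "L":
        return False
    if any(c not in LTR_ALLOWED for c in classes):
        return False
    # Skip trailing nonspacing marks before testing the final character.
    end = len(classes)
    while end > 0 and classes[end - 1] == "NSM":
        end -= 1
    return end > 0 and classes[end - 1] in {"L", "EN"}

print(ltr_label_ok("a\u200cb"))  # BN in the middle
print(ltr_label_ok("\u200c1"))   # the "BN EN" case
print(ltr_label_ok("ab\u200c"))  # BN at the end
```

Under these assumptions a BN is accepted only in the interior of the label, which matches the behaviour described: "BN EN" and a trailing BN are both rejected.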

What does the group think?

                 Harald

Martin J. Dürst wrote:
> On 2009/09/08 0:12, John C Klensin wrote:
>   
>> --On Monday, September 07, 2009 4:11 PM +0900 "Martin J.
>> Dürst" <duerst at it.aoyama.ac.jp> wrote:
>>
>>     
>>> Hello Mati,
>>>
>>> On 2009/09/07 15:47, Matitiahu Allouche wrote:
>>>       
>>>> On October first, Martin J. Dürst asked:
>>>> conditions 2/4: Why are BN (control characters) allowed in
>>>> RTL but not in LTR?
>>>>
>>>> BN characters are invisible and should be banned as allowing
>>>> phishing and violating the Label Uniqueness requirement.
>>>> However, ZWJ and ZWNJ are classified as BN, and ZWNJ is
>>>> required for the proper orthography of Persian which is
>>>> written with the Arabic script, hence BNs are allowed in RTL
>>>> labels.
>>>>         
>>> That makes a lot of sense. But then shouldn't BN also be
>>> allowed for  LTR, because some of these characters are needed
>>> in Indic scripts?
>>>       
>> Remember that ZWJ and ZWNJ are allowed by exception, not because
>> they are BN, and that they are classified as CONTEXTJ, not as
>> DISALLOWED.  If we continue with that model --and no one has
>> argued recently that we should not-- then the relevant question
>> for ZWJ/ZWNJ is whether the contextual rules are correctly
>> applied to the scripts in which they are needed
>>     
>
> This is the question for Tables. I haven't had time to read Tables 
> during last call, but I'm assuming it's doing the right things on this 
> issue.
>
>   
>> and not about their membership in BN.
>>     
>
> Yes, what we want, ideally, is that all the exceptions "just work" (in 
> the sense that they pass the bidi tests) in those contexts where they 
> are allowed.
>
> The current Bidi document is written in terms of bidi categories, and so 
> to get ZWJ/ZWNJ to "just work", we have to include their bidi category, 
> namely BN, where relevant. The current Bidi document gets there half-way 
> (or you can say three-fourths) by allowing BN in RTL labels. I proposed 
> (and continue to propose!) that we fix this "half-way" state by allowing 
> BN also in LTR labels. This will eliminate some strange edge cases 
> (currently, any Arabic script label can be combined with any Indic 
> script label, *except if the latter contains a ZWJ immediately after a 
> virama* (see 
> http://tools.ietf.org/html/draft-ietf-idnabis-tables-06#appendix-A.2)).
>
>
> Allowing BN also in LTR labels is the easiest fix for the current 
> situation. Other fixes, which potentially fix larger problems, are also 
> possible. One of them is to not mention BN at all in the Bidi document, 
> and just refer to "exceptionally allowed characters in the tables 
> document". This would cover the case where in the future we need some 
> exception from another bidi category. But it would mean that we have to 
> carefully vet that exception also for bidi issues. That's just a 'todo' 
> item on somebody's todo list (whoever will take care of exceptions when 
> they occur), but it's something not to forget.
>
>
>   
>> If anything in Bidi confuses that, or
>> confuses the more general principle that it does not override
>> Tables, I would think it needs to be fixed... but I haven't seen
>> anything that I read as such confusion.
>>     
>
> I have definitely never concluded such a thing.
>
> Regards,   Martin.
>
>   



