my comments on draft-ietf-idnabis-bidi-05
Harald Alvestrand
harald at alvestrand.no
Tue Sep 8 21:32:03 CEST 2009
Omitting BN from LTR labels was a mistake that crept in between -03 and
-04. In order to allow the use of ZWNJ/ZWJ with Indic scripts, BN should
definitely be allowed in LTR labels.
I'll fix it in -05.
Hm.... interestingly, my tests never tested what happens to a BN in the BIDI
algorithm; since paragraph X9 of the BIDI algorithm specification said to
ignore BN, I simply omitted it from my test strings. So I have no idea
whether a BN at the end of a label will jump over delimiters or not - or
even if the question is meaningful in the context of the Unicode BIDI
algorithm. (I am writing this on a plane, so I can't check.)
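For reference, a minimal sketch of the two facts in play here, using Python's
standard unicodedata module: ZWNJ and ZWJ carry the bidi class BN, and
dropping BN from a test string (roughly what X9 asks for) makes them vanish
from the input. The helper names bidi_classes and strip_bn are made up for
illustration, and strip_bn covers only BN, not the other characters X9
removes.

```python
import unicodedata

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200d"   # ZERO WIDTH JOINER


def bidi_classes(label):
    """Return the Unicode bidi class of each character in the label."""
    return [unicodedata.bidirectional(ch) for ch in label]


def strip_bn(label):
    """Drop BN characters only (a rough, partial analogue of rule X9)."""
    return "".join(ch for ch in label if unicodedata.bidirectional(ch) != "BN")


if __name__ == "__main__":
    sample = "a" + ZWNJ + "1"
    print(bidi_classes(sample))
    print(repr(strip_bn(sample)))
```

A test string built this way loses its BN before any reordering is examined,
which is why test strings without BN say nothing about the trailing-BN case.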
The present formulation has the interesting effect that BN is now forbidden
at the beginning and end of strings, which was not true in -03. I think
that is an improvement (it outlaws strings like "BN EN", which seems to
have been permitted by the -03 rule), but is one that I didn't make
consciously.
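To make the effect concrete, here is a rough sketch of an LTR label check
with BN allowed in the middle but not at either end. The rule set below is an
assumption made for illustration, not text quoted from the draft: the label
must start with L, may contain only L, EN, ES, CS, ET, ON, BN and NSM, and
must end with L or EN, optionally followed by NSM. The function name
ltr_label_ok is hypothetical, and it takes a list of bidi class names rather
than characters.

```python
# Assumed set of bidi classes permitted anywhere in an LTR label.
LTR_ALLOWED = {"L", "EN", "ES", "CS", "ET", "ON", "BN", "NSM"}


def ltr_label_ok(classes):
    """Check a list of bidi class names against the assumed LTR rules."""
    # Assumed start rule: an LTR label begins with L, so BN cannot lead.
    if not classes or classes[0] != "L":
        return False
    # Every character must be in the allowed set (BN included).
    if any(c not in LTR_ALLOWED for c in classes):
        return False
    # Assumed end rule: skip trailing NSM, then require L or EN,
    # so BN cannot be the last character either.
    end = len(classes)
    while end > 0 and classes[end - 1] == "NSM":
        end -= 1
    return end > 0 and classes[end - 1] in {"L", "EN"}


if __name__ == "__main__":
    for label in (["BN", "EN"], ["L", "BN", "L"], ["L", "BN"]):
        print(label, ltr_label_ok(label))
```

Under these assumed rules "BN EN" is rejected by the start condition, an
interior BN passes, and a trailing BN fails the end condition.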
What does the group think?
Harald
Martin J. Dürst wrote:
> On 2009/09/08 0:12, John C Klensin wrote:
>
>> --On Monday, September 07, 2009 4:11 PM +0900 "\"Martin J.
>> Dürst\""<duerst at it.aoyama.ac.jp> wrote:
>>
>>
>>> Hello Mati,
>>>
>>> On 2009/09/07 15:47, Matitiahu Allouche wrote:
>>>
>>>> On October first, Martin J. Dürst asked:
>>>> conditions 2/4: Why are BN (control characters) allowed in
>>>> RTL but not in LTR?
>>>>
>>>> BN characters are invisible and should be banned as allowing
>>>> phishing and violating the Label Uniqueness requirement.
>>>> However, ZWJ and ZWNJ are classified as BN, and ZWNJ is
>>>> required for the proper orthography of Persian which is
>>>> written with the Arabic script, hence BNs are allowed in RTL
>>>> labels.
>>>>
>>> That makes a lot of sense. But then shouldn't BN also be
>>> allowed for LTR, because some of these characters are needed
>>> in Indic scripts?
>>>
>> Remember that ZWJ and ZWNJ are allowed by exception, not because
>> they are BN, and that they are classified as CONTEXTJ, not as
>> DISALLOWED. If we continue with that model --and no one has
>> argued recently that we should not-- then the relevant question
>> for ZWJ/ZWNJ is whether the contextual rules are correctly
>> applied to the scripts in which they are needed
>>
>
> This is the question for Tables. I haven't had time to read Tables
> during last call, but I'm assuming it's doing the right things on this
> issue.
>
>
>> and not about their membership in BN.
>>
>
> Yes, what we want, ideally, is that all the exceptions "just work" (in
> the sense that they pass the bidi tests) in those contexts where they
> are allowed.
>
> The current Bidi document is written in terms of bidi categories, and so
> to get ZWJ/ZWNJ to "just work", we have to include their bidi category,
> namely BN, where relevant. The current Bidi document gets there half-way
> (or you can say three-fourths) by allowing BN in RTL labels. I proposed
> (and continue to propose!) that we fix this "half-way" state by allowing
> BN also in LTR labels. This will eliminate some strange edge cases
> (currently, any Arabic script label can be combined with any Indic
> script label, *except if the latter contains a ZWJ immediately after a
> virama* (see
> http://tools.ietf.org/html/draft-ietf-idnabis-tables-06#appendix-A.2)).
>
>
> Allowing BN also in LTR labels is the easiest fix for the current
> situation. Other fixes, which potentially fix larger problems, are also
> possible. One of them is to not mention BN at all in the Bidi document,
> and just refer to "exceptionally allowed characters in the tables
> document". This would cover the case where in the future we need some
> exception from another bidi category. But it would mean that we have to
> carefully vet that exception also for bidi issues. That's just a 'todo'
> item on somebody's todo list (whoever will take care of exceptions when
> they occur), but it's something not to forget.
>
>
>
>> If anything in Bidi confuses that, or
>> confuses the more general principle that it does not override
>> Tables, I would think it needs to be fixed... but I haven't seen
>> anything that I read as such confusion.
>>
>
> I definitely never have concluded such a thing.
>
> Regards, Martin.
>
>