bidi spec

Harald Alvestrand harald at alvestrand.no
Thu Feb 7 04:16:11 CET 2008


Erik van der Poel wrote:
> OK, I have tested the "remain grouped" property up to 8 characters
> using a single machine, and I have tested the "no two labels display
> the same" property up to 6 characters (with a dot on either side of
> it), and found that the following rules can be removed:
>
> If an R, AL or AN is present, no L may be present.
>   
I'm really surprised this fell out, since it's the one that prohibits
mixed direction labels.
> If an AN is present, no EN may be present
> If an AN is present, at least one R or AL must be present
>   
This is "superseded" by the requirement that AN can't begin a label.
> Note that the 2nd one above is the same as:
>
> If an EN is present, no AN may be present
>
> so the former is redundant and can be removed.
>   
It's probably a good idea to state those two as a single rule, in a
symmetrical way.
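
A minimal sketch of what the single symmetrical rule could look like, assuming a label has already been mapped to a list of bidi class names. The function name and the list-of-strings representation are hypothetical, chosen only for illustration:

```python
def mixes_en_and_an(classes):
    """Return True if a label contains both EN and AN.

    `classes` is a list of bidi class names such as ["R", "EN"].
    Stating the check this way covers both directions at once:
    "if AN then no EN" and "if EN then no AN".
    """
    present = set(classes)
    return "EN" in present and "AN" in present
```

A label would violate the combined rule exactly when this returns True.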
> However, I found that I have to add the following rule in order to
> satisfy the "no two labels display the same" property:
>   
Great that you were able to test it, and that there were no other cases!
> If there is an EN or AN present, there may not be an NSM
>
> Without this rule, we get the following behavior under ltr:
>
> R NSM EN -> EN NSM R
> R EN NSM -> EN NSM R
>
> Does this make sense?
>   
Hm. That makes no sense to me. And I get the same results.

The combining mark needs to stay on the character it's being used with;
at least one of those results has the mark going on the wrong character
(although the idea of European numbers with combining marks is kind of
weird).

This may be a bug in the BIDI algorithm.

Outlawing NSM would make the whole exercise futile for the stated
purpose of allowing Dhivehi and Yiddish. So we'd better figure out
what's going on here.
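
For anyone who wants to reproduce the collision, here is a rough sketch that works on bidi class names rather than real characters. It is not a full implementation of the bidi algorithm: it assumes an ltr paragraph (level 0), handles only the classes L, R, AL, EN, AN and NSM, and models only rules W1, W2, W3, W7, I1 and L2. The function names are made up for this example.

```python
def resolve_levels(classes):
    """Simplified level resolution for an ltr paragraph (level 0).

    Only L, R, AL, EN, AN and NSM are supported; separators,
    neutrals and explicit embeddings are not modelled.
    """
    # W1: an NSM takes the type of the previous character
    # (or of the start of text, which is L at paragraph level 0).
    types = []
    prev = "L"
    for c in classes:
        if c == "NSM":
            c = prev
        types.append(c)
        prev = c

    # W2: an EN whose nearest preceding strong type is AL becomes AN.
    strong = "L"
    for i, c in enumerate(types):
        if c in ("L", "R", "AL"):
            strong = c
        elif c == "EN" and strong == "AL":
            types[i] = "AN"

    # W3: AL becomes R.
    types = ["R" if c == "AL" else c for c in types]

    # W7: an EN whose nearest preceding strong type is L becomes L.
    strong = "L"
    for i, c in enumerate(types):
        if c in ("L", "R"):
            strong = c
        elif c == "EN" and strong == "L":
            types[i] = "L"

    # I1 (even embedding level): R goes up one level, EN and AN two.
    level_for = {"L": 0, "R": 1, "EN": 2, "AN": 2}
    return [level_for[c] for c in types]


def visual_order(classes):
    """Apply rule L2 and return the classes in display order."""
    levels = resolve_levels(classes)
    items = list(classes)
    odd_levels = [lv for lv in levels if lv % 2]
    if not odd_levels:
        return items  # nothing to reverse
    # From the highest level down to the lowest odd level, reverse
    # every maximal run of items at that level or higher.
    for level in range(max(levels), min(odd_levels) - 1, -1):
        i = 0
        while i < len(items):
            if levels[i] >= level:
                j = i
                while j < len(items) and levels[j] >= level:
                    j += 1
                items[i:j] = items[i:j][::-1]
                levels[i:j] = levels[i:j][::-1]
                i = j
            else:
                i += 1
    return items


print(visual_order(["R", "NSM", "EN"]))  # ['EN', 'NSM', 'R']
print(visual_order(["R", "EN", "NSM"]))  # ['EN', 'NSM', 'R']
```

In this sketch the two inputs differ only after W1: the NSM resolves to R (level 1) in the first case and to EN (level 2) in the second, yet L2 brings both to the same display order.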

                          Harald


More information about the Idna-update mailing list