Comments on bidi-04
Harald Tveit Alvestrand
harald at alvestrand.no
Tue Mar 11 13:57:05 CET 2008
--On Monday, March 10, 2008 18:24:57 -0700 Mark Davis
<mark.davis at icu-project.org> wrote:
> Shouting doesn't help. Supplying an actual test case where LRE makes a
> difference would help.
> You could well be right, but I'd simply like to see the test case to see
> what is going on, since it doesn't square with my understanding.
(one hour's Perl hacking later - nice way to start the morning)
It turns out that it does make a difference, but it does not matter....
For instance, the string AN EN, when embedded between two dots, will behave
as follows in an LTR context:
<RLE><PDF>.<AN><EN>.<RLE><PDF> reorders into <AN><EN>.. (that is, both dots
jump to the right end of the string). (I hope you agree with me that "sor"
and "eor" will both be "R" for the run .<AN><EN>., while the embedding
direction is L.)
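
A concrete instance of that test string can be built and checked against the Unicode character database. This is a minimal sketch in Python, not the Perl script mentioned above. The specific code points (ARABIC-INDIC DIGIT ONE for AN, ASCII "1" for EN) and the helper name bidi_classes are assumptions chosen for illustration. It only reports the bidi class of each character and does not run the reordering algorithm.

```python
import unicodedata

RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING
AN = "\u0661"   # ARABIC-INDIC DIGIT ONE, assumed here as a sample AN character
EN = "1"        # DIGIT ONE, assumed here as a sample EN character


def bidi_classes(text):
    """Return the Unicode bidi class of each character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]


# The string from the example: <RLE><PDF>.<AN><EN>.<RLE><PDF>
sample = RLE + PDF + "." + AN + EN + "." + RLE + PDF

# Note that FULL STOP is class CS (common separator), not a plain neutral.
print(bidi_classes(sample))
```

Feeding a string built this way to any bidi implementation, with and without the surrounding RLE/PDF pairs, is one way to compare the two orderings.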
This affects 3 of the 53 length-2 combinations allowed by IDNA2003, and 43
of the 375 length-3 combinations allowed by IDNA2003.
But - ALL of the strings affected turn out to be eliminated by our
currently proposed set of selection criteria. If a string passes the test
for "safe BIDI label" in bidi-04, it will not be affected by bidi
formatting codes in the text around it.
So we can eliminate the restriction.