bidi spec

Erik van der Poel erikv at google.com
Thu Feb 7 23:46:47 CET 2008


On Feb 6, 2008 11:01 PM, Mark Davis <mark.davis at icu-project.org> wrote:
> I don't think these are the minimal rules yet, since some *parts* of these
> rules can be removed.

I tried removing each of the parts of the rules, and each removal made
the tests fail. So I think we have a minimal set, if not *the* minimal
set.

One of the removals actually required bumping up the total number of
characters from 5 to 6 before the effect showed up (the part of the
rule that allows zero or more NSMs at the end).
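
As an illustration of the removal test described above, here is a
minimal Python sketch. It is not the actual test harness or the actual
bidi rules: the two-letter alphabet, the rule parts (no_leading_b,
no_triple_a) and the is_bad oracle are all hypothetical stand-ins. It
only shows the general idea of dropping one rule part at a time and
re-running an exhaustive test, and why a part can look removable until
the maximum string length is raised.

```python
from itertools import product

ALPHABET = "ab"  # toy alphabet, standing in for bidi class representatives


# Hypothetical rule parts: each returns True if the string is allowed.
def no_leading_b(s):
    return not s.startswith("b")


def no_triple_a(s):
    return "aaa" not in s


def is_bad(s):
    """Toy oracle for strings that must never be accepted."""
    return s.startswith("b") or "aaa" in s


def make_tests(max_len):
    """Build an exhaustive test over all strings of length 1..max_len."""
    def tests_pass(parts):
        for n in range(1, max_len + 1):
            for combo in product(ALPHABET, repeat=n):
                s = "".join(combo)
                accepted = all(part(s) for part in parts)
                if accepted and is_bad(s):
                    return False  # the reduced rules let a bad string through
        return True
    return tests_pass


def removable_parts(parts, tests_pass):
    """Return the parts whose removal still lets the tests pass.

    An empty result means no single part can be dropped, i.e. the set is
    minimal with respect to single removals at this test length.
    """
    removable = []
    for part in parts:
        reduced = [p for p in parts if p is not part]
        if tests_pass(reduced):
            removable.append(part)
    return removable


PARTS = [no_leading_b, no_triple_a]
short_run = removable_parts(PARTS, make_tests(2))
longer_run = removable_parts(PARTS, make_tests(3))
```

In this toy setup, no_triple_a looks removable when strings are limited
to 2 characters, because no string that short can contain "aaa". With
3 characters its removal makes the test fail, which is the same kind of
effect as needing 6 characters instead of 5 above.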

I have now run the tests with 9 characters total (for "remain grouped")
and 7 characters for "no two labels display the same".

In the interests of full disclosure (and to explain how I was able to
run 9 characters on a single machine), I should mention that I did not
loop through every Unicode codepoint. Instead I chose representatives
from L, R, AL, EN, ES, ON, NSM and AN (but not BN), using *different*
representatives for the labels named A, L and D in the spec. This
assumes that the ICU implementation of the bidi algorithm uses only
the bidi properties of the characters. I did not turn on the option
that does mirroring.
(Parentheses and such are not allowed in IDNs anyway.)
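
To make the representative-based approach concrete, here is a small
Python sketch. The specific characters below are assumptions chosen for
illustration (the ones used in the actual run are not listed here), and
it uses Python's unicodedata module rather than ICU. It picks one
character per bidi class, checks that each really has that class, and
enumerates every string up to a given length.

```python
import itertools
import unicodedata

# One hypothetical representative per bidi class (BN deliberately omitted).
REPRESENTATIVES = {
    "L": "a",          # LATIN SMALL LETTER A
    "R": "\u05d0",     # HEBREW LETTER ALEF
    "AL": "\u0627",    # ARABIC LETTER ALEF
    "EN": "1",         # DIGIT ONE
    "ES": "-",         # HYPHEN-MINUS
    "ON": "!",         # EXCLAMATION MARK
    "NSM": "\u0300",   # COMBINING GRAVE ACCENT
    "AN": "\u0661",    # ARABIC-INDIC DIGIT ONE
}


def check_representatives(reps):
    """Verify that each character has the bidi class it stands for."""
    for bidi_class, ch in reps.items():
        actual = unicodedata.bidirectional(ch)
        if actual != bidi_class:
            raise ValueError(f"U+{ord(ch):04X} is {actual}, not {bidi_class}")


def candidate_strings(reps, max_len):
    """Yield every string of length 1..max_len over the representatives."""
    chars = list(reps.values())
    for n in range(1, max_len + 1):
        for combo in itertools.product(chars, repeat=n):
            yield "".join(combo)


def search_space(num_chars, max_len):
    """Count the strings candidate_strings would yield."""
    return sum(num_chars ** n for n in range(1, max_len + 1))


check_representatives(REPRESENTATIVES)
count_len2 = sum(1 for _ in candidate_strings(REPRESENTATIVES, 2))
```

With eight characters, search_space(8, 9) is 153391688 strings, which
is feasible on one machine, whereas looping over every codepoint at
that length is not. The real run used different representatives for
the A, L and D labels, so its exact count would differ from this
figure.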

Erik
