Difference between EN and AN (Re: Follow-up to Monday's discussion of digits)
Harald Tveit Alvestrand
harald at alvestrand.no
Sun Nov 30 20:57:33 CET 2008
Eric Brunner-Williams wrote:
> Harald,
>
> Is the specification for the unstable string "L L AN CS R" or is it "L
> L AN (followed by any character type)"?
>
> If the problem is L AN CS R, then that's one rule.
> If the problem is * L * AN *, or * AN * L *, that's another rule.
The specific string I tested was L L AN CS R. The way my test program
works, I gave it "L L AN", and it found the string that would cause an
issue. I can't answer the question of "all patterns that cause an
issue", but "L AN CS R" also causes it.
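
For readers following the notation: the letters above are Unicode bidi
classes, and a concrete string can be mapped to such a sequence with
Python's standard unicodedata module. This is only a minimal sketch of
the notation, not the test program mentioned above; the helper name
bidi_classes and the sample characters are illustrative assumptions.

```python
import unicodedata

def bidi_classes(text):
    """Return the Unicode bidi class of each character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]

# Illustrative sample (an assumption, not a string from the thread):
# two Latin letters, ARABIC-INDIC DIGIT ONE, FULL STOP, HEBREW LETTER ALEF.
sample = "ab\u0661.\u05d0"
print(" ".join(bidi_classes(sample)))  # L L AN CS R

# The EN/AN distinction from the subject line: an ASCII digit is EN,
# an Arabic-Indic digit is AN.
print(unicodedata.bidirectional("1"), unicodedata.bidirectional("\u0661"))  # EN AN
```

Whether such a string is stable in the sense discussed here still has to
be decided by running the UAX#9 algorithm on it in different contexts;
the sketch only produces the class sequence.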
>
> Given some 63 characters between a CS-pair, are all sequences which
> contain L and which contain AN unstable?
No. L AN L is stable. Do you have a use case for such a string?
(Remember John Klensin's exhortation that our task is not to permit
everything that can be permitted, but to make a useful set of
identifiers available to as many languages as possible.)
>
> If the sequences between the CS-pairs adjacent to the sequence which
> contains both L and AN do not contain R, does the same answer hold?
I don't know, and don't think I should care, since the WG has gone out
of its way to forbid inter-label tests.
>
> I'm also curious why the current -bidi contains Rule #4 in Section 2.
> There's a bug we have to work around, I want to be sure I understand
> the required size of the detour. Do you recall off-hand when it first
> appeared?
Rule #4 was added when I discovered that there were unstable
combinations of L and AN. Versions of the rule have appeared in every
version since draft-alvestrand-idna-bidi-03, January 2008 (the one where
the rules first became completely explicit).
I don't know what you mean by "a bug we have to work around"; I regard
UAX#9, the Unicode BIDI algorithm, as "facts we have to work with", and
the rule is a consequence of the requirement + those facts. As usual, I
do exhort you to make one of two explicit statements:
- The requirements (now in section 3 of -bidi-03) are wrong, and here is
some new suggested text for the requirements, and the rules they imply:
- The rules (now in section 2 of -bidi-03) can be changed without
violating the requirements in section 3, and here's the new rule:
We've been at this too long to make handwaving statements like "there's
a bug we have to work around". Either you think the requirements are
wrong, you think the rules are wrong, or there are no changes needed. Be
explicit.
Harald