Difference between EN and AN (Re: Follow-up to Monday's discussion of digits)

Harald Tveit Alvestrand harald at alvestrand.no
Sun Nov 30 20:57:33 CET 2008


Eric Brunner-Williams wrote:
> Harald,
>
> Is the specification for the unstable string "L L AN CS R" or is it "L 
> L AN (followed by any character type)"?
>
> If the problem is L AN CS R, then that's one rule.
> If the problem is * L * AN *, or * AN * L *, that's another rule.
The specific string I tested was L L AN CS R. The way my test program 
works, I gave it "L L AN", and it found the string that would cause an 
issue. I can't answer the question of "all patterns that cause an 
issue", but "L AN CS R" also causes it.
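
As an illustration only (this is not the test program mentioned above), the following minimal sketch shows how a string can be mapped to the bidi class sequence used in this discussion. It assumes Python's standard unicodedata module; the helper name bidi_classes and the sample characters are hypothetical choices made for the example. It does not test for stability, it only labels characters.

```python
import unicodedata


def bidi_classes(text):
    """Return the Unicode bidi class of each character in text.

    Hypothetical helper for illustration; it only classifies characters
    and does not run the UAX#9 algorithm.
    """
    return [unicodedata.bidirectional(ch) for ch in text]


# Sample string chosen to give the pattern L L AN CS R:
# two Latin letters, ARABIC-INDIC DIGIT ONE, FULL STOP, HEBREW LETTER ALEF.
sample = "ab\u0661.\u05d0"
print(" ".join(bidi_classes(sample)))
```

A checker for the instability itself would additionally have to apply the UAX#9 reordering to the string in different contexts, which this sketch does not attempt.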
>
> Given some 63 characters between a CS-pair, are all sequences which 
> contain L and which contain AN unstable?
No. L AN L is stable. Do you have a use case for such a string? 
(Remember John Klensin's exhortation that our task is not to permit 
everything that can be permitted, but to make a useful set of 
identifiers available to as many languages as possible.)
>
> If the sequences between the adjacent CS-pairs to the sequence which 
> contains both L and AN, do not contain R, does the same answer hold?
I don't know, and don't think I should care, since the WG has gone out 
of its way to forbid inter-label tests.
>
> I'm also curious why the current -bidi contains Rule #4 in Section 2.  
> There's a bug we have to work around, I want to be sure I understand 
> the required size of the detour. Do you recall off-hand when it first 
> appeared?
Rule #4 was added when I discovered that there were unstable 
combinations of L and AN. Versions of the rule have appeared in every 
version since draft-alvestrand-idna-bidi-03, January 2008 (the one where 
the rules first became completely explicit).

I don't know what you mean by "a bug we have to work around"; I regard 
UAX#9, the Unicode BIDI algorithm, as "facts we have to work with", and 
the rule is a consequence of the requirement + those facts. As usual, I 
do exhort you to make one of two explicit statements:

- The requirements (now in section 3 of -bidi-03) are wrong, and here is 
some new suggested text for the requirements, and the rules they imply:
- The rules (now in section 2 of -bidi-03) can be changed without 
violating the requirements in section 3, and here's the new rule:

We've been at this too long to make handwaving statements like "there's 
a bug we have to work around". Either you think the requirements are 
wrong, you think the rules are wrong, or there are no changes needed. Be 
explicit.

                          Harald

