Difference between EN and AN (Re: Follow-up to Monday's discussion of digits)

Eric Brunner-Williams ebw at abenaki.wabanaki.net
Sun Nov 30 17:23:15 CET 2008


Harald,

Is the specification for the unstable string "L L AN CS R" or is it "L L 
AN (followed by any character type)"?

If the problem is L AN CS R, then that's one rule.
If the problem is * L * AN *, or * AN * L *, that's another rule.

Given some 63 characters between a CS-pair, are all sequences which 
contain L and which contain AN unstable?

If the sequences between the CS-pairs adjacent to the sequence that 
contains both L and AN do not contain R, does the same answer hold?

I'm also curious why the current -bidi contains Rule #4 in Section 2.
There's a bug we have to work around, and I want to be sure I understand the 
required size of the detour. Do you recall off-hand when it first appeared?

Eric

Harald Alvestrand wrote:
> According to my stability calculator:
>
> the string "L L AN" is unstable according to -bidi, because in the 
> context CS L L AN CS R (left-to-right context), it resequences in the 
> order 1 2 3 6 5 4, or CS L L R CS AN.
>
> Thus, the label "ab<arabic 1>", followed by a label containing "<arabic 
> letter>", will exhibit surprising behaviour - the number will jump over 
> the dot.
>
> the string "L L EN" exhibits no such behaviour.
>
> I think that's the reason why -bidi currently prohibits mixing L with AN.
>
>                       Harald
>
> _______________________________________________
> Idna-update mailing list
> Idna-update at alvestrand.no
> http://www.alvestrand.no/mailman/listinfo/idna-update
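
The resequencing quoted above (CS L L AN CS R displaying as 1 2 3 6 5 4, 
while the EN variant stays put) can be reproduced with a small sketch. This 
is not Harald's stability calculator; it is a hand-written subset of the 
Unicode Bidirectional Algorithm (rules W4, W6, W7, N1, N2, I1 and L2 only), 
assuming a left-to-right paragraph, no explicit embeddings, and only the 
classes L, R, AN, EN and CS. The function names resolve_levels and 
visual_order are made up for this illustration.

```python
def resolve_levels(types):
    """Resolve embedding levels for a list of bidi classes.

    Simplified sketch: paragraph level 0 (left-to-right), and only the
    classes L, R, AN, EN, CS are supported.
    """
    t = list(types)
    n = len(t)

    # W4: a single CS between two numbers of the same type takes that type.
    for i in range(1, n - 1):
        if t[i] == "CS" and t[i - 1] == t[i + 1] and t[i - 1] in ("EN", "AN"):
            t[i] = t[i - 1]

    # W6: any remaining separator becomes an other-neutral (ON).
    t = ["ON" if c == "CS" else c for c in t]

    # W7: EN becomes L if the nearest preceding strong type is L.
    last_strong = "L"  # start-of-run direction at paragraph level 0
    for i, c in enumerate(t):
        if c in ("L", "R"):
            last_strong = c
        elif c == "EN" and last_strong == "L":
            t[i] = "L"

    def direction(c):
        # For N1, numbers (AN, EN) count as R.
        return "L" if c == "L" else "R"

    # N1/N2: a run of neutrals takes the surrounding direction if both
    # sides agree, otherwise the embedding direction (L here).
    i = 0
    while i < n:
        if t[i] != "ON":
            i += 1
            continue
        j = i
        while j < n and t[j] == "ON":
            j += 1
        before = direction(t[i - 1]) if i > 0 else "L"
        after = direction(t[j]) if j < n else "L"
        fill = before if before == after else "L"
        for k in range(i, j):
            t[k] = fill
        i = j

    # I1 (even embedding level 0): R goes up one level, numbers go up two.
    return [{"L": 0, "R": 1, "AN": 2, "EN": 2}[c] for c in t]


def visual_order(types):
    """Return the 1-based logical positions in left-to-right display order."""
    levels = resolve_levels(types)
    n = len(levels)
    order = list(range(1, n + 1))
    # L2: from the highest level down to level 1, reverse every contiguous
    # run of characters whose level is at least the current level.
    for level in range(max(levels, default=0), 0, -1):
        i = 0
        while i < n:
            if levels[i] < level:
                i += 1
                continue
            j = i
            while j < n and levels[j] >= level:
                j += 1
            order[i:j] = order[i:j][::-1]
            i = j
    return order


print(visual_order(["CS", "L", "L", "AN", "CS", "R"]))  # [1, 2, 3, 6, 5, 4]
print(visual_order(["CS", "L", "L", "EN", "CS", "R"]))  # [1, 2, 3, 4, 5, 6]
```

In the AN case the number keeps level 2 and the following CS is pulled to 
level 1 by the R on its far side, so positions 4 to 6 are reversed as a 
group. In the EN case rule W7 turns the digit into L before the separator is 
resolved, so nothing to the left of the R moves.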