Mixing of AN and EN (Re: Protocol-08 (and status of Defs-04 and Rationale-06))
Harald Tveit Alvestrand
harald at alvestrand.no
Tue Dec 16 07:16:11 CET 2008
Mark Davis wrote:
> It is hard to tell from your code, since it depends on what the
> evaluation of some of the subfunctions would yield.
>
> I'll instead list four test cases that you can check. I hasten to
> add that I haven't absolutely checked these yet.
> AN N AN → AN R AN
> R AN N EN → R AN R EN
> R EN N AN → R EN R AN
>
> R EN N EN → R EN R EN
>
> That is, the N in each of these cases would change to R.
>
> The first two rules are when you have a neutral between two
> Arabic-Indic digits. For example, if you have U+0668 + "!?" + U+0669,
> then the display ordering of those three should be
> U+0669 ! ? U+0668.
> That is, ٨ followed by ! followed by ٩ should appear from right to
> left. In my emailer this works. From left to right I see the Arabic 9,
> then ? then !, then the Arabic 8.
>
> ٨!?٩
>
> The latter two rules are in effect if an EN remains in the text, e.g. if
> an English number follows an Arabic letter and W7 has not been invoked.
> Then a ج followed by 8 followed by ! followed by 9 should all
> appear RTL.
>
> ج8!?9
>
> That case fails in my emailer.
>
> *Background:*
>
> Here is the text: http://unicode.org/reports/tr9/#N1
>
> The issue is that the text says that AN and EN act like R, and then
> has a set of rules. Those rules don't explicitly list all of the
> combinations of R, EN, AN on both sides of an N. That would add 4
> more rules, those added in yellow below.
>
> L N L → L L L
> R N R → R R R
> R N AN → R R AN
> R N EN → R R EN
> AN N R → AN R R
> AN N AN → AN R AN
>
> AN N EN → AN R EN
> EN N R → EN R R
> EN N AN → EN R AN
>
> EN N EN → EN R EN
>
>
> If someone interpreted the rules as being complete, then they would
> neglect to change neutrals into R in those 4 cases.
I interpreted the 4 examples given as test cases I could verify against,
and implemented what the text says. So my code "passed" - that is, all
of those strings were displayed right-to-left, even when the embedding
direction was L.
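
For anyone who wants to check the same cases, here is a minimal sketch of
that reading of the text. It is not my actual code. It assumes the weak
types have already been resolved, that the input is a plain list of the
class names L, R, AN, EN and N, and that the start and end of the run take
the embedding direction. The function name resolve_neutrals and its
parameters are made up for this illustration.

```python
def resolve_neutrals(types, embedding="L"):
    """Resolve neutrals (N) in a list of bidi classes, simplified N1/N2.

    Assumption: `types` contains only "L", "R", "AN", "EN" and "N", and
    the run boundaries take the embedding direction.
    """
    def direction(t):
        # Per the text of N1, AN and EN act like R when resolving neutrals.
        return "R" if t in ("R", "AN", "EN") else t

    out = list(types)
    n = len(out)
    i = 0
    while i < n:
        if out[i] != "N":
            i += 1
            continue
        # Find the end of this maximal run of neutrals.
        j = i
        while j < n and out[j] == "N":
            j += 1
        before = direction(out[i - 1]) if i > 0 else embedding
        after = direction(out[j]) if j < n else embedding
        # N1: same direction on both sides wins; N2: otherwise embedding.
        resolved = before if before == after else embedding
        for k in range(i, j):
            out[k] = resolved
        i = j
    return out


# The four test cases from the quoted message, with embedding direction L.
for case in (["AN", "N", "AN"], ["R", "AN", "N", "EN"],
             ["R", "EN", "N", "AN"], ["R", "EN", "N", "EN"]):
    print(" ".join(case), "->", " ".join(resolve_neutrals(case, "L")))
```

Because direction() maps AN and EN to R before the comparison, the four
extra combinations need no separate table entries in this reading.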
Harald
More information about the Idna-update
mailing list