Request for updated example highlighting problem of mixing of AN and EN

Alireza Saleh saleh at
Wed Aug 19 13:26:42 CEST 2009

Look at this


James Mitchell wrote:
> With that example I get..
> Bidi Class: CS  R EN AN ES EN CS
> Resolved:    e  R EN AN ON EN  e
> Level:          1  2  2  1  2
> Display:    ON EN ES EN AN  R ON
> My concern is that the online tool provided by Unicode
> []
> rearranged the characters to something other than your example.  This label
> appears fine to me; I believe it satisfies the label uniqueness test,
> remembering a label cannot begin with an EN.  Note that switching the R for
> an AL yields the same display order.
> I do not understand how rule N1 was unclear, but everyone is different.  I
> was not aware of inconsistency among applications in this case; however, I
> believe this is a moot point: we should not be designing this protocol to
> work around problems in applications.
> So what is the issue here?
> James
>> -----Original Message-----
>> From: Alireza Saleh [mailto:saleh at]
>> Sent: Tuesday, 18 August 2009 7:54 PM
>> To: James Mitchell
>> Cc: idna-update at
>> Subject: Re: Request for updated example highlighting problem of mixing of AN
>> and EN
>> There is at least one example which has been sent by Harald that is, "
>> CS R EN AN ES EN CS (.<alef><latin 1><arabic 1>-<latin 1>.) will
>> rearrange into the same sequence as CS R EN ES EN AN CS (.<alef><latin
>> 1>-<latin 1><arabic 1>.) "
>> The specification of rule N1 of UAX#9 is not very clear, and this causes
>> some inconsistency among the applications implementing the rule. This
>> has been reported to Unicode. At that time I believed that, with a
>> careful interpretation of N1 and some clarifying examples, there would
>> be nothing to worry about in mixing AN and EN. I think the current
>> change draft of UAX#9 is trying to fix the bug to match the
>> implementations rather than by interpreting the text correctly. However,
>> we can implement rule W2 of UAX#9, which says:
>> 'W2. Search backward from each instance of a European number until the
>> first strong type (R, L, AL, or sor) is found. If an AL is found, change
>> the type of the European number to Arabic number.' Or, more simply, we
>> can say that if there is no R in the bidi label, we can mix AN and EN.
>> The UAX#31 has been implemented for using ZWNJ in Arabic-Script.
>> Alireza
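
The W2 rule quoted above can be illustrated with a minimal sketch. It works on a list of bidi class names rather than on characters, and the function name apply_w2 and its sor parameter are hypothetical, chosen only for this illustration. It covers W2 alone, not the rest of UAX#9.

```python
# Minimal sketch of UAX#9 rule W2 only; names here are illustrative assumptions.
STRONG_TYPES = {"L", "R", "AL"}

def apply_w2(classes, sor="L"):
    """Return a copy of `classes` where each EN whose nearest preceding
    strong type is AL has been changed to AN. `sor` stands in for the
    start-of-run type used when no strong character precedes the EN."""
    resolved = list(classes)
    last_strong = sor
    for i, cls in enumerate(classes):
        if cls in STRONG_TYPES:
            last_strong = cls          # remember the most recent strong type
        elif cls == "EN" and last_strong == "AL":
            resolved[i] = "AN"         # W2: EN after AL becomes AN
    return resolved

# Harald's sequence with R: the ENs are left alone.
with_r = apply_w2(["CS", "R", "EN", "AN", "ES", "EN", "CS"])
# The same sequence with AL in place of R: every EN becomes AN.
with_al = apply_w2(["CS", "AL", "EN", "AN", "ES", "EN", "CS"])
```

With R as the only strong character the EN and AN stay distinct types, whereas with AL the sequence contains only AN after W2, which is one way to read the remark about labels with no R.
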
>> Thank you, Erik!
>> James Mitchell wrote:
>>> The only concrete example I have found that justifies the prohibition of
>>> mixing AN and EN is CS EN AN CS R in an LTR context.
>>> []
>>> Under the current bidi rules, plus changes from a subsequent email from
>>> Mark, an AN will require the label to be treated as RTL
>>> [].
>>> Therefore, a label mixing AN and EN will be treated as an RTL label.  The
>>> above example (EN AN) will violate the first bidi rule, that a label must
>>> begin with L, R or AL.
>>> Is there a concrete example that is otherwise IDNA-valid?
>>> From my understanding of the bidirectional algorithm and the current bidi
>>> rules, there is no otherwise covered case where mixing AN and EN leads to a
>>> label that violates the requirements (as distinct from the rules) of bidi.
>>> As stated earlier, a label containing an AN is an RTL label.  An RTL label must
>>> start with an AL or R (rule 1) and must contain only R, AL, AN, EN, ES, CS,
>>> ET, ON, BN or NSM.  Note that the only strong characters in this label are
>>> AL and R (L is not allowed and sor is excluded because the first character
>>> must be AL or R).  Given that, no EN can resolve to an L
>>> [], therefore all AN and EN will resolve
>>> to the same levels.
>>> Or perhaps I am missing something?
>>> James Mitchell
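
The two conditions stated in the argument above (an RTL label starts with AL or R, and contains only the listed classes) can be checked with a short sketch using Python's standard unicodedata module. The function names are hypothetical, and this covers only the conditions quoted in this message, not the complete IDNA bidi rule.

```python
import unicodedata

# Classes the quoted message allows inside an RTL label.
RTL_ALLOWED = {"R", "AL", "AN", "EN", "ES", "CS", "ET", "ON", "BN", "NSM"}

def bidi_classes(label):
    """Bidi class of each character, as reported by the Unicode database."""
    return [unicodedata.bidirectional(ch) for ch in label]

def is_rtl_label(label):
    """Assumption from the thread: any R, AL or AN makes the label RTL."""
    return any(c in {"R", "AL", "AN"} for c in bidi_classes(label))

def meets_quoted_rtl_conditions(label):
    """First character is R or AL, and every character is in RTL_ALLOWED."""
    classes = bidi_classes(label)
    return (bool(classes)
            and classes[0] in {"R", "AL"}
            and all(c in RTL_ALLOWED for c in classes))

# <hebrew alef><latin 1><arabic-indic 1>-<latin 1>: classes R EN AN ES EN
mixed = "\u05d01\u0661-1"
# <latin 1><arabic-indic 1>: classes EN AN, an RTL label that starts with EN
bad_start = "1\u0661"
```

Here mixed passes both conditions even though it mixes AN and EN, while bad_start is RTL (it contains an AN) but fails because it begins with an EN.
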
>>> _______________________________________________
>>> Idna-update mailing list
>>> Idna-update at
