Request for updated example highlighting problem of mixing of AN and EN
james.mitchell at ausregistry.com.au
Wed Aug 19 02:49:59 CEST 2009
With that example I get..
Bidi Class: CS R EN AN ES EN CS
Resolved: e R EN AN ON EN e
Level: 1 2 2 1 2
Display: ON EN ES EN AN R ON
My concern is that the online tool provided by Unicode
rearranged the characters into something other than your example. This label
appears fine to me; I believe it satisfies the label uniqueness test,
remembering a label cannot begin with an EN. Note that switching the R for
an AL yields the same display order.
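
For anyone who wants to reproduce the Bidi Class row above, here is a minimal Python sketch using the standard unicodedata module. The concrete code points are my assumption: U+05D0 (Hebrew alef) stands in for the R character and U+0661 (Arabic-Indic digit one) for <arabic 1>; the original example does not name them. It only looks up the classes and does not run the resolution or reordering steps.

```python
import unicodedata

# Assumed code points for ".<alef><latin 1><arabic 1>-<latin 1>."
# U+05D0 (Hebrew alef, class R) and U+0661 (Arabic-Indic digit one, class AN)
# are illustrative choices, not taken from the original example.
example = ".\u05d01\u0661-1."

def bidi_classes(text):
    """Return the Unicode bidi class of each character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]

classes = bidi_classes(example)
print(" ".join(classes))  # CS R EN AN ES EN CS
```

Replacing "\u05d0" with "\u0627" (Arabic alef) gives AL in place of R in the second position.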
I do not understand how rule N1 was unclear, but everyone is different. I
was not aware of inconsistency in this case among applications; however, I believe
this is a moot point: we should not be designing this protocol to work around
problems in applications.
So what is the issue here?
> -----Original Message-----
> From: Alireza Saleh [mailto:saleh at nic.ir]
> Sent: Tuesday, 18 August 2009 7:54 PM
> To: James Mitchell
> Cc: idna-update at alvestrand.no
> Subject: Re: Request for updated example highlighting problem of mixing of AN
> and EN
> There is at least one example which has been sent by Harald that is, "
> CS R EN AN ES EN CS (.<alef><latin 1><arabic 1>-<latin 1>.) will
> rearrange into the same sequence as CS R EN ES EN AN CS (.<alef><latin
> 1>-<latin 1><arabic 1>.) "
> The specification of rule N1 of UAX#9 is not so clear, and this
> causes some inconsistency among the different applications
> implementing this rule. This has been reported to Unicode, and at that
> time I believed that, with a careful interpretation of the N1 rule and
> some clarifying examples, there would be nothing to worry about in mixing
> AN and EN. I think the current change draft of UAX#9 is trying to fix the
> bug according to the implementations rather than by interpreting the text
> correctly. However, we can implement the W2 rule of UAX#9, which says:
> ' W2. Search backward from each instance of a European number until the
> first strong type (R, L, AL, or sor) is found. If an AL is found, change
> the type of the European number to Arabic number.' Or, simply, we can say
> that with no R in the bidi label we can mix AN and EN.
> The UAX#31 has been implemented for using ZWNJ in Arabic-Script.
> Thank you, Erik!
> James Mitchell wrote:
> > The only concrete example I have found that justifies the prohibition of
> > mixing AN and EN is CS EN AN CS R in an LTR context.
> > [http://www.alvestrand.no/pipermail/idna-update/2008-January/000858.html]
> > Under the current bidi rules, plus changes from a subsequent email from Mark,
> > an AN will require the label to be treated as RTL
> > [http://www.alvestrand.no/pipermail/idna-update/2009-August/005153.html].
> > Therefore, a label mixing AN and EN will be treated as an RTL label. The
> > above example (EN AN) will violate the first bidi rule, that a label must
> > begin with L, R or AL.
> > Is there a concrete example that is otherwise IDNA-valid?
> > From my understanding of the bidirectional algorithm and the current bidi
> > rules, there is no otherwise covered case where mixing AN and EN leads to a
> > label that violates the requirements (as distinct from the rules) of bidi.
> > As stated earlier, a label containing an AN is an RTL label. An RTL label must
> > start with an AL or R (rule 1) and must contain only R, AL, AN, EN, ES, CS,
> > ET, ON, BN or NSM. Note that the only strong characters in this label are
> > AL and R (L is not allowed and sor is excluded because the first character
> > must be AL or R). Given that, no EN can resolve to an L
> > [http://unicode.org/reports/tr9/#W7], therefore all AN and EN will resolve
> > to the same levels.
> > Or perhaps I am missing something?
> > James Mitchell
> > _______________________________________________
> > Idna-update mailing list
> > Idna-update at alvestrand.no
> > http://www.alvestrand.no/mailman/listinfo/idna-update
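
The argument in the quoted message can be sketched in a few lines of Python. This is only an illustration of the two conditions named there (an RTL label starts with R or AL, and contains only the listed classes); the function and constant names are hypothetical, and the sketch is not a full implementation of the bidi rules.

```python
# Classes permitted in an RTL label, as listed in the quoted message.
RTL_ALLOWED = {"R", "AL", "AN", "EN", "ES", "CS", "ET", "ON", "BN", "NSM"}

def is_rtl_label(classes):
    """Treat a label as RTL if it contains any R, AL or AN character."""
    return any(c in {"R", "AL", "AN"} for c in classes)

def rtl_label_ok(classes):
    """Check the two conditions from the message: the first character is
    R or AL, and every character is in the allowed set (so no L)."""
    return (
        bool(classes)
        and classes[0] in {"R", "AL"}
        and all(c in RTL_ALLOWED for c in classes)
    )

# The label from "CS EN AN CS R": it contains an AN, so it is RTL,
# but it begins with EN and fails the first condition.
print(is_rtl_label(["EN", "AN"]), rtl_label_ok(["EN", "AN"]))  # True False

# The label from "CS R EN AN ES EN CS" passes both conditions.
print(rtl_label_ok(["R", "EN", "AN", "ES", "EN"]))  # True
```

Because L is not in the allowed set, any label that passes has R or AL as its only strong characters, which is the premise for the claim that no EN resolves to L.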