bidi spec

Erik van der Poel erikv at google.com
Thu Feb 7 20:04:13 CET 2008


On Feb 7, 2008 12:09 AM, Harald Alvestrand <harald at alvestrand.no> wrote:
> Mark Davis wrote:
> >    2. The only characters allowed in NSM are [:Mn:] and [:Me:]. The
> >       protocol forbids [:Me:] entirely, and forbids [:Mc:] in first
> >       position. So #6 is redundant. If it is retained, we should at
> >       least have a note about that.
>
> I'm not aware of a Unicode stability guarantee that guarantees the
> property stated above, and it is in fact false, you have 10 characters
> in class "Me" that are NSM too. (These aren't allowed in labels either,
> according to tables-03, btw - but I have no idea what an ARABIC START OF
> RUB EL HIZB is, or whether someone will step up tomorrow and demand an
> exception for it.)
>
> Since the rest of the document is stated only in terms of BIDI
> properties, I'd like to keep it stated in terms of BIDI properties. The
> current formulation has all requirements that derive from the BIDI
> properties in this document, and makes no assumptions on what the other
> documents say; that's a Good Thing in my opinion.

I agree that the IDNA200X bidi spec should be in terms of the bidi
properties. The premise of the spec is that domain names are displayed
using the Unicode bidi algorithm. If it is true that many
implementations use that algorithm, then the premise is sound.

If the IDNA bidi spec is *not* based on the bidi properties, it
becomes very difficult (if not impossible) to reason about the spec.
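
As a rough illustration of how the bidi class and the general category
can be compared directly, here is a minimal sketch. It assumes Python's
standard unicodedata module, which reflects whatever Unicode version the
interpreter ships with (not necessarily the version under discussion
here), and the function name nsm_general_categories is made up for this
example. It tallies the general categories of every code point whose
bidi class is NSM.

```python
import sys
import unicodedata
from collections import Counter


def nsm_general_categories():
    """Count the general categories of all code points with bidi class NSM.

    The result depends on the Unicode data bundled with the running
    Python interpreter (see unicodedata.unidata_version).
    """
    counts = Counter()
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        if unicodedata.bidirectional(ch) == "NSM":
            counts[unicodedata.category(ch)] += 1
    return counts


if __name__ == "__main__":
    print("Unicode version:", unicodedata.unidata_version)
    for category, count in sorted(nsm_general_categories().items()):
        print(category, count)
```

The categories that show up can differ from one Unicode version to the
next, which is why the stability question matters.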

This does of course raise the question of the stability guarantee of
the Unicode bidi properties. Is there a guarantee or a plan to
establish one?

Erik

