Comments on IDNA Bidi

Mark Davis mark.davis at
Thu Jan 17 03:15:25 CET 2008

A quick comment

> 1. This isn't a problem if the left-to-right labels are
> ASCII labels, because bc=ET (most notably, "#", "%", and "$")
> aren't allowed in domain names, anyway.

That holds for domain names alone, but in a URL you have gorp surrounding the
host name, e.g.


> So I'd suggest trying those steps:
> 1. Prohibit bc=ET and bc=CS totally in labels.
> 2. Prohibit bc=AN in LCat labels, and only allow them in
>   RCat labels, which would then be further constrained,
>   because they then could not start or terminate those
>   labels.

These steps look reasonable.  One additional possibility is that we could
place a restriction on the entire host name; that is, if there are any BIDI
characters in the host name, then certain restrictions could apply even to
all-ASCII labels. I mention that as a possibility; Harald can say whether that
is realistic or not.
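
To make the proposal concrete, here is a rough Python sketch of Ken's two steps
plus the host-wide idea. It is only an illustration under stated assumptions,
not text from any draft: I treat a label as an RCat label if it contains any
character with bc=R or bc=AL, and as an LCat label otherwise; the names
label_ok, host_ok and ascii_rule are hypothetical; and since the restrictions
on all-ASCII labels are not yet specified, ascii_rule is just a placeholder
the caller supplies.

```python
import unicodedata


def bidi_classes(label):
    """Return the Unicode Bidi_Class (bc) of each character in the label."""
    return [unicodedata.bidirectional(ch) for ch in label]


def is_rcat_label(label):
    # Assumption: a label counts as RCat if it has any bc=R or bc=AL character.
    return any(bc in ("R", "AL") for bc in bidi_classes(label))


def label_ok(label):
    """Apply the two proposed steps to a single label."""
    classes = bidi_classes(label)
    if not classes:
        return False
    # Step 1: prohibit bc=ET and bc=CS totally in labels.
    if any(bc in ("ET", "CS") for bc in classes):
        return False
    # Step 2: bc=AN only in RCat labels, and never at the start or end.
    if "AN" in classes:
        if not is_rcat_label(label):
            return False
        if classes[0] == "AN" or classes[-1] == "AN":
            return False
    return True


def host_ok(host, ascii_rule=None):
    """Check every label; optionally apply a host-wide rule.

    ascii_rule is a hypothetical placeholder: a predicate applied to
    all-ASCII labels only when some label in the host name is RCat.
    """
    labels = host.split(".")
    if not all(label_ok(label) for label in labels):
        return False
    if ascii_rule is not None and any(is_rcat_label(l) for l in labels):
        return all(ascii_rule(l) for l in labels if l.isascii())
    return True
```

For example, label_ok("a$b") fails on step 1 because "$" is bc=ET, and a label
such as U+0627 U+0661 U+0627 (AL, AN, AL) passes step 2, while U+0661 U+0627
fails because it starts with bc=AN.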

> Then see how many specific patterns remain in your test,
> given those constraints.
> --Ken
