Comments on IDNA Bidi
mark.davis at icu-project.org
Thu Jan 17 03:15:25 CET 2008
A quick comment
> 1. This isn't a problem if the left-to-right labels are
> ASCII labels, because bc=ET (most notably, "#", "%", and "$")
> aren't allowed in domain names, anyway.
In domain names alone, but in a URL you have gorp surrounding the host.
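
To make that concrete, here is a minimal sketch that lists which characters
in a URL carry bc=ET or bc=CS. It uses Python's standard unicodedata module;
the URL itself is a made-up example, and the helper name bidi_classes is
only for illustration.

```python
import unicodedata

def bidi_classes(text):
    """Return (character, bidi class) pairs for each character in text."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]

# Hypothetical URL; the point is the delimiters around the host name.
url = "http://example.com/a%20b#frag"

# Keep only the characters whose bidi class is ET or CS.
surrounding = {ch: bc for ch, bc in bidi_classes(url) if bc in ("ET", "CS")}
print(surrounding)
```

So even if labels never contain these characters, the text around the host
in a URL can.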
> So I'd suggest trying those steps:
> 1. Prohibit bc=ET and bc=CS totally in labels.
> 2. Prohibit bc=AN in LCat labels, and only allow them in
> RCat labels, which would then be further constrained,
> because they then could not start or terminate those
These steps look reasonable. One additional possibility is to place a
restriction on the entire host name: if the host name contains any bidi
characters, then certain restrictions could apply even to all-ASCII labels.
I mention that only as a possibility; Harald can say whether it is
realistic.
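
As a rough sketch of what steps 1 and 2 might look like as a check, using
Python's standard unicodedata module. The definitions here are assumptions
for illustration, not taken from any draft: a label is treated as RCat if it
contains any R or AL character, and "bidi characters" means R, AL, or AN. The
further start/end constraint on RCat labels is not implemented, and
host_has_bidi only detects the condition, since the host-wide restrictions
themselves are still open.

```python
import unicodedata

def classes(label):
    """Set of bidi classes occurring in a label."""
    return {unicodedata.bidirectional(ch) for ch in label}

def is_rcat(label):
    # Assumption: a label counts as RCat if it has any R or AL character.
    return bool(classes(label) & {"R", "AL"})

def label_ok(label):
    bcs = classes(label)
    # Step 1: prohibit bc=ET and bc=CS totally in labels.
    if bcs & {"ET", "CS"}:
        return False
    # Step 2: allow bc=AN only in RCat labels.
    if "AN" in bcs and not is_rcat(label):
        return False
    return True

def host_has_bidi(host):
    # Assumption: "bidi characters" means bc in {R, AL, AN}.
    # True means host-wide restrictions could also apply to ASCII labels.
    return any(classes(label) & {"R", "AL", "AN"} for label in host.split("."))

print(label_ok("a$b"))                 # "$" is bc=ET
print(label_ok("abc\u0661"))           # U+0661 is bc=AN in an LCat label
print(label_ok("\u05d0\u0661\u05d1"))  # the same digit between Hebrew letters
```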
> Then see how many specific patterns remain in your test,
> given those constraints.