Comments on IDNA Bidi

Mark Davis mark.davis at icu-project.org
Thu Jan 17 03:15:25 CET 2008


A quick comment

> 1. This isn't a problem if the left-to-right labels are
> ASCII labels, because bc=ET (most notably, "#", "%", and "$")
> aren't allowed in domain names, anyway.


That holds for domain names alone, but in a URL you have gorp surrounding the
host name, e.g.

http://abc.def?ghi....
http://abc.def#ghi....
http://abc.def/ghi....
...
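
To make the point concrete, here is a minimal sketch (not part of the original
message) that reports which character directly follows the host in each of the
URLs above, together with its Unicode bidi class. The helper name
char_after_host is hypothetical; it uses only Python's standard urllib.parse
and unicodedata modules.

```python
import unicodedata
from urllib.parse import urlsplit


def char_after_host(url):
    """Return (host, next_char, bidi_class) for the character after the host.

    Hypothetical helper for illustration only. Assumes the URL has the
    form scheme://host... with no userinfo or port.
    """
    host = urlsplit(url).netloc
    end = url.index("//") + 2 + len(host)
    next_char = url[end:end + 1]
    bidi_class = unicodedata.bidirectional(next_char) if next_char else ""
    return host, next_char, bidi_class


for url in ("http://abc.def?ghi", "http://abc.def#ghi", "http://abc.def/ghi"):
    print(char_after_host(url))
```

So even if "#" never occurs inside a label, a bc=ET or bc=CS character can
still sit right next to the last label once the host name is embedded in a URL.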


> So I'd suggest trying those steps:
>
> 1. Prohibit bc=ET and bc=CS totally in labels.
> 2. Prohibit bc=AN in LCat labels, and only allow them in
>   RCat labels, which would then be further constrained,
>   because they then could not start or terminate those
>   labels.
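
The two steps above can be sketched as a per-label check. This is only one
reading of the proposal, under stated assumptions: a label counts as "RCat" if
it contains any character with bc=R or bc=AL, and as "LCat" otherwise, and the
function name check_label is hypothetical.

```python
import unicodedata

RTL_CLASSES = {"R", "AL"}  # assumption: these classes make a label "RCat"


def is_rcat_label(label):
    """True if the label contains at least one bc=R or bc=AL character."""
    return any(unicodedata.bidirectional(c) in RTL_CLASSES for c in label)


def check_label(label):
    """Return a list of rule violations; an empty list means the label passes."""
    classes = [unicodedata.bidirectional(c) for c in label]
    problems = []
    # Step 1: no bc=ET or bc=CS anywhere in a label.
    if any(bc in ("ET", "CS") for bc in classes):
        problems.append("contains bc=ET or bc=CS")
    # Step 2: bc=AN only inside RCat labels, and not at either end.
    if "AN" in classes:
        if not is_rcat_label(label):
            problems.append("bc=AN in an LCat label")
        elif classes[0] == "AN" or classes[-1] == "AN":
            problems.append("bc=AN at the start or end of an RCat label")
    return problems
```

For example, check_label("a$b") reports the bc=ET violation, while a Hebrew
label with an Arabic-Indic digit in the middle passes both steps.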


These steps look reasonable.  One additional possibility is to place a
restriction on the entire host name: if there are any BIDI characters
anywhere in the host name, then certain restrictions could apply even to
all-ASCII labels. I mention that only as a possibility; Harald can say whether
it is realistic.
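
A rough sketch of how such a host-wide trigger might look. It assumes "BIDI
characters" means characters with bc=R, bc=AL, or bc=AN, and it only
identifies which all-ASCII labels would come under the extra restrictions; the
restrictions themselves are left open, as they are in this message. Both
function names are hypothetical.

```python
import unicodedata


def host_has_bidi(host):
    """True if any character in the host name has bc=R, bc=AL, or bc=AN."""
    return any(unicodedata.bidirectional(c) in ("R", "AL", "AN") for c in host)


def ascii_labels_to_restrict(host):
    """Return the all-ASCII labels that a host-wide rule would also cover.

    If the host contains no right-to-left characters, nothing extra applies.
    """
    if not host_has_bidi(host):
        return []
    return [label for label in host.split(".") if label.isascii()]
```

With a host such as "abc.<Hebrew label>.com", both "abc" and "com" would be
returned; with "abc.def" the list is empty.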


>
>
> Then see how many specific patterns remain in your test,
> given those constraints.
>
> --Ken



-- 
Mark