Mixing of AN and EN (Re: Protocol-08 (and status of Defs-04 and Rationale-06))

Erik van der Poel erikv at google.com
Tue Dec 16 04:24:36 CET 2008


Hi Martin,

Back when I was looking into this for IDNA2008, I looked into IRIs a
bit too. The only standard delimiter that would pose a problem is #
(U+0023): its bidi property is ET, which the IDNA2008 rules disallow.
The delimiters I looked into were :/@.;?=&# as in:

scheme://user:password@host.com:port/path.txt;params=abc?query=foo&bar=blah#fragment

Do you agree that # is the only problematic one? Or did you have other
reasons to believe that LTR is a MUST?

Of course, if a server uses other delimiters in its URIs, all bets are
off. E.g. $
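
For anyone who wants to repeat the check, here is a minimal sketch,
assuming Python's standard unicodedata module (the values come from
whichever Unicode version the interpreter ships with). The names
STANDARD_DELIMITERS, bidi_classes and delimiters_with_class are made up
for illustration. It only lists the Bidi_Class of each delimiter; it
does not implement the IDNA2008 bidi rules themselves.

```python
import unicodedata

# Delimiters taken from the example IRI above.
STANDARD_DELIMITERS = ":/@.;?=&#"


def bidi_classes(chars):
    """Map each character to its Unicode Bidi_Class short name (e.g. 'ET')."""
    return {ch: unicodedata.bidirectional(ch) for ch in chars}


def delimiters_with_class(chars, bidi_class):
    """Return the characters in `chars` whose Bidi_Class equals `bidi_class`."""
    return [ch for ch, cls in bidi_classes(chars).items() if cls == bidi_class]


if __name__ == "__main__":
    # Include "$" as an example of a non-standard delimiter a server might use.
    for ch, cls in bidi_classes(STANDARD_DELIMITERS + "$").items():
        print(f"U+{ord(ch):04X} {ch} {cls}")
    print("ET among the standard delimiters:",
          delimiters_with_class(STANDARD_DELIMITERS, "ET"))
```

Any other delimiter a server might use can be checked the same way by
passing it to bidi_classes.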

Erik

On Mon, Dec 15, 2008 at 6:18 PM, Martin Duerst <duerst at it.aoyama.ac.jp> wrote:
> At 04:45 08/12/16, Harald Alvestrand wrote:
>>Alireza Saleh wrote:
>>> Hi Erik,
>>>
>>>
>>> The latest news we received in this case is that Mark is going to look
>>> at it and will write a proposal and send it for public view. There is no
>>> exact time for that yet.
>>> Correction of this bug may not affect the -bidi rules directly, but it
>>> is related to the display of L AN characters in a paragraph. However,
>>> this effort may be continued by the UTC and application providers to
>>> improve the display of AN and AL characters. When I look at the recent
>>> improvements in displaying AN, AL, and R characters, I believe there
>>> will be no visual confusion when you have AN, AL, or R characters in
>>> an LTR context such as domains.
>>What is your reason to believe that domains are an LTR context?
>>
>>The idea that domain names may occur in free text has been a basic
>>assumption behind the bidi work. If they didn't, the document would be a
>>lot shorter.
>
> There is no assumption of LTR context for domain names. However,
> the IRI spec REQUIRES the equivalent of LTR context for IRIs.
> The MUST is probably too strong, because it's very difficult to
> guarantee in practice, but if you don't have that, there's no
> guarantee that an IRI containing components with LTR characters
> and components with RTL characters displays consistently.
>
> Regards,    Martin.
>
>
>
> #-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
> #-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst at it.aoyama.ac.jp
>
>

