Mixing of AN and EN (Re: Protocol-08 (and status of Defs-04 and Rationale-06))

Erik van der Poel erikv at google.com
Tue Dec 16 04:27:28 CET 2008


Oh, and % of course. This is also ET and therefore problematic.

Erik
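
For reference, the bidi classes of the delimiters discussed in this thread can be checked with a short script. This is only a minimal sketch using Python's standard unicodedata module; the helper name bidi_classes is made up for illustration, and the values come from whatever Unicode version ships with the interpreter, which is not necessarily the one that was current in 2008.

```python
import unicodedata

# Delimiters from the example IRI quoted in this thread,
# plus % and $, which are mentioned separately.
DELIMITERS = ":/@.;?=&#%$"


def bidi_classes(chars):
    """Map each character to its Unicode Bidi_Class short name (hypothetical helper)."""
    return {ch: unicodedata.bidirectional(ch) for ch in chars}


if __name__ == "__main__":
    for ch, cls in bidi_classes(DELIMITERS).items():
        # Flag ET (European Terminator), the class this thread treats as problematic.
        flag = "  <-- ET" if cls == "ET" else ""
        print(f"U+{ord(ch):04X} {ch} {cls}{flag}")
```

With current Unicode data, #, % and $ come out as ET, while the other delimiters in the list are CS or ON.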

On Mon, Dec 15, 2008 at 7:24 PM, Erik van der Poel <erikv at google.com> wrote:
> Hi Martin,
>
> Back when I was looking into this for IDNA2008, I looked into IRIs a
> bit too. The only standard delimiter that would pose a problem is #
> (U+0023), which has bidi property ET, which is disallowed in IDNA2008
> rules. The delimiters I looked into were :/@.;?=&# as in:
>
> scheme://user:password@host.com:port/path.txt;params=abc?query=foo&bar=blah#fragment
>
> Do you agree that # is the only problematic one? Or did you have other
> reasons to believe that LTR is a MUST?
>
> Of course, if a server uses other delimiters in its URIs, all bets are
> off. E.g. $
>
> Erik
>
> On Mon, Dec 15, 2008 at 6:18 PM, Martin Duerst <duerst at it.aoyama.ac.jp> wrote:
>> At 04:45 08/12/16, Harald Alvestrand wrote:
>>>Alireza Saleh wrote:
>>>> Hi Erik,
>>>>
>>>>
>>>> The latest news we received in this case is that Mark is going to look
>>>> at it and will write a proposal and send it for public view. There is no
>>>> exact time for that yet.
>>>> Correction of this bug may not affect the -bidi rules directly, but it
>>>> is related to the display of L AN characters in a paragraph. However,
>>>> this effort may be continued by the UTC and application providers to
>>>> improve the display of AN and AL characters. When I look at the recent
>>>> improvements in displaying AN, AL, and R characters, I believe there
>>>> will be no visual confusion when you have AN, AL, or R characters in an
>>>> LTR context such as domains.
>>>What is your reason to believe that domains are an LTR context?
>>>
>>>The idea that domain names may occur in free text has been a basic
>>>assumption behind the bidi work. If they didn't, the document would be a
>>>lot shorter.
>>
>> There is no assumption of LTR context for domain names. However,
>> the IRI spec REQUIRES the equivalent of LTR context for IRIs.
>> The MUST is probably too strong, because it's very difficult to
>> guarantee in practice, but if you don't have that, there's no
>> guarantee that an IRI containing components with LTR characters
>> and components with RTL characters displays consistently.
>>
>> Regards,    Martin.
>>
>>
>>
>> #-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
>> #-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst at it.aoyama.ac.jp
>>
>>
>

