BIDI rules

Harald Alvestrand harald at alvestrand.no
Fri Sep 5 11:00:31 CEST 2008


Alireza Saleh wrote:
>
> I think your client for viewing email did something to my original
> email; I'm attaching images of my original examples.
Thank you - I now understand the issue!
>>>
>>> After the label checks in IDNA2008 there are many unfixed and known
>>> issues that remain to be done somewhere else, such as at the
>>> application level or at the registry. For example the Registry should
>>> also apply more restrictive rules during the registration to make
>>> their TLD safe but this will not assure safety beyond the second
>>> level. Here applications will be expected to take on the safety
>>> problems.
>>>
>>>
>>> After the introduction of IDNA, most application developers have been
>>> thinking about secure ways to make sure users will see the correct
>>> domain. Some applications may also change direction from LTR to RTL
>>> based on what they detect from the domain's direction. In that case
>>> it would be no risk to have a U-label that starts with numbers or
>>> contains only numbers. Thus it may be possible to relax the current
>>> proposed rule in IDNA2008.
>>>
>> This requires that applications identify correctly all instances of 
>> domain names - my thinking when writing -bidi was that it would be 
>> extremely confusing for users to have a domain name display in one 
>> order when in the address bar, and in another order when in running 
>> text, so I argued that these should be treated identically (last 
>> paragraph of section 6 of -bidi-02), and - based on that argument - 
>> the behaviour of domain names in running text was the behaviour that 
>> it was important to write rules for.
>>
>> I haven't heard anyone argue the opposite position yet, although I've 
>> heard many people wistfully wish that they could make it so. Do you 
>> wish to reopen this argument?
> I think that it doesn't matter where you see the URL; it is important
> for the users to see the correct URL when they want to use it. Almost
> all URL-aware applications more or less process the URL (for
> example, Thunderbird already did so with this email).
It seems that we agree on this point, then.
>>>
>>> So my suggestion is: Those problems which cannot be almost completely
>>> resolved at the protocol level should be dealt with only at the
>>> informational level, and no rules should be specified about them in the
>>> protocol. One such example is to relax the BIDI rule about numbers,
>>> which I mentioned above.
>> I do not understand what you are asking here - what rule (referring 
>> to the numbered list in section 4 of -bidi-02) do you wish to relax, 
>> and which requirement in section 3 of -bidi-02 (which is the basis on 
>> which these rules were designed) do you think we can live without?
>>
>>                          Harald
>>
> I mean the rule that limits RTL labels to starting only with RTL
> characters. I am specifically talking about rule number 7 in section 4
> of the bidi-02 document. We cannot live without them, but there are
> some issues that have to be solved somewhere. By resolving those
> issues, some of the current concerns will also be solved, and there
> will be no need to have some limiting rules within the document. My
> suggestion is to relax those rules, as we know their related problems
> can be solved within the applications.
Thank you - as far as I understand, you identify an inter-label problem
that can't be solved by intra-label rules. One solution is, of course,
to go back to inter-label rules and say that the label '3' can't
occur next to the label '<ALEF>'; the other solution is the one the
current document advocates in section 5. Your example is a perfect
illustration (except that in this case, the number moved right over the
label and out on the other side, which I hadn't anticipated).
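
To make the inter-label option concrete, here is a minimal sketch of what
such a check could look like. It is only an illustration, not text from
-bidi-02: the function names are hypothetical, and the test used (a label
made up solely of EN/AN characters sitting next to a label that contains
an R or AL character) is my assumption about how one might phrase such a
rule.

```python
import unicodedata

RTL_CLASSES = {"R", "AL"}      # strong right-to-left bidi classes
NUMBER_CLASSES = {"EN", "AN"}  # European and Arabic numbers


def is_numeric_label(label: str) -> bool:
    """True if the label is non-empty and consists only of EN/AN characters."""
    return bool(label) and all(
        unicodedata.bidirectional(ch) in NUMBER_CLASSES for ch in label
    )


def has_rtl(label: str) -> bool:
    """True if the label contains at least one R or AL character."""
    return any(unicodedata.bidirectional(ch) in RTL_CLASSES for ch in label)


def risky_adjacent_pairs(domain: str) -> list[tuple[str, str]]:
    """Return adjacent label pairs where a numeric label touches an RTL label.

    Hypothetical inter-label check: labels are compared in network order,
    split on U+002E only.
    """
    labels = domain.split(".")
    return [
        (a, b)
        for a, b in zip(labels, labels[1:])
        if (is_numeric_label(a) and has_rtl(b))
        or (has_rtl(a) and is_numeric_label(b))
    ]
```

Calling risky_adjacent_pairs("3.\u0627") flags the pair ('3', '<ALEF>'),
while an all-LTR name such as "3.com" is not flagged.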

However, I don't understand how the intra-label problem can be solved 
within the application either.... could you give an example of a usage 
of labels with RTL characters inside them, and a leading EN or AN, and a 
way in which applications can be coded to guarantee that there won't be 
a problem?
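
For reference, the kind of label I am asking about can be identified
mechanically. The sketch below is an assumption-laden illustration, not
the rule text from -bidi-02: the function name is hypothetical, and it
only detects "contains an R or AL character and begins with EN or AN",
using the bidi classes reported by Python's standard unicodedata module.

```python
import unicodedata

RTL_CLASSES = {"R", "AL"}      # strong right-to-left bidi classes
NUMBER_CLASSES = {"EN", "AN"}  # European and Arabic numbers


def rtl_label_starts_with_number(label: str) -> bool:
    """True if the label contains an RTL character but begins with EN or AN.

    Hypothetical helper for illustration only; it inspects a single label
    and says nothing about how the label displays next to its neighbours.
    """
    if not label:
        return False
    classes = [unicodedata.bidirectional(ch) for ch in label]
    contains_rtl = any(c in RTL_CLASSES for c in classes)
    return contains_rtl and classes[0] in NUMBER_CLASSES
```

A label such as "3" followed by ALEF (U+0627) is reported, whereas ALEF
followed by "3", or an all-LTR label such as "3abc", is not.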

It seems to me that if a problem cannot be solved fully at all, and we 
can limit the size of the problem by imposing rules at a given level, 
and those rules don't inhibit any large number of nonproblematic usages, 
we should keep those rules.

But I could be wrong.

                     Harald

