I-D Action:draft-alvestrand-idna-bidi-04.txt (fwd)

Harald Tveit Alvestrand harald at alvestrand.no
Sun Feb 17 06:33:40 CET 2008



--On 15 February 2008 15:35 -0500 John C Klensin <klensin at jck.com> wrote:

>> Yes, it does. I think the defensive test (and the one that is
>> simplest to code) for a resolver would be to flag anything
>> where the whole domain name contains a R/AL/AN and where any
>> label violates the bidi rule as "possibly confusing".
>>
>> Is it OK to say that one should refuse to look up any such
>> name?
>
> Given especially the DNAME-related cases, I think not.  There
> are too many legitimate names that can be suspicious ("possibly
> confusing") under this sort of rule.

Let's gnaw on this bone a bit... The resolver can't know whether the
name being looked up refers to a DNAME or not; the resolver has to make a
decision based purely on the string it's presented with. The nice thing is
that it's actually presented with the whole domain name at once, so it
doesn't have to worry about "what can possibly be added to this string" as
a registry has to.

Let's use RTLBAD as a stand-in for an RTL label that fails the test, and
RTLGOOD as one that passes.
For ease of discussion, let's use "abc" for an LTR label that passes the
test, and "9bc" for an LTR label that starts with a number (there are others
that will fail, such as "-foo-", but the numeric one is the most
frequently encountered, I think).

"Reject" means "Refuse to look up the name"; "Accept" means "Try to look up 
the name".

- RTLBAD.RTLGOOD -> Reject
- RTLBAD.abc -> Reject

- RTLGOOD.RTLGOOD -> Accept
- RTLGOOD.abc -> Accept
- abc.abc -> Accept (no RTL, so no bidi)
- 9bc.abc -> Accept
- abc.9bc -> Accept

I think these are uncontroversial. (Check: All agree?)

- RTLGOOD.9bc -> this can make RTLGOOD display inconsistently, if it ends 
with a number. Accept or refuse?
- RTLGOOD.abc.9bc -> this will display correctly, because "abc" contains 
strong LTR characters. Accept or refuse?
- 9bc.RTLGOOD.abc -> This will probably (?) display correctly, but is
outside the range of our current tests. Accept or refuse?

Getting the exact list of label pairs/triplets that don't cause trouble is 
complex, and the resulting rules for them are likely to be complex too. So 
far, we've emphasized relatively simple rules.

We could write tests for this, and see what LDH labels can be allowed next 
to RTL labels. Or we could say "a plague on all their houses" and refuse.
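To make the "refuse" option concrete, here is a rough Python sketch of the
simple whole-name test: reject if the name contains any R/AL/AN character
and any label fails the per-label check. Everything in it is an assumption
for illustration only. In particular, label_passes() is a simplified
stand-in, not the bidi rule from the draft (it only requires a strong first
character and no trailing hyphen), the function names are made up, and the
name is assumed to arrive as Unicode labels rather than Punycode.

```python
import unicodedata

# Bidi classes that make a name "bidi" for the purpose of this test.
RTL_CLASSES = {"R", "AL", "AN"}


def has_rtl(text):
    """True if any character in text has bidi class R, AL or AN."""
    return any(unicodedata.bidirectional(ch) in RTL_CLASSES for ch in text)


def label_passes(label):
    """Simplified stand-in for the per-label test (an assumption, NOT the
    full rule): the first character must be strong (L, R or AL) and the
    label must not end in a hyphen. This is enough to fail "9bc" and
    "-foo-" while passing "abc"."""
    if not label:
        return False
    first_ok = unicodedata.bidirectional(label[0]) in {"L", "R", "AL"}
    return first_ok and not label.endswith("-")


def strict_decision(name, passes=label_passes):
    """'Reject' if the whole name contains an R/AL/AN character and any
    label fails the per-label test; otherwise 'Accept'."""
    labels = [lab for lab in name.split(".") if lab]  # ignore a trailing dot
    if has_rtl(name) and not all(passes(lab) for lab in labels):
        return "Reject"
    return "Accept"


# Hypothetical sample labels: three Hebrew letters, and the same kind of
# label with a leading digit.
RTLGOOD = "\u05d0\u05d1\u05d2"
RTLBAD = "1\u05d0\u05d1"

if __name__ == "__main__":
    for name in (
        f"{RTLBAD}.{RTLGOOD}", f"{RTLBAD}.abc",
        f"{RTLGOOD}.{RTLGOOD}", f"{RTLGOOD}.abc",
        "abc.abc", "9bc.abc", "abc.9bc",
        f"{RTLGOOD}.9bc", f"{RTLGOOD}.abc.9bc", f"9bc.{RTLGOOD}.abc",
    ):
        print(ascii(name), "->", strict_decision(name))
```

Under this policy the seven cases listed as uncontroversial come out as
stated, and all three open cases (RTLGOOD.9bc, RTLGOOD.abc.9bc,
9bc.RTLGOOD.abc) are rejected. Accepting any of them would mean replacing
the simple all-labels check with rules about which labels may sit next to
which.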

Thoughts?

                     Harald







