Digit restriction (Re: Comments on bidi-04)

Harald Tveit Alvestrand harald at alvestrand.no
Tue Mar 18 19:59:33 CET 2008

John C Klensin wrote:
> --On Tuesday, 11 March, 2008 08:57 -0400 Harald Tveit Alvestrand
> <harald at alvestrand.no> wrote:
>> For instance, the string AN EN, when embedded between 2 dots,
>> will behave as follows in a LTR context:
>> <RLE><PDF>.<AN><EN>.<RLE><PDF> reorders into <AN><EN>.. (that
>> is, both dots jump to the right end of the string). (I hope
>> you agree with me that "sor" and "eor" will both be "R" for
>> the run that goes .<AN><EN>., while the embedding direction is
>> L).
>> This affects 3 of the 53 length-2 combinations allowed by
>> IDNA2003, and 43 of the 375 length-3 combinations allowed by
>> IDNA2003.
>> But - ALL of the strings affected turn out to be eliminated by
>> our currently proposed set of selection criteria. If a string
>> passes the test for "safe BIDI label" in bidi-04, it will not
>> be affected by bidi formatting codes in the text around it.
>> So we can eliminate the restriction.
> Well, some of us still believe that the "safe BIDI label" test
> is too safe. Certainly it is sufficient, but I'm not sure
> that the restrictions it implies --especially the trailing digit
> restriction-- are going to be acceptable in actual use.
Note: There's a leading digit restriction ("first", not "left", in the 
terminology used in the -bidi draft), not a trailing digit restriction. 
The draft says that one could impose either a leading digit restriction 
or a trailing digit restriction, but that there's no need for both - and 
then goes for the leading digit restriction.

If we go all-out for rules that involve adjoining strings, we can allow 
leading digits as long as we impose a context-specific rule that you 
shouldn't let the RTL-character-containing label at the next level have 
a trailing digit. (I think - without having run the test scripts).
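
To make the context-specific rule above concrete, here is a minimal Python sketch. It is not taken from the draft or its test scripts. It assumes that "digit" means bidi class EN or AN, that a label "contains RTL characters" if any character is R, AL or AN, and that the "label at the next level" is the one immediately before the leading-digit label in the dotted string. The function names are hypothetical.

```python
import unicodedata

# Assumption: a "digit" is any character of bidi class EN or AN.
DIGIT_CLASSES = {"EN", "AN"}
# Assumption: a label contains RTL characters if any character is R, AL or AN.
RTL_CLASSES = {"R", "AL", "AN"}


def bidi_class(ch):
    """Return the Unicode bidirectional class of one character."""
    return unicodedata.bidirectional(ch)


def contains_rtl(label):
    return any(bidi_class(ch) in RTL_CLASSES for ch in label)


def has_leading_digit(label):
    return bool(label) and bidi_class(label[0]) in DIGIT_CLASSES


def has_trailing_digit(label):
    return bool(label) and bidi_class(label[-1]) in DIGIT_CLASSES


def adjacency_rule_ok(labels):
    """Check each pair of adjacent labels, in the order given.

    A label may start with a digit unless the label before it (assumed
    to be the adjoining one) contains RTL characters and ends in a digit.
    """
    for prev, cur in zip(labels, labels[1:]):
        if (has_leading_digit(cur)
                and contains_rtl(prev)
                and has_trailing_digit(prev)):
            return False
    return True
```

For example, adjacency_rule_ok("\u05d0\u05d11.1abc".split(".")) returns False, since a Hebrew label ending in a digit sits next to a label starting with one. Whether this rule is actually sufficient would still need to be confirmed by running the test scripts.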

But I agree fully with Pete's mode of attack here: If we want to allow 
things that don't satisfy the constraint as stated, we first have to 
restate the constraint, and figure out which strings can be made to 
satisfy it.

