Mixing of AN and EN (Re: Protocol-08 (and status of Defs-04 and Rationale-06))

Harald Alvestrand harald at alvestrand.no
Mon Dec 15 20:39:27 CET 2008


Mark Davis wrote:
> Let me try to shed some light on this. In the Unicode bidi 
> subcommittee, there are four different items that have recently come 
> up regarding BIDI.
>
> The first is just some editorial clarifying text, and has already been 
> discussed and approved by the UTC. This is not relevant to IDNA.
>
> The others were too recent to have been considered by the UTC.
>
> The second is regarding overriding mirroring for archaic scripts. This 
> is not relevant to IDNA.
>
> The third only applies to the embedding/overriding codes, which are 
> not allowed in IDNs: RLE, LRE, RLO, LRO, and PDF. This is not relevant 
> to IDNA.
>
> The last is relevant, and came up most recently. It is the following:
>
> There is reasonable disagreement about what the meaning of a 
> particular rule (N1) is, with two possible interpretations. We know 
> the intent of the author, but the intent of the author is outweighed 
> by what the common practice is. That is, the UTC needs to be quite 
> conservative about changes to BIDI, and existing practice is the major 
> consideration. That requires determining, however, what the prevailing 
> practice is, so we're investigating that now.
>
> The practical impact for IDNA is, I think, the following.
>
> 1. As a part of investigating the common practice, we need to consider 
> whether we need to add additional constraints to what Harald has 
> devised. I see two possible approaches:
>
>    1. We can wait until the investigation is completed, and accommodate
>       the results;
>    2. Alternatively, we can add constraints (if need be) that
>       accomplish the goal no matter which of the two interpretations
>       of N1 is being used.
>
>
> 2. We should add to the security considerations for bidi some 
> indication of the fact that while the bidi constraints are intended to 
> ensure "Character Grouping" and "Label Uniqueness" as much as 
> possible, they may not do so for certain cases:
>
>    1. If the label is adjacent to all-ASCII labels (the xxx.3com problem).
>    2. If the particular implementation of the bidi algorithm deviates
>       from the standard.
>
Mark,

what are the 2 interpretations?

FWIW, here's my (horribly inefficient) interpretation of N1:

    # 3.3.4 Resolving neutral types.
    # N1. A sequence of neutrals takes the direction of the surrounding...
    # Index 0 is never resolved itself; it only serves as left-hand context.
    for my $ix (1..$#typelist) {
        if (!has_direction($typelist[$ix])) {
            # find directional in the forward direction
            for my $ix2 ($ix+1..$#typelist) {
                if (has_direction($typelist[$ix2])) {
                    # resolve only if both neighbours agree in direction
                    if (effective_direction($typelist[$ix2])
                        eq effective_direction($typelist[$ix-1])) {
                        $typelist[$ix] = effective_direction($typelist[$ix-1]);
                    }
                    last;
                }
            }
        }
    }

I don't know what that counts as.
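
For anyone who wants to run the loop above against sample input, here is a self-contained sketch. The two helpers are not shown in this message, so the versions below are assumptions: has_direction() treats L, R, AN and EN as directional, and effective_direction() maps EN and AN to R (as UAX #9 specifies for their influence on neutrals) and returns an empty string for anything still neutral. The sketch also assumes the type list carries sor at index 0 and eor at the last index. resolve_n1 is a hypothetical wrapper name.

```perl
use strict;
use warnings;

# Assumed helper: strong types and numbers count as directional.
sub has_direction {
    my ($type) = @_;
    return $type =~ /^(?:L|R|AN|EN)$/ ? 1 : 0;
}

# Assumed helper: for N1, EN and AN act as R; neutrals have no direction.
sub effective_direction {
    my ($type) = @_;
    return 'L' if $type eq 'L';
    return 'R' if $type =~ /^(?:R|AN|EN)$/;
    return '';
}

# Assumes $typelist[0] is sor and $typelist[-1] is eor.
sub resolve_n1 {
    my @typelist = @_;
    for my $ix (1 .. $#typelist) {
        next if has_direction($typelist[$ix]);
        # find the next directional type in the forward direction
        for my $ix2 ($ix + 1 .. $#typelist) {
            next unless has_direction($typelist[$ix2]);
            if (effective_direction($typelist[$ix2])
                eq effective_direction($typelist[$ix - 1])) {
                $typelist[$ix] = effective_direction($typelist[$ix - 1]);
            }
            last;
        }
    }
    return @typelist;
}

# sor=L, then R ON AN ON L, then eor=L
print join(' ', resolve_n1(qw(L R ON AN ON L L))), "\n";
# prints "L R R AN ON L L"
```

The ON between R and AN becomes R because AN counts as R here. The ON between AN and L is left alone because its two sides disagree, so it would fall through to N2.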

                       Harald


