To be published: draft-alvestrand-idna-bidi-00.txt

Harald Alvestrand harald at alvestrand.no
Mon Oct 16 01:44:08 CEST 2006


Paul Hoffman wrote:
> At 11:35 PM +0200 10/15/06, Harald Alvestrand wrote:
>>
>
> Making every receiver correctly add all of the steps of section 3.3 of 
> UAX 9 is onerous and error-prone. A much simpler change would be to 
> simply say that a character of type NSM is considered to have the 
> directionality of the base character which it follows.
>
> This will fix both the problems listed in this draft, as well as any 
> related problem where a combining character is following a RandALCat 
> character.
>
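To make the proposed rule concrete, here is a minimal sketch in Python. It assumes the standard library's unicodedata.bidirectional() as the source of bidi classes; the function name resolve_nsm is hypothetical and not taken from the draft or from UAX 9. It covers only the NSM rule and is not a UAX 9 implementation.

```python
import unicodedata


def resolve_nsm(text):
    """Return one bidi class per character, where each NSM takes the
    class of the character it follows (the simplified rule quoted above).

    Hypothetical helper for illustration only; an NSM with nothing
    before it is left as "NSM".
    """
    resolved = []
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls == "NSM" and resolved:
            # Inherit from the previous character's already-resolved class,
            # so a run of several NSMs all take the base character's class.
            cls = resolved[-1]
        resolved.append(cls)
    return resolved


# Hebrew alef (R) followed by the combining point qamats (NSM).
print(resolve_nsm("\u05d0\u05b8"))
```

Because each NSM inherits from the already-resolved class of the preceding character, a base followed by two combining marks resolves to the base's class three times over.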
My biggest worry is that we'll be right back here in a year if we 
discover that someone really needs EN, ES, ET, AN, CS, BN, B, S, WS or 
ON characters in conjunction with RTL strings... I have not yet figured 
out how to determine the result for all of these classes with enough 
confidence to reassure me that we will never, ever, ever need to 
allow them.
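As a rough aid for exploring that question, the following Python sketch lists the characters in a candidate label that fall into one of those ten classes. It assumes unicodedata.bidirectional() from the standard library; the names WORRY_CLASSES and flag_chars are hypothetical, and the check says nothing about how such characters would actually be displayed.

```python
import unicodedata

# The bidi classes named above (an illustrative set, not from the draft).
WORRY_CLASSES = {"EN", "ES", "ET", "AN", "CS", "BN", "B", "S", "WS", "ON"}


def flag_chars(label):
    """Return (character, bidi class) pairs for every character in
    `label` whose bidi class is one of WORRY_CLASSES."""
    flagged = []
    for ch in label:
        cls = unicodedata.bidirectional(ch)
        if cls in WORRY_CLASSES:
            flagged.append((ch, cls))
    return flagged


# Two Hebrew letters followed by a hyphen and a digit.
print(flag_chars("\u05d0\u05d1-1"))
```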

My second-biggest worry is that one application will use its 
UAX9-compliant library to display the string, another will use a "tuned" 
algorithm that depends on the Stringprep rules, and they will display 
different results.

I'm all for simplifying/subsetting UAX9 if we can prove to ourselves 
that the simplification/subsetting is equivalent under the restricted 
set of cases we consider - but I'd like to have a fairly rigorous argument 
that this is the case before jumping.

                     Harald



