To be published: draft-alvestrand-idna-bidi-00.txt
Harald Alvestrand
harald at alvestrand.no
Mon Oct 16 01:44:08 CEST 2006
Paul Hoffman wrote:
> At 11:35 PM +0200 10/15/06, Harald Alvestrand wrote:
>>
>
> Making every receiver correctly add all of the steps of section 3.3 of
> UAX 9 is onerous and error-prone. A much simpler change would be to
> simply say that a character of type NSM is considered to have the
> directionality of the base character which it follows.
>
> This will fix both the problems listed in this draft, as well as any
> related problem where a combining character is following a RandALCat
> character.
>
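The rule Paul proposes can be sketched in a few lines. This is only an illustration, assuming Python's standard unicodedata module as the source of bidi classes. The function name resolve_nsm is hypothetical, and leaving a leading NSM unchanged is a simplification of this sketch, not something taken from the draft or from UAX 9.

```python
import unicodedata

def resolve_nsm(text):
    """Return one bidi class per character, with every NSM taking
    the (already resolved) class of the character it follows."""
    resolved = []
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        # An NSM inherits from the previous character; a run of NSMs
        # therefore all inherit from the same base character.
        if bidi == "NSM" and resolved:
            bidi = resolved[-1]
        # Assumption of this sketch: an NSM with nothing before it
        # is left as NSM.
        resolved.append(bidi)
    return resolved

# HEBREW LETTER BET followed by HEBREW POINT QAMATS
print(resolve_nsm("\u05d1\u05b8"))
```

With this rule a combining mark after a RandALCat character is treated as R or AL itself, so a check that looks only at resolved classes no longer sees a separate NSM.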
My biggest worry is that we'll be right back here in a year, when we
discover that someone really needs EN, ES, ET, AN, CS, BN, B, S, WS or
ON characters in conjunction with RTL strings. I have not yet figured
out how to work out the result for all of these classes with enough
confidence to be sure that we will never, ever, ever need to allow
them.

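To make the list of classes concrete, here is a rough sketch that reports which of them occur in a string that also contains an R or AL character. It assumes Python's unicodedata module for the class lookup. The names WORRY_CLASSES, RTL_CLASSES and worrying_classes are made up for this example, and the check only reports what is present; it says nothing about how the string would display.

```python
import unicodedata

# The bidi classes listed above, plus the two right-to-left classes.
WORRY_CLASSES = {"EN", "ES", "ET", "AN", "CS", "BN", "B", "S", "WS", "ON"}
RTL_CLASSES = {"R", "AL"}

def worrying_classes(label):
    """Return the listed classes that appear in a label containing at
    least one R or AL character; return an empty set otherwise."""
    classes = [unicodedata.bidirectional(ch) for ch in label]
    if not any(c in RTL_CLASSES for c in classes):
        return set()
    return {c for c in classes if c in WORRY_CLASSES}

# HEBREW LETTER ALEF followed by DIGIT ONE
print(worrying_classes("\u05d01"))
```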
My second-biggest worry is that one application will use its
UAX9-compliant library to display the string, another will use a "tuned"
algorithm that depends on the Stringprep rules, and the two will display
different results.

I'm all for simplifying/subsetting UAX9 if we can prove to ourselves
that the simplification/subsetting is equivalent under the restricted
set of cases we consider - but I'd like to have a fairly rigorous
argument that this is the case before jumping.

Harald