To be published: draft-alvestrand-idna-bidi-00.txt

Paul Hoffman phoffman at imc.org
Mon Oct 16 02:15:26 CEST 2006


At 1:44 AM +0200 10/16/06, Harald Alvestrand wrote:
>Paul Hoffman wrote:
>>At 11:35 PM +0200 10/15/06, Harald Alvestrand wrote:
>>>
>>
>>Making every receiver correctly add all of the steps of section 3.3 
>>of UAX 9 is onerous and error-prone. A much simpler change would be 
>>to simply say that a character of type NSM is considered to have 
>>the directionality of the base character which it follows.
>>
>>This will fix both the problems listed in this draft, as well as 
>>any related problem where a combining character is following a 
>>RandALCat character.
>>
>My biggest worry is that we'll be right back here in a year if we 
>discover that someone really needs EN, ES, ET, AN, CS, BN, B, S, WS 
>or ON characters in conjunction with RTL strings.... I have not yet 
>figured out how I can be sure what the result is for all of these 
>classes in a way that gives me reassurance that we will never, ever, 
>ever need to allow them.

We could add all those in as well, but I suspect that would be very dangerous.

If our requirements at the moment are Thaana and Yiddish, we should 
keep it simple. If our requirements are to predict the future, our 
work is harder. But we should not offload that work onto all the IDNA 
processors of the world.

>My second-biggest worry is that one application will use its 
>UAX9-compliant library to display the string, another will use a 
>"tuned" algorithm that depends on the Stringprep rules, and they 
>will display different results.

But the rule we are making is the same as UAX9, at least for NSMs. As 
you pointed out to me off-list, UAX9 says:

W1. Examine each nonspacing mark (NSM) in the level run, and change 
the type of the NSM to the type of the previous character.

So my proposal is to match that one part of UAX9 in IDNA processing.
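
As a rough illustration of that one rule (a sketch only, not text from the 
draft): the function name resolve_nsm_types is made up for this example, it 
treats the whole label as a single level run, and it leaves a leading NSM 
unchanged, which is my assumption rather than anything UAX9 or the draft says.

```python
import unicodedata


def resolve_nsm_types(label):
    """Return the bidi class of each character in label, with every
    nonspacing mark (NSM) given the class of the character before it,
    in the spirit of UAX9 rule W1.

    Assumptions of this sketch: the label is one level run, and an NSM
    with no preceding character keeps the class NSM.
    """
    resolved = []
    for ch in label:
        bidi = unicodedata.bidirectional(ch)
        if bidi == "NSM" and resolved:
            # Inherit the (already resolved) class of the previous
            # character, so a run of several marks all follow the base.
            bidi = resolved[-1]
        resolved.append(bidi)
    return resolved
```

For a Thaana letter followed by a vowel sign, such as U+0784 U+07A6, this 
gives AL for both positions, so a check that looks at the last character of 
the label sees AL instead of NSM.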

>I'm all for simplifying/subsetting UAX9 if we can prove to ourselves 
>that the simplification/subsetting is equivalent under the 
>restricted set of cases we consider - but I'd like to have fairly 
>rigorous argument that this is the case before jumping.

Fully agree.

