comments on draft-ietf-idnabis-bidi
Harald Tveit Alvestrand
harald at alvestrand.no
Mon Aug 3 13:34:10 CEST 2009
Matitiahu Allouche wrote:
> In my previous suggestions, I did not take into consideration that the rules
> are also meant to codify labels which do not contain any RTL characters.
> Having understood that, here is an updated version of my suggestions:
>
> Definitions:
>
> 1. Bidi domain names are domain names which include at least one RTL
> label.
>
> 2. An RTL label is a label which contains at least one character of type R
> or AL or AN.
>
> Rules for RTL labels in Bidi domain names:
>
> 1. Only characters with the BIDI properties R, AL, AN, EN, ES,
> CS, ET, ON, BN and NSM are allowed in RTL labels.
>
> 2. The first position must be a character with Bidi property R or AL.
>
> 3. The last position must be a character with Bidi property R, AL, EN
> or AN, followed by zero or more NSM.
>
> 4. If an EN is present, no AN may be present, and vice versa.
>
>
> Rules for non-RTL labels in Bidi domain names:
>
> 1. Only characters with the BIDI properties L, EN, ES,
> CS, ET, ON and NSM are allowed in non-RTL labels.
>
> 2. The first position must be a character with Bidi property L.
>
> 3. The last position must be a character with Bidi property L or EN,
> followed by zero or more NSM, or the two last positions must be
> EN followed by ET.
>
>
>
Thank you again. I have now implemented this algorithm and compared its
results for the "Character Grouping Requirement" on labels up to 3
characters long (my Perl code is chugging through longer strings as we speak).
I hope Erik can take a look at the "Label Uniqueness Requirement", which
I don't have code to test for.
The difference between the two algorithms seems to be that your proposal
allows CS and ET within a label, but not at the ends. Was this an
intentional difference?
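
For illustration, the quoted rules can be sketched in Python as below. This
is a minimal sketch under stated assumptions, not the Perl code mentioned
above and not the algorithm from the draft. The function names are
hypothetical, the Bidi classes come from the standard unicodedata module,
and non-RTL rule 3 is read as "last non-NSM character is L or EN, or the
label ends in EN ET". The rules are stated for labels within a Bidi domain
name; this sketch checks one label in isolation.

```python
import unicodedata

# Bidi classes permitted by rule 1 of each rule set in the quoted proposal.
RTL_ALLOWED = {"R", "AL", "AN", "EN", "ES", "CS", "ET", "ON", "BN", "NSM"}
NON_RTL_ALLOWED = {"L", "EN", "ES", "CS", "ET", "ON", "NSM"}


def strip_trailing_nsm(classes):
    """Return the class list without any trailing run of NSM."""
    end = len(classes)
    while end > 0 and classes[end - 1] == "NSM":
        end -= 1
    return classes[:end]


def is_rtl_label(classes):
    """Definition 2: at least one character of type R, AL or AN."""
    return any(c in ("R", "AL", "AN") for c in classes)


def check_classes(classes):
    """Apply the quoted rules to a sequence of Bidi class names."""
    classes = list(classes)
    if not classes:
        return False
    core = strip_trailing_nsm(classes)
    if is_rtl_label(classes):
        return (
            all(c in RTL_ALLOWED for c in classes)              # rule 1
            and classes[0] in ("R", "AL")                       # rule 2
            and bool(core)
            and core[-1] in ("R", "AL", "EN", "AN")             # rule 3
            and not ("EN" in classes and "AN" in classes)       # rule 4
        )
    return (
        all(c in NON_RTL_ALLOWED for c in classes)              # rule 1
        and classes[0] == "L"                                   # rule 2
        and (
            (bool(core) and core[-1] in ("L", "EN"))            # rule 3
            or classes[-2:] == ["EN", "ET"]                     # rule 3, EN ET
        )
    )


def check_label(label):
    """Map each character to its Bidi class, then apply the rules."""
    return check_classes(unicodedata.bidirectional(ch) for ch in label)
```

Under this reading, check_classes(["R", "CS", "R"]) passes while
["R", "CS"] and ["CS", "R"] do not, so CS and ET are accepted in the
interior of a label but not at either end, apart from the trailing EN ET
case in non-RTL rule 3.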
Harald