comments on draft-ietf-idnabis-bidi

Harald Tveit Alvestrand harald at alvestrand.no
Mon Aug 3 13:34:10 CEST 2009


Matitiahu Allouche wrote:
> In my previous suggestions, I did not take into consideration that the rules 
> are also meant to codify labels which do not contain any RTL characters. 
> Having understood that, here is an updated version of my suggestions:
>
> Definitions:
>
> 1. Bidi domain names are domain names which include at least one RTL 
> label.
>
> 2. An RTL label is a label which contains at least one character of type R 
> or AL or AN.
>
> Rules for RTL labels in Bidi domain names:
>
>    1.  Only characters with the BIDI properties R, AL, AN, EN, ES,
>        CS, ET, ON, BN and NSM are allowed in RTL labels.
>
>    2.  The first position must be a character with Bidi property R or AL.
>
>    3.  The last position must be a character with Bidi property R, AL, EN
>        or AN, followed by zero or more NSM.
>
>    4.  If an EN is present, no AN may be present, and vice versa.
>
>
> Rules for non-RTL labels in Bidi domain names:
>
>    1.  Only characters with the BIDI properties L, EN, ES,
>        CS, ET, ON and NSM are allowed in non-RTL labels.
>
>    2.  The first position must be a character with Bidi property L.
>
>    3.  The last position must be a character with Bidi property L or EN,
>        followed by zero or more NSM, or the two last positions must be 
>        EN followed by ET.
>
>
>   
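
The quoted rules can be sketched roughly as follows. This is an illustrative 
Python sketch only, not the Perl implementation mentioned in this message; 
all function names are made up for the example. It assumes that Python's 
unicodedata.bidirectional() is an acceptable source for the Bidi property, 
and it reads "the two last positions must be EN followed by ET" as meaning 
the final two characters of the label, with no trailing NSM.

```python
import unicodedata

# Bidi classes allowed by rule 1 of each rule set (taken from the quoted text).
RTL_ALLOWED = {"R", "AL", "AN", "EN", "ES", "CS", "ET", "ON", "BN", "NSM"}
NON_RTL_ALLOWED = {"L", "EN", "ES", "CS", "ET", "ON", "NSM"}


def bidi_classes(label):
    """Return the Bidi class of each character in the label."""
    return [unicodedata.bidirectional(ch) for ch in label]


def is_rtl_label(label):
    """Definition 2: a label with at least one R, AL or AN character."""
    return any(c in ("R", "AL", "AN") for c in bidi_classes(label))


def strip_trailing_nsm(classes):
    """Drop the 'zero or more NSM' that may follow the last position."""
    end = len(classes)
    while end > 0 and classes[end - 1] == "NSM":
        end -= 1
    return classes[:end]


def check_rtl_label(label):
    classes = bidi_classes(label)
    if not classes or not set(classes) <= RTL_ALLOWED:
        return False                                  # rule 1
    if classes[0] not in ("R", "AL"):
        return False                                  # rule 2
    if strip_trailing_nsm(classes)[-1] not in ("R", "AL", "EN", "AN"):
        return False                                  # rule 3
    if "EN" in classes and "AN" in classes:
        return False                                  # rule 4
    return True


def check_non_rtl_label(label):
    classes = bidi_classes(label)
    if not classes or not set(classes) <= NON_RTL_ALLOWED:
        return False                                  # rule 1
    if classes[0] != "L":
        return False                                  # rule 2
    if strip_trailing_nsm(classes)[-1] in ("L", "EN"):
        return True                                   # rule 3, first form
    # rule 3, second form (assumed reading): label ends in EN then ET
    return classes[-2:] == ["EN", "ET"]


def check_label(label):
    """Pick the rule set for one label of a Bidi domain name."""
    if is_rtl_label(label):
        return check_rtl_label(label)
    return check_non_rtl_label(label)
```

Under this reading, check_label("a%b") and check_label("a1%") both pass, 
while check_label("a%") and check_label("1a") fail.
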
Thank you again. I have now implemented this algorithm and compared the 
results for the "Character Grouping Requirement" for labels up to 3 
characters long (my Perl code is chugging through longer strings as we speak).

I hope Erik can take a look at the "Label Uniqueness Requirement", which 
I don't have code to test for.

The difference between the two algorithms seems to be that your proposal 
allows CS and ET within a label, but not at the ends. Was this an 
intentional difference?

                  Harald


