comments on draft-ietf-idnabis-bidi

Matitiahu Allouche matial at il.ibm.com
Tue Aug 4 09:13:11 CEST 2009


Harald Tveit Alvestrand asked:
I have now implemented this algorithm (proposed by Matitiahu Allouche) and
compared the result for the "Character Grouping Requirement" up to a length
of 3 characters ...
The difference between the two algorithms seems to be that your proposal 
allows CS and ET within a label, but not at the ends. Was this an 
intentional difference?
<end of quote>

My answer is:  Yes! CS and ET, when not at the ends of a label, do not 
violate any principle.

Shalom (Regards),  Mati
           Bidi Architect
           Globalization Center Of Competency - Bidirectional Scripts
           IBM Israel
           Phone: +972 2 5888802    Fax: +972 2 5870333    Mobile: +972 52 2554160




Harald Tveit Alvestrand <harald at alvestrand.no> 
03/08/2009 14:34

To
Matitiahu Allouche/Israel/IBM at IBMIL
cc
idna-update at alvestrand.no
Subject
Re: comments on draft-ietf-idnabis-bidi






Matitiahu Allouche skrev:
> In my previous suggestions, I did not take into consideration that the
> rules are meant to codify also labels which do not contain any RTL
> characters.
> Having understood that, here is an updated version of my suggestions:
>
> Definitions:
>
> 1. Bidi domain names are domain names which include at least one RTL 
> label.
>
> 2. A RTL label is a label which contains at least one character of type
> R or AL or AN.
>
> Rules for RTL labels in Bidi domain names:
>
>    1.  Only characters with the BIDI properties R, AL, AN, EN, ES,
>        CS, ET, ON, BN and NSM are allowed in RTL labels.
>
>    2.  The first position must be a character with Bidi property R or
>        AL.
>
>    3.  The last position must be a character with Bidi property R, AL,
>        EN or AN, followed by zero or more NSM.
>
>    4.  If an EN is present, no AN may be present, and vice versa.
>
>
> Rules for non-RTL labels in Bidi domain names:
>
>    1.  Only characters with the BIDI properties L, EN, ES,
>        CS, ET, ON and NSM are allowed in non-RTL labels.
>
>    2.  The first position must be a character with Bidi property L.
>
>    3.  The last position must be a character with Bidi property L or EN,
>        followed by zero or more NSM, or the two last positions must be 
>        EN followed by ET.
>
>
> 
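The rules quoted above can be sketched as a small checker. This is only an
illustration of the rules as stated in this message, not the perl code used
for the comparison. The names check_label, bidi_classes, is_rtl_label and
strip_trailing_nsm are hypothetical. The sketch assumes that Python's
unicodedata.bidirectional is an acceptable source of Bidi classes, and it
works on lists of class names so the rules can be exercised directly.

```python
import unicodedata

# Bidi classes permitted by rule 1 of each rule set in the quoted message.
RTL_ALLOWED = {"R", "AL", "AN", "EN", "ES", "CS", "ET", "ON", "BN", "NSM"}
NON_RTL_ALLOWED = {"L", "EN", "ES", "CS", "ET", "ON", "NSM"}


def bidi_classes(label):
    """Map each character of a label to its Unicode Bidi class name."""
    return [unicodedata.bidirectional(ch) for ch in label]


def strip_trailing_nsm(classes):
    """Drop the run of NSM at the end ("followed by zero or more NSM")."""
    end = len(classes)
    while end > 0 and classes[end - 1] == "NSM":
        end -= 1
    return classes[:end]


def is_rtl_label(classes):
    """Definition 2: an RTL label contains at least one R, AL or AN."""
    return any(c in ("R", "AL", "AN") for c in classes)


def check_label(classes):
    """Return True if the list of Bidi classes satisfies the quoted rules."""
    if not classes:
        return False
    core = strip_trailing_nsm(classes)
    if is_rtl_label(classes):
        return (
            all(c in RTL_ALLOWED for c in classes)              # rule 1
            and classes[0] in ("R", "AL")                       # rule 2
            and bool(core)
            and core[-1] in ("R", "AL", "EN", "AN")             # rule 3
            and not ("EN" in classes and "AN" in classes)       # rule 4
        )
    # Non-RTL label: rule 3 has two alternatives for the end of the label.
    ends_ok = (bool(core) and core[-1] in ("L", "EN")) or classes[-2:] == ["EN", "ET"]
    return (
        all(c in NON_RTL_ALLOWED for c in classes)              # rule 1
        and classes[0] == "L"                                   # rule 2
        and ends_ok                                             # rule 3
    )
```

Under this sketch check_label(["R", "CS", "R"]) is accepted while
check_label(["R", "CS"]) is rejected: CS and ET pass inside a label but
not at its ends, apart from the explicit "EN followed by ET" ending allowed
for non-RTL labels.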
Thank you again - I have now implemented this algorithm and compared the 
result for the "Character Grouping Requirement" up to a length of 3 
characters (my perl code is chugging on longer strings as we speak).

I hope Erik can take a look at the "Label Uniqueness Requirement", which 
I don't have code to test for.

The difference between the two algorithms seems to be that your proposal 
allows CS and ET within a label, but not at the ends. Was this an 
intentional difference?

                  Harald