comments on draft-ietf-idnabis-bidi

Vint Cerf vint at google.com
Tue Feb 10 15:36:39 CET 2009


thanks for these precise comments, Mati.

Harald, I hope you can assess and incorporate as appropriate into a  
revised draft.

vint


Vint Cerf
Google
1818 Library Street, Suite 400
Reston, VA 20190
202-370-5637
vint at google.com




On Feb 10, 2009, at 3:28 AM, Matitiahu Allouche wrote:

>
> My attention was recently drawn to the subject document (version  
> 03) and I have a number of comments.  Some of them are very minor  
> (typos, editorial) and reflect my pedantic mind, but I thought that  
> I might as well help improve the form of the document.  Other
> comments are more substantive, and I would appreciate it if they
> were considered seriously.
>
> 1) In section 2, first paragraph, "satisifes" should be "satisfies".
>
> 2) Section 2, rule 1 mentions the "Character Grouping requirement"  
> for the first time in the document.  Either there should be a  
> forward reference to section 3 where it will be explained, or  
> (better, in my opinion), the content of the current section 3  
> should precede the content of the current section 2.
>
> 3) In the sentence "ET is excluded because the string L ET does not  
> satisfy the Character Grouping requirement.", "L" seems to  
> represent a label, but can easily be confused with the L Bidi  
> property (all the more since it is adjacent to ET which surely  
> represents a character with the ET Bidi property).
>
> 4) In the sentence "CS is excluded because the string L CS does not  
> satisfy the Character Grouping requirement.", "L" seems to  
> represent a label, but can easily be confused with the L Bidi  
> property (all the more since it is adjacent to CS which surely  
> represents a character with the CS Bidi property).
>
> 5) I see no reason why CS is excluded while ES is allowed.  Both
> can cause the same kind of violation of the Character Grouping
> requirement.  ES characters are excluded from the first and last
> positions by rules 2 and 3.  Under the same restriction (exclusion
> from the first and last positions), CS and ET characters can be
> allowed and will not violate the Character Grouping requirement
> any more than ES characters do.
>
> 6) In section 1.1, there appears the following statement: "This  
> specification is not intended to place any requirements on domain  
> names that do not contain right-to-left characters."
> Also the title of section 2 is "A replacement for the RFC 3454 BIDI  
> rule" which implies that the text only deals with "Bidi" labels.
> If that means that the specification applies only to labels which  
> contain at least one character with Bidi property R, AL or AN, and  
> we combine that with rule 4 "If an R, AL or AN is present, no L may  
> be present.", then an L character can never be part of a Bidi  
> label, and the L should be removed from the list of allowed Bidi  
> properties in rule 1.
>
> 7) In [UAX9], rule X9 says that BN characters must be removed from  
> the displayed text.  Any such invisible character violates the  
> Label Uniqueness requirement.  BN characters must not be allowed by  
> rule 1.
>
> 8) From rules 1, 2, 4, 6 and 7, plus comments 6 and 7 above, it
> follows that the first character of a Bidi label can only be of
> type R or AL.  Such a statement can advantageously replace rules 2,
> 6 and 7.
>
> 9) Rule 5 includes no justification.  While a mixture of AN and EN  
> characters in the same label seems odd and not required in real  
> life situations, it is not clear what requirement would be violated  
> by such a combination.
>
> 10) The rules allow AN or EN digits to appear in the last position
> of a label (contrary to RFC 3454).  Let us consider the
> following examples (where lower case letters represent L characters  
> and upper case letters represent R or AL characters):
>
>    a. network order       = "ABC123.456xyz"
>       display order (LTR) = "123.456CBAxyz"
>       display order (RTL) = "123.456xyzCBA"
>
>    b. network order       = "ABC.456-xyz"
>       display order (LTR) = "456.CBA-xyz"
>       display order (RTL) = "xyz-456.CBA"
>
>    c. network order       = "ABC123.456.xyz"
>       display order (LTR) = "123.456CBA.xyz"
>       display order (RTL) = "xyz.123.456CBA"
>
>    d. network order       = "ABC.456.xyz"
>       display order (LTR) = "456.CBA.xyz"
>       display order (RTL) = "xyz.456.CBA"
>
> Examples a, b and c show very ugly violations of the Character  
> Grouping requirement.  Since the document does not place  
> requirements on non-Bidi labels, any non-Bidi label starting with  
> digits following a Bidi label will cause a Character Grouping  
> violation.  If Bidi labels are restricted from ending with digits  
> (optionally followed by NSMs), then non-Bidi labels which contain  
> only digits (example d) following a Bidi label will not cause a  
> Character Grouping violation.
> Whether this modest benefit justifies imposing such a restriction
> is open to discussion.
>
> 11) Towards the end of section 2, there appears the following  
> sentence: "In a domain name consisting of only labels that pass the  
> test, the requirements of Section 3 are satisfied."
> This is not true for domain names like those in the examples above,
> unless non-Bidi labels are excluded, which is a very severe constraint.
>
> 12) The next sentence says: "In a domain name consisting of only  
> LDH-labels and labels that pass the test, the requirements of  
> Section 3 are satisfied as long as a label that starts with an  
> ASCII digit does not come after a right-to-left label that ends in  
> a digit."
> This is not true.  See example b above.
>
> 13) In section 3, there appears the sentence: "the label "123-456"  
> will have a different display order in an RTL context than in a LTR  
> context."
> This is not true, IMHO.  If the last letter before the label is not  
> an Arabic Letter, it will be displayed as "123-456" both in LTR and  
> RTL context.  If it is an Arabic Letter, it will be displayed as  
> "456-123".
>
> 14) In section 3, there appears the sentence: "The Label Uniqueness  
> property should hold true between LTR paragraphs and RTL  
> paragraphs.  This was shown to be unsound."
> In fact, in all cases where Character Grouping and Label Uniqueness  
> are satisfied for each paragraph direction separately, there will  
> be Label Uniqueness between LTR and RTL paragraphs.
>
> 15) In section 3, since an "unproblematic label" can be a label  
> which satisfies the requirements, the clause "any label S1 and S2  
> that is either a label satisfying the requirements or an  
> unproblematic label" can be shortened to "any label S1 and S2 that  
> is an unproblematic label".
>
> 16) In the formal statement of the Label Uniqueness requirement,  
> there is no provision (or exclusion) for the case where L and L'  
> are identical.
>
> 17) In summary I suggest that the rules in section 2 should be  
> reformulated as below.
>
>    1.  Only characters with the BIDI properties R, AL, AN, EN, ES,
>        CS, ET, ON and NSM are allowed in RTL labels.
>
>    2.  The first position must be a character with Bidi property R
>        or AL.
>
>    3.  The last position must be a character with Bidi property R or
>        AL, followed by zero or more NSM.
>
>    3 variant.  The last position must be a character with Bidi
>        property R, AL, EN or AN, followed by zero or more NSM.
>
>    4 (debatable).  If an EN is present, no AN may be present, and
>        vice versa.
>
> It can be seen that this formulation is quite close to that in RFC  
> 3454, while solving all the problems that the subject document aims  
> to solve.
>
>
> Shalom (Regards),  Mati
>           Bidi Architect
>           Globalization Center Of Competency - Bidirectional Scripts
>           IBM Israel
>           Phone: +972 2 5888802    Fax: +972 2 5870333    Mobile:  
> +972 52 2554160
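
The reformulated rules in comment 17 above could be sketched roughly as follows. This is only an illustration, not text from the draft or from RFC 3454: it assumes Python's standard unicodedata module as the source of Bidi properties (its Unicode version depends on the Python release), and the names is_rtl_label, check_rtl_label and allow_trailing_digits are hypothetical. The flag selects between rule 3 and its variant; rule 4, marked debatable in the comment, is applied unconditionally here.

```python
import unicodedata

# Rule 1 (as proposed in comment 17): Bidi properties allowed in an RTL label.
# L and BN are deliberately absent, following comments 6 and 7.
ALLOWED = {"R", "AL", "AN", "EN", "ES", "CS", "ET", "ON", "NSM"}


def bidi_classes(label):
    """Return the Bidi property of each character in the label."""
    return [unicodedata.bidirectional(ch) for ch in label]


def is_rtl_label(label):
    """Assumption from comment 6: a Bidi label has at least one R, AL or AN."""
    return any(c in {"R", "AL", "AN"} for c in bidi_classes(label))


def check_rtl_label(label, allow_trailing_digits=False):
    """Check a label against the reformulated rules 1 to 4.

    allow_trailing_digits=True selects the "3 variant" rule, which also
    accepts EN or AN in the last position.
    """
    classes = bidi_classes(label)
    if not classes:
        return False

    # Rule 1: only the allowed Bidi properties may appear.
    if any(c not in ALLOWED for c in classes):
        return False

    # Rule 2: the first character must be R or AL.
    if classes[0] not in {"R", "AL"}:
        return False

    # Rule 3: skip trailing NSMs, then inspect the last remaining character.
    end = len(classes)
    while end > 0 and classes[end - 1] == "NSM":
        end -= 1
    last_ok = {"R", "AL"}
    if allow_trailing_digits:
        last_ok = last_ok | {"EN", "AN"}
    if end == 0 or classes[end - 1] not in last_ok:
        return False

    # Rule 4 (debatable): EN and AN must not be mixed in one label.
    if "EN" in classes and "AN" in classes:
        return False

    return True
```

A caller would typically apply check_rtl_label only to labels for which is_rtl_label returns true, since the draft states that it places no requirements on names without right-to-left characters.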
