<br><font size=2 face="sans-serif">&nbsp; &nbsp;Hello, Alireza!</font>

<br>

<br><font size=2 face="sans-serif">Thank you for your input. However, I

am afraid that I did not fully grasp what you meant.</font>

<br>

<br><font size=2 face="sans-serif">Can you give examples where there can

be visual confusion while passing the bidi rules that I proposed?</font>

<br>

<br><font size=2 face="sans-serif">Can you suggest alternate rules instead

of those that I proposed?</font>

<br>

<br><font size=2 face="sans-serif">I look forward to better understanding

your point of view.</font>

<br><font size=2 face="sans-serif"><br>

Shalom (Regards), &nbsp;Mati<br>

 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Bidi Architect<br>

 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Globalization Center Of Competency

- Bidirectional Scripts<br>

 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; IBM Israel<br>

 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Phone: +972 2 5888802 &nbsp; &nbsp;Fax:

+972 2 5870333 &nbsp; &nbsp;Mobile: +972 52 2554160<br>

</font>

<br>

<br>

<br>

<table width=100%>

<tr valign=top>

<td width=40%><font size=1 face="sans-serif"><b>Alireza Saleh &lt;saleh@nic.ir&gt;</b>

</font>

<p><font size=1 face="sans-serif">10/02/2009 19:27</font>

<td width=59%>

<table width=100%>

<tr valign=top>

<td>

<div align=right><font size=1 face="sans-serif">To</font></div>

<td><font size=1 face="sans-serif">Matitiahu Allouche/Israel/IBM@IBMIL</font>

<tr valign=top>

<td>

<div align=right><font size=1 face="sans-serif">cc</font></div>

<td><font size=1 face="sans-serif">Vint Cerf &lt;vint@google.com&gt;, idna-update@alvestrand.no</font>

<tr valign=top>

<td>

<div align=right><font size=1 face="sans-serif">Subject</font></div>

<td><font size=1 face="sans-serif">Re: comments on draft-ietf-idnabis-bidi</font></table>

<br>


<br></table>

<br>

<br>

<br><tt><font size=2>Dear Mati,<br>

<br>

<br>

Thanks for your comments. Your suggestion would lead the BIDI draft to <br>
impose more restrictive rules on languages that use bidi characters. As long as <br>
there are no intra-label checks in the protocol documents, character <br>
reordering and visual confusion remain possible. Consider that we have <br>
the L, R, AN, EN and N character properties, and some rules that are intended <br>
to make the world safe in this situation. What happens if someone <br>
looks at these rules in the absence of some character properties? I think <br>
they would seem very restrictive and unusable in that case. I think that it is <br>
very rare for a TLD to support characters of all properties. For <br>
instance, I think that having an Arabic-script label under an ASCII TLD <br>
or a Hebrew TLD would be strange enough to make users more careful about <br>
what they are browsing. What I suggest as an approach for the protocol <br>
documents is to keep some basic requirements and let the registries <br>
decide on the details.<br>

<br>

<br>

 &gt; &nbsp; 3 variant. &nbsp;The last position must be a character with Bidi property R,<br>
 &gt; &nbsp; &nbsp; &nbsp; AL, EN or AN, followed by zero or more NSM.<br>

<br>

<br>

As I stated earlier, there are a number of examples that can cause visual <br>
confusion and still pass the current -bidi rules.<br>

<br>

 <br>

<br>

Best<br>

<br>

Alireza<br>

<br>

<br>

Vint Cerf wrote:<br>

<br>

&gt; thanks for these precise comments, Mati.<br>

&gt;<br>

&gt; Harald, I hope you can assess and incorporate as appropriate into

a <br>

&gt; revised draft.<br>

&gt;<br>

&gt; vint<br>

&gt;<br>

&gt;<br>

&gt; Vint Cerf<br>

&gt; Google<br>

&gt; 1818 Library Street, Suite 400<br>

&gt; Reston, VA 20190<br>

&gt; 202-370-5637<br>

&gt; vint@google.com &lt;mailto:vint@google.com&gt;<br>

&gt;<br>

&gt;<br>

&gt;<br>

&gt;<br>

&gt; On Feb 10, 2009, at 3:28 AM, Matitiahu Allouche wrote:<br>

&gt;<br>

&gt;&gt;<br>

&gt;&gt; My attention was recently drawn to the subject document (version

03) <br>

&gt;&gt; and I have a number of comments. Some of them are very minor (typos, <br>
&gt;&gt; editorial) and reflect my pedantic mind, but I thought that I might <br>
&gt;&gt; as well help improve the form of the document. Other comments touch <br>
&gt;&gt; more on the essence, and I would appreciate it if they were considered seriously.<br>

&gt;&gt;<br>

&gt;&gt; 1) In section 2, first paragraph, &quot;satisifes&quot; should

be &quot;satisfies&quot;.<br>

&gt;&gt;<br>

&gt;&gt; 2) Section 2, rule 1 mentions the &quot;Character Grouping requirement&quot;

<br>

&gt;&gt; for the first time in the document. Either there should be a

forward <br>

&gt;&gt; reference to section 3 where it will be explained, or (better,

in my <br>

&gt;&gt; opinion), the content of the current section 3 should precede

the <br>

&gt;&gt; content of the current section 2.<br>

&gt;&gt;<br>

&gt;&gt; 3) In the sentence &quot;ET is excluded because the string L ET

does not <br>

&gt;&gt; satisfy the Character Grouping requirement.&quot;, &quot;L&quot;

seems to represent <br>

&gt;&gt; a label, but can easily be confused with the L Bidi property (all

the <br>

&gt;&gt; more since it is adjacent to ET which surely represents a character

<br>

&gt;&gt; with the ET Bidi property).<br>

&gt;&gt;<br>

&gt;&gt; 4) In the sentence &quot;CS is excluded because the string L CS

does not <br>

&gt;&gt; satisfy the Character Grouping requirement.&quot;, &quot;L&quot;

seems to represent <br>

&gt;&gt; a label, but can easily be confused with the L Bidi property (all

the <br>

&gt;&gt; more since it is adjacent to CS which surely represents a character

<br>

&gt;&gt; with the CS Bidi property).<br>

&gt;&gt;<br>

&gt;&gt; 5) I see no reason why CS is excluded while ES is allowed. Both can <br>
&gt;&gt; be the source of the same kind of violation of the Character <br>
&gt;&gt; Grouping requirement. ES characters are excluded from the first and <br>
&gt;&gt; last positions by rules 2 and 3. With the same restrictions <br>
&gt;&gt; (exclusion from the first and last positions), CS and ET characters <br>
&gt;&gt; can be allowed and will not violate the Character Grouping <br>
&gt;&gt; requirement any more than ES characters.<br>

&gt;&gt;<br>

&gt;&gt; 6) In section 1.1, there appears the following statement: &quot;This

<br>

&gt;&gt; specification is not intended to place any requirements on domain

<br>

&gt;&gt; names that do not contain right-to-left characters.&quot;<br>

&gt;&gt; Also the title of section 2 is &quot;A replacement for the RFC

3454 BIDI <br>

&gt;&gt; rule&quot; which implies that the text only deals with &quot;Bidi&quot;

labels.<br>

&gt;&gt; If that means that the specification applies only to labels which

<br>

&gt;&gt; contain at least one character with Bidi property R, AL or AN,

and we <br>

&gt;&gt; combine that with rule 4 &quot;If an R, AL or AN is present, no

L may be <br>

&gt;&gt; present.&quot;, then an L character can never be part of a Bidi

label, and <br>

&gt;&gt; the L should be removed from the list of allowed Bidi properties

in <br>

&gt;&gt; rule 1.<br>

&gt;&gt;<br>

&gt;&gt; 7) In [UAX9], rule X9 says that BN characters must be removed

from <br>

&gt;&gt; the displayed text. Any such invisible character violates the Label <br>
&gt;&gt; Uniqueness requirement. BN characters must not be allowed by rule 1.<br>

&gt;&gt;<br>

&gt;&gt; 8) From rules 1, 2, 4, 6 and 7, plus comments 6 and 7 above, it <br>
&gt;&gt; follows that the first character of a Bidi label can only be of type <br>
&gt;&gt; R or AL. Such a statement can advantageously replace rules 2, 6 and 7.<br>

&gt;&gt;<br>

&gt;&gt; 9) Rule 5 includes no justification. While a mixture of AN and

EN <br>

&gt;&gt; characters in the same label seems odd and not required in real

life <br>

&gt;&gt; situations, it is not clear what requirement would be violated

by <br>

&gt;&gt; such a combination.<br>

&gt;&gt;<br>

&gt;&gt; 10) The rules allow AN or EN digits to appear in the last position

of <br>

&gt;&gt; a label (contrary to RFC 3454). Let us consider the following

<br>

&gt;&gt; examples (where lower case letters represent L characters and

upper <br>

&gt;&gt; case letters represent R or AL characters):<br>

&gt;&gt;<br>

&gt;&gt; &nbsp; &nbsp;a. network order = &quot;ABC123.456xyz&quot; &nbsp;display order (LTR) = <br>
&gt;&gt; &quot;123.456CBAxyz&quot; &nbsp;display order (RTL) = &quot;123.456xyzCBA&quot;<br>
&gt;&gt;<br>
&gt;&gt; &nbsp; &nbsp;b. network order = &quot;ABC.456-xyz&quot; &nbsp;display order (LTR) = <br>
&gt;&gt; &quot;456.CBA-xyz&quot; &nbsp;display order (RTL) = &quot;xyz-456.CBA&quot;<br>
&gt;&gt;<br>
&gt;&gt; &nbsp; &nbsp;c. network order = &quot;ABC123.456.xyz&quot; &nbsp;display order (LTR) = <br>
&gt;&gt; &quot;123.456CBA.xyz&quot; &nbsp;display order (RTL) = &quot;xyz.123.456CBA&quot;<br>
&gt;&gt;<br>
&gt;&gt; &nbsp; &nbsp;d. network order = &quot;ABC.456.xyz&quot; &nbsp;display order (LTR) = <br>
&gt;&gt; &quot;456.CBA.xyz&quot; &nbsp;display order (RTL) = &quot;xyz.456.CBA&quot;<br>

&gt;&gt;<br>

&gt;&gt; Examples a, b and c show very ugly violations of the Character

<br>

&gt;&gt; Grouping requirement. Since the document does not place requirements <br>
&gt;&gt; on non-Bidi labels, any non-Bidi label starting with digits following <br>
&gt;&gt; a Bidi label will cause a Character Grouping violation. If Bidi

<br>

&gt;&gt; labels are restricted from ending with digits (optionally followed

by <br>

&gt;&gt; NSMs), then non-Bidi labels which contain only digits (example

d) <br>

&gt;&gt; following a Bidi label will not cause a Character Grouping violation.<br>

&gt;&gt; Whether this modest benefit justifies imposing such a restriction

is <br>

&gt;&gt; subject to discussion.<br>

&gt;&gt;<br>

&gt;&gt; 11) Towards the end of section 2, there appears the following

<br>

&gt;&gt; sentence: &quot;In a domain name consisting of only labels that

pass the <br>

&gt;&gt; test, the requirements of Section 3 are satisfied.&quot;<br>

&gt;&gt; This is not true for domain names like in the examples above,

unless <br>

&gt;&gt; non-Bidi labels are excluded, which is a very hard constraint.<br>

&gt;&gt;<br>

&gt;&gt; 12) The next sentence says: &quot;In a domain name consisting

of only <br>

&gt;&gt; LDH-labels and labels that pass the test, the requirements of

Section <br>

&gt;&gt; 3 are satisfied as long as a label that starts with an ASCII digit

<br>

&gt;&gt; does not come after a right-to-left label that ends in a digit.&quot;<br>

&gt;&gt; This is not true. See example b above.<br>

&gt;&gt;<br>

&gt;&gt; 13) In section 3, there appears the sentence: &quot;the label

&quot;123-456&quot; <br>

&gt;&gt; will have a different display order in an RTL context than in

a LTR <br>

&gt;&gt; context.&quot;<br>

&gt;&gt; This is not true, IMHO. If the last letter before the label is not <br>
&gt;&gt; an Arabic Letter, it will be displayed as &quot;123-456&quot; in both LTR and <br>
&gt;&gt; RTL contexts. If it is an Arabic Letter, it will be displayed as <br>
&gt;&gt; &quot;456-123&quot;.<br>

&gt;&gt;<br>

&gt;&gt; 14) In section 3, there appears the sentence: &quot;The Label

Uniqueness <br>

&gt;&gt; property should hold true between LTR paragraphs and RTL paragraphs.

<br>

&gt;&gt; This was shown to be unsound.&quot;<br>

&gt;&gt; In fact, in all cases where Character Grouping and Label Uniqueness

<br>

&gt;&gt; are satisfied for each paragraph direction separately, there will

be <br>

&gt;&gt; Label Uniqueness between LTR and RTL paragraphs.<br>

&gt;&gt;<br>

&gt;&gt; 15) In section 3, since an &quot;unproblematic label&quot; can

be a label which <br>

&gt;&gt; satisfies the requirements, the clause &quot;any label S1 and

S2 that is <br>

&gt;&gt; either a label satisfying the requirements or an unproblematic

label&quot; <br>

&gt;&gt; can be shortened to &quot;any label S1 and S2 that is an unproblematic

<br>

&gt;&gt; label&quot;.<br>

&gt;&gt;<br>

&gt;&gt; 16) In the formal statement of the Label Uniqueness requirement,

<br>

&gt;&gt; there is no provision (or exclusion) for the case where L and

L' are <br>

&gt;&gt; identical.<br>

&gt;&gt;<br>

&gt;&gt; 17) In summary I suggest that the rules in section 2 should be

<br>

&gt;&gt; reformulated as below.<br>

&gt;&gt;<br>

&gt;&gt; &nbsp; 1. &nbsp;Only characters with the BIDI properties R, AL, AN, EN, ES,<br>
&gt;&gt; &nbsp; &nbsp; &nbsp; CS, ET, ON and NSM are allowed in RTL labels.<br>
&gt;&gt;<br>
&gt;&gt; &nbsp; 2. &nbsp;The first position must be a character with Bidi property R or AL.<br>
&gt;&gt;<br>
&gt;&gt; &nbsp; 3. &nbsp;The last position must be a character with Bidi property R or AL,<br>
&gt;&gt; &nbsp; &nbsp; &nbsp; followed by zero or more NSM.<br>
&gt;&gt;<br>
&gt;&gt; &nbsp; 3 variant. &nbsp;The last position must be a character with Bidi property R,<br>
&gt;&gt; &nbsp; &nbsp; &nbsp; AL, EN or AN, followed by zero or more NSM.<br>
&gt;&gt;<br>
&gt;&gt; &nbsp; 4 (debatable). &nbsp;If an EN is present, no AN may be present, and vice<br>
&gt;&gt; &nbsp; &nbsp; &nbsp; versa.<br>

&gt;&gt;<br>

&gt;&gt; It can be seen that this formulation is quite close to that in

RFC <br>

&gt;&gt; 3454, while solving all the problems that the subject document

aims <br>

&gt;&gt; to solve.<br>

&gt;&gt;<br>

&gt;&gt;<br>

&gt;&gt; Shalom (Regards), &nbsp;Mati<br>
&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Bidi Architect<br>
&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Globalization Center Of Competency - Bidirectional Scripts<br>
&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; IBM Israel<br>
&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Phone: +972 2 5888802 &nbsp; &nbsp;Fax: +972 2 5870333 &nbsp; &nbsp;Mobile: +972 52 2554160<br>

&gt;&gt; _______________________________________________<br>

&gt;&gt; Idna-update mailing list<br>

&gt;&gt; Idna-update@alvestrand.no &lt;mailto:Idna-update@alvestrand.no&gt;<br>

&gt;&gt; http://www.alvestrand.no/mailman/listinfo/idna-update<br>

&gt;<br>


<br>

</font></tt>

<br>
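<p><font size=2 face="sans-serif">The reformulated rules proposed in comment 17 of the quoted message can be sketched as a small checker. This is only an illustrative sketch, not text from the draft or from any IDNA implementation: the function names <tt>is_rtl_label</tt> and <tt>check_rtl_label</tt> and the two keyword flags are hypothetical, and the Bidi properties come from Python's standard <tt>unicodedata.bidirectional</tt>, whose values depend on the Unicode version bundled with the interpreter. It checks a single label only, so it does not detect the inter-label effects shown in examples a to d.</font></p>

```python
import unicodedata

# Rule 1 of the proposal: Bidi properties allowed in an RTL label.
ALLOWED = {"R", "AL", "AN", "EN", "ES", "CS", "ET", "ON", "NSM"}


def bidi_classes(label):
    """Return the Bidi property of every character in the label."""
    return [unicodedata.bidirectional(ch) for ch in label]


def is_rtl_label(label):
    """Comment 6: a label is a Bidi label if it contains R, AL or AN."""
    return any(c in {"R", "AL", "AN"} for c in bidi_classes(label))


def check_rtl_label(label, allow_trailing_digits=False,
                    forbid_mixed_digits=True):
    """Return a list of rule violations (empty if the label passes).

    allow_trailing_digits selects the "3 variant" rule (hypothetical flag).
    forbid_mixed_digits enables the "debatable" rule 4 (hypothetical flag).
    """
    classes = bidi_classes(label)
    if not classes:
        return ["empty label"]
    problems = []

    # Rule 1: only the allowed Bidi properties may appear.
    for ch, cls in zip(label, classes):
        if cls not in ALLOWED:
            problems.append(f"rule 1: U+{ord(ch):04X} has property {cls}")

    # Rule 2: the first character must be R or AL.
    if classes[0] not in {"R", "AL"}:
        problems.append(f"rule 2: label starts with {classes[0]}")

    # Rule 3: skip trailing NSMs, then look at the last base character.
    i = len(classes) - 1
    while i > 0 and classes[i] == "NSM":
        i -= 1
    last_ok = {"R", "AL"}
    if allow_trailing_digits:
        last_ok |= {"EN", "AN"}
    if classes[i] not in last_ok:
        problems.append(f"rule 3: label ends with {classes[i]}")

    # Rule 4 (debatable): EN and AN must not both be present.
    if forbid_mixed_digits and "EN" in classes and "AN" in classes:
        problems.append("rule 4: label mixes EN and AN digits")

    return problems


if __name__ == "__main__":
    hebrew = "\u05D0\u05D1\u05D2"          # three Hebrew letters (R)
    print(check_rtl_label(hebrew))          # passes all rules
    print(check_rtl_label(hebrew + "1"))    # fails rule 3 unless variant
    print(check_rtl_label(hebrew + "1", allow_trailing_digits=True))
```

<p><font size=2 face="sans-serif">Returning a list of violations rather than a boolean makes it easier to see which of the proposed rules a given label fails, which is the kind of concrete example requested in the reply above.</font></p>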