<br><font size=2 face="sans-serif"> Hello, Alireza!</font>
<br>
<br><font size=2 face="sans-serif">Thank you for your input. However, I
am afraid that I did not fully grasp what you meant.</font>
<br>
<br><font size=2 face="sans-serif">Can you give examples where there can
be visual confusion while passing the bidi rules that I proposed?</font>
<br>
<br><font size=2 face="sans-serif">Can you suggest alternate rules instead
of those that I proposed?</font>
<br>
<br><font size=2 face="sans-serif">I look forward to better understanding
your point of view.</font>
<br><font size=2 face="sans-serif"><br>
Shalom (Regards), Mati<br>
Bidi Architect<br>
Globalization Center Of Competency
- Bidirectional Scripts<br>
IBM Israel<br>
Phone: +972 2 5888802 Fax:
+972 2 5870333 Mobile: +972 52 2554160<br>
</font>
<br>
<br>
<br>
<table width=100%>
<tr valign=top>
<td width=40%><font size=1 face="sans-serif"><b>Alireza Saleh &lt;saleh@nic.ir&gt;</b>
</font>
<p><font size=1 face="sans-serif">10/02/2009 19:27</font>
<td width=59%>
<table width=100%>
<tr valign=top>
<td>
<div align=right><font size=1 face="sans-serif">To</font></div>
<td><font size=1 face="sans-serif">Matitiahu Allouche/Israel/IBM@IBMIL</font>
<tr valign=top>
<td>
<div align=right><font size=1 face="sans-serif">cc</font></div>
<td><font size=1 face="sans-serif">Vint Cerf &lt;vint@google.com&gt;, idna-update@alvestrand.no</font>
<tr valign=top>
<td>
<div align=right><font size=1 face="sans-serif">Subject</font></div>
<td><font size=1 face="sans-serif">Re: comments on draft-ietf-idnabis-bidi</font></table>
<br>
<br></table>
<br>
<br>
<br><tt><font size=2>Dear Mati,<br>
<br>
<br>
Thanks for your comments. Your suggestion will lead the BIDI draft to <br>
impose more restrictive rules on languages that use -bidi characters. As <br>
long as there are no intra-label checks in the protocol documents, <br>
character re-ordering and visual confusion are possible. Consider that we <br>
have the L, R, AN, EN and N character properties, and that there are some <br>
rules intended to make the world safe in this situation. What happens if <br>
someone looks at these rules in the absence of some of those character <br>
properties? I think the rules seem very restrictive and unusable in that <br>
case. I think that it is very rare for a TLD to support characters of all <br>
properties. For instance, I think that having an Arabic-script label <br>
under an ASCII TLD or a Hebrew TLD will be strange enough to make users <br>
more careful about what they are browsing. What I suggest as an approach <br>
for the protocol documents is to keep some basic requirements and let the <br>
registries decide about the details.<br>
<br>
<br>
> &nbsp; 3 variant. The last position must be a character with Bidi property R,<br>
> &nbsp; &nbsp; &nbsp;AL, EN or AN, followed by zero or more NSM.<br>
<br>
<br>
There are a number of examples, as I stated earlier, that can cause <br>
visual confusion and yet pass the current -bidi rules.<br>
<br>
<br>
<br>
Best<br>
<br>
Alireza<br>
<br>
<br>
Vint Cerf wrote:<br>
<br>
> thanks for these precise comments, Mati.<br>
><br>
> Harald, I hope you can assess and incorporate as appropriate into
a <br>
> revised draft.<br>
><br>
> vint<br>
><br>
><br>
> Vint Cerf<br>
> Google<br>
> 1818 Library Street, Suite 400<br>
> Reston, VA 20190<br>
> 202-370-5637<br>
> vint@google.com &lt;mailto:vint@google.com&gt;<br>
><br>
><br>
><br>
><br>
> On Feb 10, 2009, at 3:28 AM, Matitiahu Allouche wrote:<br>
><br>
>><br>
>> My attention was recently drawn to the subject document (version
03) <br>
>> and I have a number of comments. Some of them are very minor
(typos, <br>
>> editorial) and reflect my pedantic mind, but I thought that I
could <br>
>> as well help improve the form of the document. Other comments
touch <br>
>> more on the essence, and I would appreciate their being considered seriously.<br>
>><br>
>> 1) In section 2, first paragraph, "satisifes" should
be "satisfies".<br>
>><br>
>> 2) Section 2, rule 1 mentions the "Character Grouping requirement"
<br>
>> for the first time in the document. Either there should be a
forward <br>
>> reference to section 3 where it will be explained, or (better,
in my <br>
>> opinion), the content of the current section 3 should precede
the <br>
>> content of the current section 2.<br>
>><br>
>> 3) In the sentence "ET is excluded because the string L ET
does not <br>
>> satisfy the Character Grouping requirement.", "L"
seems to represent <br>
>> a label, but can easily be confused with the L Bidi property (all
the <br>
>> more since it is adjacent to ET which surely represents a character
<br>
>> with the ET Bidi property).<br>
>><br>
>> 4) In the sentence "CS is excluded because the string L CS
does not <br>
>> satisfy the Character Grouping requirement.", "L"
seems to represent <br>
>> a label, but can easily be confused with the L Bidi property (all
the <br>
>> more since it is adjacent to CS which surely represents a character
<br>
>> with the CS Bidi property).<br>
>><br>
>> 5) I see no reason why CS is excluded while ES is allowed. Both
can <br>
>> be the source of the same kind of violation of the Character
<br>
>> Grouping requirement. ES characters are excluded from the first
and <br>
>> last positions by rules 2 and 3. With the same restrictions
<br>
>> (exclusion from the first and last positions), CS and ET characters
<br>
>> can be allowed and will not violate the Character Grouping <br>
>> requirement any more than ES characters.<br>
>><br>
>> 6) In section 1.1, there appears the following statement: "This
<br>
>> specification is not intended to place any requirements on domain
<br>
>> names that do not contain right-to-left characters."<br>
>> Also the title of section 2 is "A replacement for the RFC
3454 BIDI <br>
>> rule" which implies that the text only deals with "Bidi"
labels.<br>
>> If that means that the specification applies only to labels which
<br>
>> contain at least one character with Bidi property R, AL or AN,
and we <br>
>> combine that with rule 4 "If an R, AL or AN is present, no
L may be <br>
>> present.", then an L character can never be part of a Bidi
label, and <br>
>> the L should be removed from the list of allowed Bidi properties
in <br>
>> rule 1.<br>
>><br>
>> 7) In [UAX9], rule X9 says that BN characters must be removed
from <br>
>> the displayed text. Any such invisible character violates the
Label <br>
>> Uniqueness requirement. BN characters must not be allowed by
rule 1.<br>
>><br>
>> 8) From rules 1, 2, 4, 6 and 7, plus our comments 6 and 7 above,
it <br>
>> results that the first character of a Bidi label can only be of
type <br>
>> R or AL. Such a statement can advantageously replace rules 2,
6 and 7.<br>
>><br>
>> 9) Rule 5 includes no justification. While a mixture of AN and
EN <br>
>> characters in the same label seems odd and not required in real
life <br>
>> situations, it is not clear what requirement would be violated
by <br>
>> such a combination.<br>
>><br>
>> 10) The rules allow AN or EN digits to appear in the last position
of <br>
>> a label (in opposition to RFC 3454). Let us consider the following
<br>
>> examples (where lower case letters represent L characters and
upper <br>
>> case letters represent R or AL characters):<br>
>><br>
>> &nbsp; &nbsp;a. network order = "ABC123.456xyz" &nbsp;display order
(LTR) = <br>
>> "123.456CBAxyz" &nbsp;display order (RTL) = "123.456xyzCBA"<br>
>><br>
>> &nbsp; &nbsp;b. network order = "ABC.456-xyz" &nbsp;display order
(LTR) = <br>
>> "456.CBA-xyz" &nbsp;display order (RTL) = "xyz-456.CBA"<br>
>><br>
>> &nbsp; &nbsp;c. network order = "ABC123.456.xyz" &nbsp;display order
(LTR) = <br>
>> "123.456CBA.xyz" &nbsp;display order (RTL) = "xyz.123.456CBA"<br>
>><br>
>> &nbsp; &nbsp;d. network order = "ABC.456.xyz" &nbsp;display order
(LTR) = <br>
>> "456.CBA.xyz" &nbsp;display order (RTL) = "xyz.456.CBA"<br>
>><br>
>> Examples a, b and c show very ugly violations of the Character
<br>
>> Grouping requirement. Since the document does not place requirements
<br>
>> on non-Bidi labels, any non-Bidi label starting with digits following
<br>
>> a Bidi label will cause a Character Grouping violation. If Bidi
<br>
>> labels are restricted from ending with digits (optionally followed
by <br>
>> NSMs), then non-Bidi labels which contain only digits (example
d) <br>
>> following a Bidi label will not cause a Character Grouping violation.<br>
>> Whether this modest benefit justifies imposing such a restriction
is <br>
>> subject to discussion.<br>
>><br>
>> 11) Towards the end of section 2, there appears the following
<br>
>> sentence: "In a domain name consisting of only labels that
pass the <br>
>> test, the requirements of Section 3 are satisfied."<br>
>> This is not true for domain names like in the examples above,
unless <br>
>> non-Bidi labels are excluded, which is a very hard constraint.<br>
>><br>
>> 12) The next sentence says: "In a domain name consisting
of only <br>
>> LDH-labels and labels that pass the test, the requirements of
Section <br>
>> 3 are satisfied as long as a label that starts with an ASCII digit
<br>
>> does not come after a right-to-left label that ends in a digit."<br>
>> This is not true. See example b above.<br>
>><br>
>> 13) In section 3, there appears the sentence: "the label
"123-456" <br>
>> will have a different display order in an RTL context than in
a LTR <br>
>> context."<br>
>> This is not true, IMHO. If the last letter before the label
is not <br>
>> an Arabic Letter, it will be displayed as "123-456"
both in LTR and <br>
>> RTL context. If it is an Arabic Letter, it will be displayed
as <br>
>> "456-123".<br>
>><br>
>> 14) In section 3, there appears the sentence: "The Label
Uniqueness <br>
>> property should hold true between LTR paragraphs and RTL paragraphs.
<br>
>> This was shown to be unsound."<br>
>> In fact, in all cases where Character Grouping and Label Uniqueness
<br>
>> are satisfied for each paragraph direction separately, there will
be <br>
>> Label Uniqueness between LTR and RTL paragraphs.<br>
>><br>
>> 15) In section 3, since an "unproblematic label" can
be a label which <br>
>> satisfies the requirements, the clause "any label S1 and
S2 that is <br>
>> either a label satisfying the requirements or an unproblematic
label" <br>
>> can be shortened to "any label S1 and S2 that is an unproblematic
<br>
>> label".<br>
>><br>
>> 16) In the formal statement of the Label Uniqueness requirement,
<br>
>> there is no provision (or exclusion) for the case where L and
L' are <br>
>> identical.<br>
>><br>
>> 17) In summary I suggest that the rules in section 2 should be
<br>
>> reformulated as below.<br>
>><br>
>> &nbsp; 1. Only characters with the BIDI properties R, AL, AN,
EN, ES,<br>
>> &nbsp; &nbsp; &nbsp;CS, ET, ON and NSM are allowed in RTL labels.<br>
>><br>
>> &nbsp; 2. The first position must be a character with Bidi property
R or AL.<br>
>><br>
>> &nbsp; 3. The last position must be a character with Bidi property
R or AL,<br>
>> &nbsp; &nbsp; &nbsp;followed by zero or more NSM.<br>
>><br>
>> &nbsp; 3 variant. The last position must be a character with Bidi
property R,<br>
>> &nbsp; &nbsp; &nbsp;AL, EN or AN, followed by zero or more NSM.<br>
>><br>
>> &nbsp; 4 (debatable). If an EN is present, no AN may be present,
and vice<br>
>> &nbsp; &nbsp; &nbsp;versa.<br>
>><br>
>> It can be seen that this formulation is quite close to that in
RFC <br>
>> 3454, while solving all the problems that the subject document
aims <br>
>> to solve.<br>
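<br>
[Illustrative note, not part of the original message: the reformulated rules quoted above could be checked mechanically along the following lines. This is only a minimal sketch under stated assumptions. The function name <tt>check_rtl_label</tt> and its two flags are hypothetical; the Bidi classes come from Python's standard <tt>unicodedata.bidirectional</tt>, whose answers depend on the Unicode version shipped with the interpreter; and the input is assumed to be a single label in its Unicode form. The "3 variant" and the "debatable" rule 4 are exposed as options because the message leaves both open.]<br>

```python
import unicodedata

# Bidi classes permitted anywhere in an RTL label (proposed rule 1).
ALLOWED = {"R", "AL", "AN", "EN", "ES", "CS", "ET", "ON", "NSM"}
# Strong right-to-left classes required at the start (proposed rule 2).
RTL_STRONG = {"R", "AL"}


def check_rtl_label(label, allow_trailing_digits=False, forbid_mixed_digits=True):
    """Return the names of the violated rules; an empty list means a pass.

    allow_trailing_digits=True selects the "3 variant" (EN or AN may end
    the label); forbid_mixed_digits toggles the "debatable" rule 4.
    Both parameter names are hypothetical, chosen for this sketch.
    """
    if not label:
        return ["empty label"]

    classes = [unicodedata.bidirectional(ch) for ch in label]
    violations = []

    # Rule 1: every character must carry one of the allowed Bidi classes.
    if any(c not in ALLOWED for c in classes):
        violations.append("rule 1")

    # Rule 2: the first character must be R or AL.
    if classes[0] not in RTL_STRONG:
        violations.append("rule 2")

    # Rule 3: skip trailing NSM, then inspect the last remaining character.
    i = len(classes) - 1
    while i >= 0 and classes[i] == "NSM":
        i -= 1
    last_ok = RTL_STRONG | ({"EN", "AN"} if allow_trailing_digits else set())
    if i < 0 or classes[i] not in last_ok:
        violations.append("rule 3")

    # Rule 4 (debatable): EN and AN must not be mixed in one label.
    if forbid_mixed_digits and "EN" in classes and "AN" in classes:
        violations.append("rule 4")

    return violations
```

[For example, a label of three Hebrew letters followed by ASCII digits fails only rule 3 in the strict form and passes when <tt>allow_trailing_digits=True</tt> is given. The sketch looks at one label at a time, so it says nothing about the cross-label effects shown in examples a to d.]<br>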
>><br>
>><br>
>> Shalom (Regards), Mati<br>
>> &nbsp; &nbsp; &nbsp; Bidi Architect<br>
>> &nbsp; &nbsp; &nbsp; Globalization Center Of Competency - Bidirectional
Scripts<br>
>> &nbsp; &nbsp; &nbsp; IBM Israel<br>
>> &nbsp; &nbsp; &nbsp; Phone: +972 2 5888802 &nbsp; Fax: +972 2 5870333
&nbsp; Mobile: +972 52 2554160<br>
>> _______________________________________________<br>
>> Idna-update mailing list<br>
>> Idna-update@alvestrand.no &lt;mailto:Idna-update@alvestrand.no&gt;<br>
>> http://www.alvestrand.no/mailman/listinfo/idna-update<br>
><br>
> ------------------------------------------------------------------------<br>
><br>
> _______________________________________________<br>
> Idna-update mailing list<br>
> Idna-update@alvestrand.no<br>
> http://www.alvestrand.no/mailman/listinfo/idna-update<br>
> <br>
<br>
</font></tt>
<br>