I sent this almost a month ago, and got no reply. I&#39;m assuming that the lack of response was due to the holidays, and that some discussion of, or response to, these items will be forthcoming soon.<br><br>Mark<br><br><div class="gmail_quote">

On Dec 13, 2007 7:43 PM, Mark Davis &lt;<a href="mailto:mark.davis@icu-project.org">mark.davis@icu-project.org</a>&gt; wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

I&#39;ve collected together comments on the four documents, and tried to organize them for reference. Here is the first set.<br><br><h1><a href="http://www.ietf.org/internet-drafts/draft-alvestrand-idna-bidi-01.txt" target="_blank">

http://www.ietf.org/internet-drafts/draft-alvestrand-idna-bidi-01.txt

</a> </h1>

<p>&nbsp;</p>

<h2>Overall comments: </h2>

<p><br></p>

<p>Well documented, with clear examples justifying the problems to be solved. </p>

<p><br></p>

<h2>Details:</h2><br>

<p>Bidi-1. </p><pre>   Note that Unicode 5.0 is the current version of Unicode.  This fix<br>   refers to Unicode 3.2 only, to maintain consistency with the rest of<br>   RFC 3454.  Nothing here should affect the relationship between

<br>   Unicode versions and IDNA.</pre>

<p>But making it specific to U3.2 *does* tie it to a particular
version. Is the intention for this to modify IDNA2003 before IDNAbis
comes out? That doesn&#39;t seem to be the case for the rest of the
documents. It would be better for it to refer to the version of Unicode
used by IDNA (whatever version that is).</p>

<p><br>

</p>

<p>In the same vein, tying the comment to RFC 3454 is limiting, since the
solution that the document proposes is in the context of IDNAbis,
which does away with stringprep/nameprep. Overall, the document should
take a more generic view of the solution rather than one specific to
stringprep (RFC 3454).</p>

<p><br></p>Bidi-2.&nbsp; <pre>   The following conditions MUST be true in both resulting strings for<br>   the string to be acceptable:</pre><pre>   o  The leftmost and rightmost character of the resulting string in<br>      display order must be a full stop (U+002E)

</pre><pre>   o  No non-spacing mark (NSM) can occur in the second position of the<br>      string (leftmost in L order, rightmost in R order); that is, no<br>      mark can be allowed to attach to the delimiting characters.

</pre><pre>   o  The direction of the leftmost and rightmost characters in the<br>      string (the periods) must be either L or R</pre>

<p>The NSM condition should be part of the main IDNA conditions, not here.</p>

<p><br>

</p>

<p>Bidi-2a.</p>

<p><br>

 </p>

<p>If you really want a test, it would be something like the following:</p>

<p><br>

 </p>

<ol><li>At build time, produce a test set T of characters, one from each
of the BIDI classes that can occur in IDNA (e.g., excluding B,
LRE/O, RLE/O, and PDF). That is roughly 14 characters.</li>
<li>To test a given prospective label L, perform the following steps over
all possible two-character strings X and Y drawn from T. (That is, this would
be about 14^4 iterations.)</li>
<li>Create the string S = X + L + Y.</li>
<li>Apply the BIDI algorithm to S twice, once with an RTL and once with an LTR paragraph
direction.</li>
<li>If, in either result, any of the characters of the label are separated by a character
from X or Y, the test fails.</li></ol>

<br>

However, this should really not be proposed as something that users of

IDNA should do. Instead, it should be used to test that Michel&#39;s

formulation is correct.<br>

<br>
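<p>For anyone who wants to run the Bidi-2a test against a candidate formulation, here is a minimal sketch of the harness in Python. It is only an illustration under stated assumptions: <code>reorder</code> is a hypothetical callable, supplied by whatever BIDI implementation is under test, that takes a string and a paragraph direction and returns the logical indices in visual order. The function names are illustrative, and the isolate classes (LRI, RLI, FSI, PDI) are excluded as an extra assumption because they postdate Unicode 5.0.</p>

```python
import itertools
import unicodedata

# Bidi classes assumed never to appear in an IDNA label: paragraph
# separators plus explicit embedding/override/isolate controls.
EXCLUDED = {"B", "LRE", "LRO", "RLE", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}


def build_test_set():
    """Step 1: pick one BMP character for each remaining bidi class."""
    chosen = {}
    for cp in range(0x10000):
        cls = unicodedata.bidirectional(chr(cp))
        # Unassigned code points report an empty class; skip them.
        if cls and cls not in EXCLUDED and cls not in chosen:
            chosen[cls] = chr(cp)
    return chosen


def label_stays_contiguous(visual_order, start, end):
    """True if logical positions start..end-1 form one unbroken visual run."""
    slots = [i for i, logical in enumerate(visual_order) if start <= logical < end]
    if not slots:
        return True
    return slots[-1] - slots[0] + 1 == len(slots)


def check_label(label, reorder, test_chars):
    """Steps 2-5: return the first (X, Y, rtl) context that splits the label,
    or None if no context does.

    `reorder(text, rtl)` is a hypothetical stand-in for a real BIDI
    implementation; it must return logical indices in visual order.
    """
    contexts = ["".join(p) for p in itertools.product(test_chars, repeat=2)]
    for x, y in itertools.product(contexts, repeat=2):
        s = x + label + y
        for rtl in (False, True):
            order = reorder(s, rtl)
            if not label_stays_contiguous(order, len(x), len(x) + len(label)):
                return (x, y, rtl)
    return None
```

<p>Every label is checked in every two-character context on each side, under both paragraph directions. That cost is why this suits a one-time check of the rules rather than something each IDNA user runs.</p>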

<p>Bidi-3. </p><pre>   We believe that there is a clear likelihood of similar issues<br>   existing with other scripts and languages that are not currently used<br>   extensively with IDNs.  Careful consideration of all the languages

   written in a given script, in consultation with all of the<br>   corresponding speech communities, is therefore needed before we can<br>   say with any degree of certainty that using that script for IDNs is<br>   unproblematic.

</pre>

<p>This is not a bidi issue, and should be in a different document. (See other comments about &quot;speech communities&quot;)</p>

<p><br></p>

<p>Bidi-4.<br></p><pre>   Another set of issues concerns the proper display of IDNs with a<br>   mixture of LTR and RTL labels, or only RTL labels; it is not clear to<br>   these authors what the proper display order of the components of a

   domain name are if the directiion of the components (in network<br>   order) is, for instance, FirstRTL.SecondRTL.LTR - is it<br>   LTRtsriF.LTRdnoceS.LTR or LTRdnoceS.LTRtsrif.LTR?  Again, this memo<br>   does not attempt to suggest a solution to this problem.

</pre>

<p>If the question is: what does the BIDI algorithm do in such cases,

the answer is easy to determine. If the question is whether a user

agent should display a URL in a different order than the BIDI

algorithm, I think that&#39;s beyond the scope of this document. Note that

any attempt to have it display differently requires all text processors

to recognize URLs and handle them specially, with problems of

interoperability and confusion when, inevitably, most of them fail. So

recommending a non-standard display will probably do more harm than

good.</p><br>Bidi-5. <pre>   One particular example of the last case is if a program chooses to<br>   examine the last character (in network order) of a string in order to<br>   determine its directionality, rather than its first; if it finds an

<br>   NSM character and tries to display the string as if it was a left-to-<br>   right string, the resulting display may be interesting, but not<br>   useful.</pre>

<p>I don&#39;t understand this paragraph. When and why would this happen with IDNA-conformant programs? </p><br>

</blockquote></div><br><br clear="all"><br>-- <br>Mark