<br><div class="gmail_quote"><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div class="Ih2E3d">> However, this should really not be proposed as something that users of
<br>> IDNA should do. Instead, it should be used to test that Michel's<br>> formulation is correct.<br></div>Exactly - I want to test the algorithm before proposing one. However, I<br>don't understand what you wrote above:
<br><br>- if taken as written, it would test the string "A1" by embedding it<br>between the strings "ALEPH BET" and "GIMEL DAV", which certainly would<br>cause the test to fail (the "1" would pick up its directionality from
<br>the surrounding RTL characters, and the whole thing would likely display<br>in the order of "1 DAV GIMEL A BET ALEPH" - I don't have my direction<br>calculator with me). So I'm assuming you're thinking of some separators
<br>- which ones?</blockquote><div><br>Ken offered some comments. I was probably not very clear. The purpose is so that if we have<br><br>abc.def.ghi<br><br>that we don't get an ordering like<br><br>abcd.fe.ghi<br><br>
that is, an ordering where characters hop across label boundaries. (Lowercase above doesn't mean just English.) It is OK for the order of the labels to change, or for the order of characters within a label to change, but each label needs to stay intact.<br><br>Now, because the BIDI algorithm has limited scope, we don't need to test all characters, just certain combinations. So the idea is that if we test
<br><br><a href="http://XY.abc.ZW">XY.abc.ZW</a><br><br>for all combinations of X, Y, Z, W (where it makes a difference, so only from a few BIDI categories), we can see whether there is any "hopping". Here "." is a stand-in, because other characters can also delimit labels, such as /, ?, #, ...
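<br><br>To make the check concrete, here is a minimal Python sketch, not a finished test harness. The choice of bidi classes and representative characters is my assumption, as are the helper names (candidate_strings, label_spans, hops). The sketch does not implement the BIDI algorithm: hops() expects the visual order (logical indices listed in left-to-right display order) to come from whatever bidi implementation is under test.<br>
<pre>
```python
from itertools import product

# Hypothetical representatives, one per bidi class of interest.
# Both the class list and the characters are assumptions for illustration.
REPRESENTATIVES = {
    "L": "a",
    "R": "\u05d0",   # HEBREW LETTER ALEF
    "AL": "\u0627",  # ARABIC LETTER ALEF
    "EN": "1",
    "AN": "\u0661",  # ARABIC-INDIC DIGIT ONE
}


def candidate_strings(middle="abc", sep="."):
    """Yield (classes, string) for every XY.abc.ZW combination."""
    for classes in product(REPRESENTATIVES, repeat=4):
        x, y, z, w = (REPRESENTATIVES[c] for c in classes)
        yield classes, x + y + sep + middle + sep + z + w


def label_spans(text, sep="."):
    """Return (start, end) logical index ranges of each label, end exclusive."""
    spans, start = [], 0
    for i, ch in enumerate(text):
        if ch == sep:
            spans.append((start, i))
            start = i + 1
    spans.append((start, len(text)))
    return spans


def hops(text, visual_order, sep="."):
    """Return True if some label's characters are not contiguous on screen.

    visual_order lists logical indices in left-to-right display order.
    It must be supplied by a real bidi implementation (not included here).
    """
    position = {logical: shown for shown, logical in enumerate(visual_order)}
    for start, end in label_spans(text, sep):
        shown = sorted(position[i] for i in range(start, end))
        if shown and shown[-1] - shown[0] != len(shown) - 1:
            return True
    return False
```
</pre>
For the example above, abc.def.ghi displayed as abcd.fe.ghi corresponds to the visual order [0, 1, 2, 4, 3, 6, 5, 7, 8, 9, 10], and hops() reports it as hopping. Reversing the order of the labels, or the order within a label, is not reported. The same check can be rerun with sep set to /, ?, or # to cover the other delimiters.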
<br><br>Is that a bit clearer?<br><br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">It's certainly a bidi issue too; as you know, one of the driving forces
<br>for the clarification here is the problem of Yiddish written in the<br>Hebrew script. But now that this text is safely embedded in "issues",<br>and the decision is made to link this document to "issues", the need for
<br>this text here is much lessened.</blockquote><div><br>It's not a BIDI issue, meaning an issue caused by text directionality. It is an issue that happens in a script that is bidi, but is unconnected with directionality. That's what I meant.
<br><br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">I personally think that recommending a non-standard display is a<br>non-starter. We probably need to reformulate this paragraph as "the
<br>result of the Unicode BIDI algorithm is LTRtsriF.LTRdnoceS.LTR, people<br>may be surprised by that, but we can't fix it". I'll have to test that<br>this is true in all cases before saying it in the document, though...
</blockquote><div><br>good.<br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><br><div class="Ih2E3d">><br>> Bidi-5.<br>> One particular example of the last case is if a program chooses to
<br>> examine the last character (in network order) of a string in order to<br>> determine its directionality, rather than its first; if it finds an<br>><br>> NSM character and tries to display the string as if it was a left-to-
<br>> right string, the resulting display may be interesting, but not<br>> useful.<br>><br>> I don't understand this paragraph. When and why would this happen with<br>> IDNA-conformant programs?<br>
><br></div>I think the text is clear enough - if you get a label "ALEF BET <some<br>NSM character>", an IDNA2003 program can look at the last character in<br>the string and say "this is not a RTL string", and treat it as if it was
<br>LTR. In IDNA2003, that will be a safe assumption. In IDNAx, it will not<br>be a safe assumption.</blockquote><div><br>I find that a bit odd. The case you are taking is:<br><br>A program is looking at an IDNAbis URL, thinks that it is a valid IDNA2003 URL, makes some assumptions about it, and things break.
<br><br>The case that you mention is just the tip of an iceberg. There are a *very* large number of assumptions that a program can make about IDNA2003 that will completely break under IDNAbis (as currently drafted). Many, many things would break, not just this, and not just this in BIDI. So I don't see why you are calling out just this one.
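<br><br>For reference, the specific assumption being discussed can be shown in a few lines of Python. This is only a sketch of the shortcut as I understand it from the quoted text. The function names are mine, and the label is a made-up example (ALEF, BET, then a Hebrew point, which has bidi class NSM).<br>
<pre>
```python
import unicodedata

STRONG_RTL = {"R", "AL"}


def naive_is_rtl(label):
    """Classify a label by its last character only (the shortcut at issue)."""
    return unicodedata.bidirectional(label[-1]) in STRONG_RTL


def contains_rtl(label):
    """Classify a label by scanning every character."""
    return any(unicodedata.bidirectional(ch) in STRONG_RTL for ch in label)


# ALEF, BET, then HEBREW POINT QAMATS (bidi class NSM).
label = "\u05d0\u05d1\u05b8"
print(naive_is_rtl(label), contains_rtl(label))
```
</pre>
The last-character shortcut says the label is not RTL, while a full scan says that it is.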
<br><br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><br><br>Suggestions for a clearer way to state it?<br><font color="#888888"><br>
Harald<br><br></font></blockquote></div><br><br clear="all"><br>-- <br>Mark