<br clear="all">Mark<br>

<br><br><div class="gmail_quote">On Fri, Apr 10, 2009 at 04:01, &quot;Martin J. Dürst&quot; <span dir="ltr">&lt;<a href="mailto:duerst@it.aoyama.ac.jp">duerst@it.aoyama.ac.jp</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

Hello Mark,<br>

<br>

You seem to be right; I forwarded a bug report to Opera.<div class="im"><br>

<br>

On 2009/04/10 12:48, Mark Davis wrote:<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

The M-Label string is \uFECB\uFEAE\uFE91\uFEF2<br>

The U-Label string is \u0639\u0631\u0628\u064A<br>

</blockquote>

<br></div>

I guess U+ notation would be better; nobody needs to know you used Java :-).</blockquote><div><br>I can do that if you think it would be less confusing, or for that matter use &amp;#x...; or \x{...} notation. The only problem with U+ notation is that you have to leave spaces around it when you mix it with unescaped text; the other notations were designed to be intermixed with other text.<br>
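<br>As a rough illustration of the notations mentioned above, here is a minimal Python sketch that writes one string in each of the four forms. The helper name escape_forms and the dictionary keys are hypothetical, chosen only for this example.<br>

```python
def escape_forms(text):
    """Return `text` written in four common escape notations.

    Hypothetical helper for illustration only.
    """
    code_points = [ord(ch) for ch in text]
    # Java-style escapes work on UTF-16 code units, so characters
    # outside the BMP become surrogate pairs.
    utf16 = text.encode("utf-16-be")
    units = [int.from_bytes(utf16[i:i + 2], "big") for i in range(0, len(utf16), 2)]
    return {
        # U+ notation needs separators (spaces here) between items.
        "U+": " ".join(f"U+{cp:04X}" for cp in code_points),
        # The remaining notations are self-delimiting.
        "java": "".join(f"\\u{unit:04X}" for unit in units),
        "html": "".join(f"&#x{cp:X};" for cp in code_points),
        "perl": "".join(f"\\x{{{cp:X}}}" for cp in code_points),
    }


if __name__ == "__main__":
    # The U-Label string quoted in this thread.
    for name, form in escape_forms("\u0639\u0631\u0628\u064A").items():
        print(name, form)
```

The last three forms can be embedded directly in running text, whereas the U+ form relies on the surrounding spaces to mark where each code point ends.<br>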

 <br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div class="im"><br>

<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

The M-Label does map to the U-Label (in IDNA, and in IDNAbis unless we<br>

restrict the mapping to exclude these characters).<br>

</blockquote>

<br></div>

I and others have proposed that we should restrict the mapping for these, unless we get feedback from the Arabic script community otherwise.</blockquote><div><br>I&#39;ll send out another message on this. <br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

<div class="im"><br>

<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

Every conformant browser, where fonts are available, should give essentially<br>

the same rendering for both strings (which is why they are mapped together<br>

by NFKC).<br>

</blockquote>

<br></div>

Well, no. On the one hand, NFKC equivalents often clearly render differently (think of circled numbers, for example). On the other hand, compatibility Arabic only renders the same as the normalized version if the connectivity variants are carefully selected to follow the connectivity rules of the Arabic script. As an example, the NFKC form of<br>


ﺶﺷ is شش, which doesn&#39;t look the same.</blockquote><div><br>That is true, and I didn&#39;t say it was the only reason. The key is that when the presentation forms are used, they&#39;ll typically be used with the right shapes for the particular word. So if someone cuts and pastes a word that looks identical to another, they&#39;ll be confused about why one works and the other doesn&#39;t. I agree that this is not the case for many other NFKC mappings. <br>
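<br>The mappings discussed in this thread can be checked with a short Python sketch using the standard unicodedata module. It covers only the NFKC step, not the full IDNA mapping. The constant names are hypothetical, and the code points for the two-character example (U+FEB6 followed by U+FEB7, the final and initial forms of sheen) are an assumption about which characters were intended.<br>

```python
import unicodedata

# Strings quoted earlier in the thread.
M_LABEL = "\uFECB\uFEAE\uFE91\uFEF2"  # Arabic presentation forms
U_LABEL = "\u0639\u0631\u0628\u064A"  # ordinary Arabic letters

# Assumed code points for the two-character example:
# SHEEN FINAL FORM followed by SHEEN INITIAL FORM.
MISMATCHED_FORMS = "\uFEB6\uFEB7"


def nfkc(text):
    """Apply Unicode NFKC normalization (only one part of the IDNA mapping)."""
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    # Presentation forms chosen to match the word's joining behaviour
    # fold to the plain letters.
    print(nfkc(M_LABEL) == U_LABEL)
    # Forms that ignore the joining rules fold to the same plain letters,
    # even though the original pair renders differently.
    print(nfkc(MISMATCHED_FORMS) == "\u0634\u0634")
    # A non-Arabic case where the NFKC result clearly looks different:
    # CIRCLED DIGIT ONE folds to an ordinary digit.
    print(nfkc("\u2460") == "1")
```

All three comparisons hold under NFKC, which shows that the normalization itself does not take rendering into account. Whether the source and the result look alike depends on how the presentation forms were chosen.<br>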

</div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><br>

<br>

Regards,   Martin.<br><font color="#888888">

<br>

-- <br></font><div><div></div><div class="h5">

#-# Martin J. Dürst, Professor, Aoyama Gakuin University<br>

#-# <a href="http://www.sw.it.aoyama.ac.jp" target="_blank">http://www.sw.it.aoyama.ac.jp</a>   mailto:<a href="mailto:duerst@it.aoyama.ac.jp" target="_blank">duerst@it.aoyama.ac.jp</a><br>

</div></div></blockquote></div><br>