<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
</head>
<body bgcolor="#ffffff" text="#000000">
<div class="moz-text-plain" wrap="true" graphical-quote="true"
style="font-size: 20px;" lang="x-unicode">
<pre wrap="">On 2009-08-07 at 12:59 PM, Patrik wrote:
</pre>
<blockquote type="cite" style="color: rgb(153, 0, 0);">
<pre wrap=""><span class="moz-txt-citetags">> </span>On 2 aug 2009, at 19.08, Gihan Dias wrote:
<span class="moz-txt-citetags">></span>
<span class="moz-txt-citetags">> </span>
</pre>
<blockquote type="cite" style="color: rgb(153, 0, 0);">
<pre wrap=""><span class="moz-txt-citetags">>> </span>We are very keen to ensure that zero-width joiner (U+200D) is
<span class="moz-txt-citetags">>> </span><b class="moz-txt-star"><span
class="moz-txt-tag">*</span>allowed<span class="moz-txt-tag">*</span></b>
<span class="moz-txt-citetags">>> </span>for domains in the Sinhala script. So please include Sinhala in A.2.
<span class="moz-txt-citetags">>> </span>Also, is the Joining Type OK?
<span class="moz-txt-citetags">>> </span>
</pre>
</blockquote>
<pre wrap=""><span class="moz-txt-citetags">> </span>The rule is at the moment the following:
<span class="moz-txt-citetags">></span>
<span class="moz-txt-citetags">> </span>If Canonical_Combining_Class(Before(cp)) .eq. Virama Then True;
<span class="moz-txt-citetags">></span>
<span class="moz-txt-citetags">> </span>This implies you can use zero-width joiner after SINHALA SIGN AL-LAKUNA
<span class="moz-txt-citetags">> </span>(as it has Canonical_Combining_Class that is Virama, i.e. 9).
<span class="moz-txt-citetags">></span>
<span class="moz-txt-citetags">> </span>I hope that this solves the issues with Sinhala Script.
<span class="moz-txt-citetags">> </span>
</pre>
</blockquote>
<pre wrap="">Patrik,
Yes, the rules in
<a class="moz-txt-link-freetext"
href="http://stupid.domain.name/stuff/draft-ietf-idnabis-tables-06c.txt">http://stupid.domain.name/stuff/draft-ietf-idnabis-tables-06c.txt</a> look
OK (I have not checked the second rule in A.2).
How is one supposed to find this document? It is not referenced from the
WG site, and I had to do some detective work to find it.
Mark,
Is this included in your tool at <a class="moz-txt-link-freetext"
href="http://unicode.org/cldr/utility/idna.jsp">http://unicode.org/cldr/utility/idna.jsp</a>?
Sinhala and Tamil do not appear among the rules
  $Ndeva $deva; [\u200C\u200D] ; fail
  $Nbeng $beng; [\u200C\u200D] ; fail
  $Nguru $guru; [\u200C\u200D] ; fail
listed under IDNA CONTEXT RULES (including BIDI), dated 2009/04/03 21:12:29.
Should they be there? (Or should the above three rules not be there?)
Thanks,
Gihan
----
Appendix A.2.  ZERO WIDTH NON-JOINER

   Code point:
      U+200C

   Overview:
      This may occur in a formally cursive script (such as Arabic) in a
      context where it breaks a cursive connection as required for
      orthographic rules, as in the Persian language, for example.  It
      also may occur in Indic scripts in a consonant conjunct context
      (immediately following a virama), to control required display of
      such conjuncts.

   Lookup:
      True

   Rule Set:
      False;
      If Canonical_Combining_Class(Before(cp)) .eq. Virama Then True;
      If RegExpMatch((Joining_Type:{L,D})(Joining_Type:T)*\u200C
         (Joining_Type:T)*(Joining_Type:{R,D})) Then True;
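
For illustration only, the joining-type rule above could be sketched in Python roughly as follows. The JOINING_TYPE table is a hypothetical, hand-filled excerpt covering a few Arabic code points (a real check would need the full Joining_Type data from the Unicode Character Database), and the function name and the "Z" placeholder for U+200C are assumptions of this sketch, not part of the draft.

```python
import re

ZWNJ = "\u200C"

# Hypothetical excerpt of Joining_Type values, filled in by hand for this
# sketch; a real implementation would load the complete Unicode data.
JOINING_TYPE = {
    "\u0628": "D",  # ARABIC LETTER BEH
    "\u0645": "D",  # ARABIC LETTER MEEM
    "\u06CC": "D",  # ARABIC LETTER FARSI YEH
    "\u062E": "D",  # ARABIC LETTER KHAH
    "\u0627": "R",  # ARABIC LETTER ALEF
    "\u064E": "T",  # ARABIC FATHA
}

# The A.2 pattern, applied to a string holding one joining-type letter per
# code point, with "Z" standing in for the ZWNJ being tested.
PATTERN = re.compile(r"[LD]T*ZT*[RD]")


def zwnj_breaks_cursive_join(label, index):
    """Check the joining-type rule for the ZWNJ at label[index]."""
    if label[index] != ZWNJ:
        raise ValueError("label[index] is not U+200C")
    # Simplification: code points missing from the excerpt are treated as
    # "U" (Non_Joining), so they can never satisfy the pattern.
    types = "".join(
        "Z" if i == index else JOINING_TYPE.get(ch, "U")
        for i, ch in enumerate(label)
    )
    return PATTERN.search(types) is not None
```

With this excerpt, the ZWNJ in the sequence MEEM, FARSI YEH, ZWNJ, KHAH passes the rule, while a ZWNJ directly after ALEF (Joining_Type R) does not.
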

Appendix A.3.  ZERO WIDTH JOINER

   Code point:
      U+200D

   Overview:
      This may occur in Indic scripts in a consonant conjunct context
      (immediately following a virama), to control required display of
      such conjuncts.

   Lookup:
      True

   Rule Set:
      False;
      If Canonical_Combining_Class(Before(cp)) .eq. Virama Then True;
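
The virama rule that appears in both A.2 and A.3 can be sketched with Python's standard unicodedata module, whose combining() function returns the Canonical_Combining_Class of a character. This is only a minimal sketch: the function name is an assumption made here, and the result depends on the Unicode version bundled with the Python build.

```python
import unicodedata

ZWNJ = "\u200C"
ZWJ = "\u200D"
VIRAMA_CCC = 9  # Canonical_Combining_Class value named "Virama"


def joiner_allowed_after_virama(label, index):
    """Return True if the joiner at label[index] directly follows a virama."""
    if label[index] not in (ZWNJ, ZWJ):
        raise ValueError("label[index] is neither U+200C nor U+200D")
    if index == 0:
        # Nothing precedes the joiner, so the rule set stays at False.
        return False
    return unicodedata.combining(label[index - 1]) == VIRAMA_CCC
```

For example, in the sequence U+0DC1 U+0DCA U+200D U+0DBB the ZWJ follows SINHALA SIGN AL-LAKUNA and is accepted, whereas a ZWJ placed directly after U+0DC1 is not.
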
</pre>
</div>
</body>
</html>