<span class="Apple-style-span" style="font-family: Garamond; "><p style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; "><b>Context Rules</b></p><p style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">
<span class="Apple-style-span" style="font-weight: bold; "><br></span></p>This section still needs a lot of work, and the problems noted in <a id="i90l" href="http://www.alvestrand.no/pipermail/idna-update/2008-November/002964.html" title="http://www.alvestrand.no/pipermail/idna-update/2008-November/002964.html" style="color: rgb(85, 26, 139); ">http://www.alvestrand.no/pipermail/idna-update/2008-November/002964.html</a> have not yet been addressed. Other items:<br>
<p style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; "><br></p><p style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; "><b><i>Location</i></b></p><p style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">
<br></p><p style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; "><i>This text will not be part of the final document; it will be moved to the IANA registry and maintained there. There needs to be a note to readers and the editor to that effect at the top of the section. There should also be an editor's note there (in John's style) indicating that the following rules still require much work.</i></p>
<p style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; "><br></p><p style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; "><b><i>Pseudocode</i></b></p><p style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">
<i>There should be some explanation of the syntax and functions, even if not precise. The syntax needs to be a bit more extended to be useful.</i></p><p style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">
<span class="Apple-style-span" style="font-style: italic; "><br></span></p><p style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; "><i>I'd suggest defining P to be the position of the character being tested, F to be the position of the first character, and L to be the position of the last character. Then we don't need constructs such as LastChar, and the rules can be more expressive, because we have to be able to look at more than one character before or after; e.g., we can then use Script(Character[P-2]) to get the script of the character two positions before the one being tested. (Note: I include F just so we don't have to decide between zero-based and one-based positions, but it would be even simpler to make them zero-based.)</i></p>
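<p style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">As an illustration only, the position-based notation could behave like the following Python sketch. It assumes zero-based positions (so F is 0 and L is the label length minus one). The helper names char_at and script_of are hypothetical, and script_of merely approximates the Unicode Script property by taking the first word of the character name, since Python's standard library does not expose that property.</p>

```python
import unicodedata


def script_of(ch):
    """Rough stand-in for the Unicode Script property (assumption).

    Returns the first word of the character name, e.g. 'HEBREW' or 'LATIN'.
    """
    return unicodedata.name(ch, "UNKNOWN").split()[0]


def char_at(label, p):
    """Return Character[p], or None when p lies outside F..L (zero-based)."""
    return label[p] if p in range(len(label)) else None


label = "\u05d0\u05d1\u05f3"  # ALEF, BET, GERESH
p = 2                         # P: position of the character being tested
print(script_of(char_at(label, p - 2)))  # Script(Character[P-2]); prints HEBREW
print(char_at(label, p + 1))             # nothing after L; prints None
```

<p style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">Returning an explicit "no character" value for out-of-range positions is one possible way to give expressions such as Character[P+1] a defined meaning at the edges of a label.</p>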
<p style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; "><span class="Apple-style-span" style="font-style: italic; "><br></span></p><p style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">
<i>I'd also prefer just using = instead of .eq., but that's just a preference.</i></p><p style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; "><span class="Apple-style-span" style="font-style: italic; "><br>
</span></p><p style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; "><i>The rules need to be carefully reviewed for clarity and consistency with the text (and vice versa). For example, even for a simple case like Geresh there are many problems.</i></p>
<p style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; "><br></p>Overview: The script of the preceding character and the subsequent character, if any, MUST be Hebrew.<div style="margin-top: 0px; margin-bottom: 0px; ">
<span class="Apple-style-span" style="font-style: italic; ">// The scope of "if any" must be clear. Is it to apply to both the preceding and subsequent, or just the subsequent?</span></div><div style="margin-top: 0px; margin-bottom: 0px; ">
<span class="Apple-style-span" style="font-style: italic; ">// And the rule must not require the subsequent character to be Hebrew: the character can be final in a word, which means it is fine for it to be followed by "-" or another non-Hebrew character.</span></div><div style="margin-top: 0px; margin-bottom: 0px; ">
<br><div style="margin-top: 0px; margin-bottom: 0px; ">Rule Set:</div><div style="margin-top: 0px; margin-bottom: 0px; ">If FirstChar .eq. True then False; </div><div style="margin-top: 0px; margin-bottom: 0px; ">Else If BeforeScript .eq. Hebrew Then </div>
<div style="margin-top: 0px; margin-bottom: 0px; "> If AfterScript .eq. Hebrew Then True; </div><div style="margin-top: 0px; margin-bottom: 0px; "> Else False;</div><div style="margin-top: 0px; margin-bottom: 0px; ">
<br></div><div style="margin-top: 0px; margin-bottom: 0px; "><span class="Apple-style-span" style="font-style: italic; ">// This is missing a trailing Else (made clear by my block indentation)</span></div><div style="margin-top: 0px; margin-bottom: 0px; ">
<span class="Apple-style-span" style="font-style: italic; ">// While it shouldn't require an AfterScript, even the syntax is ill-defined:</span></div><div style="margin-top: 0px; margin-bottom: 0px; "><span class="Apple-style-span" style="font-style: italic; ">// What is the value of AfterScript if there is no character after? There is no check to make sure that it isn't LastChar.</span></div>
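<div style="margin-top: 0px; margin-bottom: 0px; ">One possible reading of the comments above, sketched in Python: only the preceding character is constrained, so the rule never has to ask what AfterScript means at the end of a label. This is an interpretation, not the draft's rule. The names geresh_allowed and script_of are hypothetical, positions are zero-based, and script_of only approximates the Unicode Script property via the first word of the character name.</div>

```python
import unicodedata


def script_of(ch):
    """Rough stand-in for the Unicode Script property (assumption)."""
    return unicodedata.name(ch, "UNKNOWN").split()[0]


def geresh_allowed(label, p):
    """Context check for a geresh at zero-based position p.

    Assumed reading: the preceding character must exist and be Hebrew;
    the following character, if any, is deliberately left unconstrained.
    """
    if p == 0:
        return False  # P = F: there is no preceding character
    return script_of(label[p - 1]) == "HEBREW"


print(geresh_allowed("\u05d0\u05f3", 1))     # label-final geresh after ALEF
print(geresh_allowed("\u05d0\u05f3-a", 1))   # followed by "-"
print(geresh_allowed("\u05f3\u05d0", 0))     # geresh as first character
```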
<div style="margin-top: 0px; margin-bottom: 0px; "><div style="margin-top: 0px; margin-bottom: 0px; "><br></div><div style="margin-top: 0px; margin-bottom: 0px; "><br></div><div style="margin-top: 0px; margin-bottom: 0px; ">
<b>9. HYPHEN-MINUS</b></div><div style="margin-top: 0px; margin-bottom: 0px; ">Overview: Must appear at the beginning or end of a label.<br>...</div><div style="margin-top: 0px; margin-bottom: 0px; ">Rule Set: </div><div style="margin-top: 0px; margin-bottom: 0px; ">
If FirstChar .eq. True Then False; </div><div style="margin-top: 0px; margin-bottom: 0px; ">If LastChar .eq. Then False; </div><div style="margin-top: 0px; margin-bottom: 0px; ">Else True;<br>=></div><div style="margin-top: 0px; margin-bottom: 0px; ">
Overview: Must appear neither at the beginning nor at the end of a label, and must not be in both the third and fourth positions in the string.</div><div style="margin-top: 0px; margin-bottom: 0px; ">Rule Set:</div><div style="margin-top: 0px; margin-bottom: 0px; ">
If P = F OR P = L Then False; </div><div style="margin-top: 0px; margin-bottom: 0px; ">Else if P = F+2 And Character[P+1] = "-" Then False;</div><div style="margin-top: 0px; margin-bottom: 0px; ">Else if P = F+3 And Character[P-1] = "-" Then False;</div>
<div style="margin-top: 0px; margin-bottom: 0px; ">Else True;</div><div style="margin-top: 0px; margin-bottom: 0px; "><br></div><div style="margin-top: 0px; margin-bottom: 0px; ">Hyphen-Minus is quite unlike the rest of the rules in that we can NEVER have the above 3 conditions changed. We should just remove it from the CONTEXTO rules, since the conditions for its use are in Protocol as a separate condition (Hyphen - P4.3.2.1, although this needs fleshing out, see previous note) from the CONTEXT conditions (P4.3.2.3).</div>
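<div style="margin-top: 0px; margin-bottom: 0px; ">For illustration, the revised hyphen rule set above translates directly into the following Python sketch. It assumes zero-based positions (F = 0); the function name hyphen_allowed is hypothetical.</div>

```python
def hyphen_allowed(label, p):
    """Evaluate the revised HYPHEN-MINUS rule set at zero-based position p."""
    first, last = 0, len(label) - 1  # F and L
    if p == first or p == last:
        return False  # neither at the beginning nor at the end
    if p == first + 2 and label[p + 1] == "-":
        return False  # third position, with a hyphen in the fourth
    if p == first + 3 and label[p - 1] == "-":
        return False  # fourth position, with a hyphen in the third
    return True


print(hyphen_allowed("a-b", 1))      # interior hyphen
print(hyphen_allowed("xn--abc", 2))  # hyphens in positions three and four
print(hyphen_allowed("ab-", 2))      # trailing hyphen
```

<div style="margin-top: 0px; margin-bottom: 0px; ">The lookups label[p + 1] and label[p - 1] are always in range here, because the first test has already excluded P = F and P = L.</div>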
<div style="margin-top: 0px; margin-bottom: 0px; "><br></div><div style="margin-top: 0px; margin-bottom: 0px; "><br></div><div style="margin-top: 0px; margin-bottom: 0px; "><b>10. ZERO WIDTH NON-JOINER</b></div><div style="margin-top: 0px; margin-bottom: 0px; ">
<span style="font-weight: normal; "><div style="margin-top: 0px; margin-bottom: 0px; ">For the rule sets I suggest the following. Rationale: As long as it is pseudocode -- made up for this purpose and matching no real programming language -- we should use a pseudocode that actually gives the same meaning as the prose. The conditions also need to be tighter, as per <a id="wvj." href="http://unicode.org/reports/tr31/#Layout_and_Format_Control_Characters" title="http://unicode.org/reports/tr31/#Layout_and_Format_Control_Characters">http://unicode.org/reports/tr31/#Layout_and_Format_Control_Characters</a>.</div>
<div style="margin-top: 0px; margin-bottom: 0px; ">===</div><div style="margin-top: 0px; margin-bottom: 0px; "><br></div><div style="margin-top: 0px; margin-bottom: 0px; "><div style="margin-top: 0px; margin-bottom: 0px; ">
The script must be one in which the use of this character causes significant visual transformation of one or both of the adjacent characters.<br>=><br>The script must be one in which the use of this character causes visual transformations of one or both of the adjacent characters that are required for significant semantic distinctions in at least some cases. This includes ZWNJ after certain Virama characters, and between particular joining characters in cursive scripts like Arabic.</div>
<div style="margin-top: 0px; margin-bottom: 0px; ">[[anchor9a: The script list for this character is _not_ complete and, in particular, more Indic scripts certainly need to be listed.]]</div><div style="margin-top: 0px; margin-bottom: 0px; ">
<br></div></div></span><span style="font-weight: normal; "><div style="margin-top: 0px; margin-bottom: 0px; ">RuleSet</div><div style="margin-top: 0px; margin-bottom: 0px; "><br></div><div style="margin-top: 0px; margin-bottom: 0px; ">
If <span style="font-family: Arial; "><span style="font-family: Garamond; "><font size="2">BeforeScript .eq. ( Deva | Tamil |... ) Then</font></span></span></div><div style="margin-top: 0px; margin-bottom: 0px; "><div style="margin-top: 0px; margin-bottom: 0px; ">
If P = F OR P = L Then False;</div><div style="margin-top: 0px; margin-bottom: 0px; "> Else if Canonical_Combining_Class(Character[P-1]) != Virama Then False;</div><div style="margin-top: 0px; margin-bottom: 0px; "> Else if Not IsLetter(Character[P-2]) Then False;</div>
<div style="margin-top: 0px; margin-bottom: 0px; "> Else if ScriptCount(Character[P-2] + Character[P-1]) > 1 Then False;</div><div style="margin-top: 0px; margin-bottom: 0px; "> Else True;</div><div style="margin-top: 0px; margin-bottom: 0px; ">
Else if BeforeScript != Arabic Then False;</div>Else if Not MatchesBefore([[:jt=D:][:jt=L:]][:jt=T:]*) Then False;</div><div style="margin-top: 0px; margin-bottom: 0px; ">Else if Not MatchesAfter([:jt=T:]*[[:jt=D:][:jt=R:]]) Then False;</div>
<div style="margin-top: 0px; margin-bottom: 0px; ">Else True;</div><div style="margin-top: 0px; margin-bottom: 0px; "><br></div>For more information see Section 2.3 Layout and Format Control Characters in [UAX31]. </span></div>
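<div style="margin-top: 0px; margin-bottom: 0px; ">A rough Python sketch of how the ZWNJ conditions might be evaluated. It is a simplification under stated assumptions: it branches on whether the preceding character is a Virama rather than on BeforeScript, it omits the script-consistency check, and the JOINING_TYPE table is a tiny hand-picked subset standing in for the full Unicode joining-type data. All function names are hypothetical.</div>

```python
import unicodedata

# Illustrative subset only (assumption); real code would load the
# complete joining-type data from the Unicode Character Database.
JOINING_TYPE = {
    "\u0628": "D",  # ARABIC LETTER BEH
    "\u0647": "D",  # ARABIC LETTER HEH
    "\u0627": "R",  # ARABIC LETTER ALEF
    "\u064e": "T",  # ARABIC FATHA
}


def joining_type(ch):
    """Joining type from the sample table; 'U' (non-joining) otherwise."""
    return JOINING_TYPE.get(ch, "U")


def matches_before(label, p):
    """[[:jt=D:][:jt=L:]][:jt=T:]* ending immediately before position p."""
    i = p - 1
    while i >= 0 and joining_type(label[i]) == "T":
        i -= 1
    return i >= 0 and joining_type(label[i]) in ("D", "L")


def matches_after(label, p):
    """[:jt=T:]*[[:jt=D:][:jt=R:]] starting immediately after position p."""
    i = p + 1
    while i in range(len(label)) and joining_type(label[i]) == "T":
        i += 1
    return i in range(len(label)) and joining_type(label[i]) in ("D", "R")


def zwnj_allowed(label, p):
    """Context check for ZWNJ at zero-based position p (simplified)."""
    if p == 0 or p == len(label) - 1:
        return False  # P = F or P = L
    if unicodedata.combining(label[p - 1]) == 9:  # ccc 9 is Virama
        # A letter must precede the Virama (Character[P-2]).
        return p >= 2 and unicodedata.category(label[p - 2]).startswith("L")
    return matches_before(label, p) and matches_after(label, p)


print(zwnj_allowed("\u0915\u094d\u200c\u0937", 2))  # KA, VIRAMA, ZWNJ, SSA
print(zwnj_allowed("\u0628\u200c\u0627", 1))        # BEH, ZWNJ, ALEF
print(zwnj_allowed("\u0627\u200c\u0628", 1))        # ALEF, ZWNJ, BEH
```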
<div style="margin-top: 0px; margin-bottom: 0px; "><br></div><div style="margin-top: 0px; margin-bottom: 0px; "><span style="font-weight: normal; "><br></span><b>11. ZERO WIDTH JOINER<br></b><br><div style="margin-top: 0px; margin-bottom: 0px; ">
The script must be one in which the use of this character causes significant visual transformation of one or both of the adjacent characters.<br>=><br>The script must be one in which the use of this character causes visual transformations of one or both of the adjacent characters that are required for significant semantic distinctions in at least some cases. This includes ZWJ after certain Virama characters, and between particular joining characters in cursive scripts like Arabic.</div>
<div style="margin-top: 0px; margin-bottom: 0px; ">[[anchor9a: The script list for this character is _not_ complete and, in particular, more Indic scripts certainly need to be listed.]]</div><div style="margin-top: 0px; margin-bottom: 0px; ">
<br></div></div><div style="margin-top: 0px; margin-bottom: 0px; ">RuleSet</div><div style="margin-top: 0px; margin-bottom: 0px; ">If <span style="font-family: Arial; "><span style="font-family: Garamond; "><font size="2">BeforeScript .eq. ( Deva | Tamil |... ) Then</font></span></span></div>
<div style="margin-top: 0px; margin-bottom: 0px; "><div style="margin-top: 0px; margin-bottom: 0px; "> If P = F OR P = L Then False;</div><div style="margin-top: 0px; margin-bottom: 0px; "> Else if Canonical_Combining_Class(Character[P-1]) != Virama Then False;</div>
<div style="margin-top: 0px; margin-bottom: 0px; "> Else if Not IsLetter(Character[P-2]) Then False;</div><div style="margin-top: 0px; margin-bottom: 0px; "> Else if ScriptCount(Character[P-2] + Character[P-1]) > 1 Then False;</div>
<div style="margin-top: 0px; margin-bottom: 0px; "> Else True;</div><div style="margin-top: 0px; margin-bottom: 0px; ">Else False;</div></div><div style="margin-top: 0px; margin-bottom: 0px; "><div style="margin-top: 0px; margin-bottom: 0px; ">
<br></div><div style="margin-top: 0px; margin-bottom: 0px; "><br></div><b>14. MODIFIER LETTER PRIME </b><br><br>Add a description: also used in Cyrillic transcription, where it must be after a consonant.</div><div style="margin-top: 0px; margin-bottom: 0px; ">
<br></div>If BeforeScript .eq. Greek Then </div><div style="margin-top: 0px; margin-bottom: 0px; ">...<br>=><br>If IsLetter(Character[P-1]) And BeforeScript = Cyrillic Then True;</div><div style="margin-top: 0px; margin-bottom: 0px; ">
...</div></div></span>Mark<br>