I think the simplest course of action is just to require the character before to be Hebrew; that is a sufficient limitation on usage.<br><br clear="all">Mark<br>

<br><br><div class="gmail_quote">On Thu, Jul 23, 2009 at 02:41, Matitiahu Allouche <span dir="ltr">&lt;<a href="mailto:matial@il.ibm.com">matial@il.ibm.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

I totally agree with Ken&#39;s analysis of Gershayim usage, and with his<br>

simplified pseudo-code.<br>

<br>

However, I seem to remember somebody mentioning using Gershayim at the<br>

boundary between preceding Hebrew letters and succeeding letters from<br>

another script.  Personally, I see no need for this, and such a label<br>

would probably be disallowed anyway by the rules for Bidi domain names.<br>

Still, if anybody thinks there is such a use case, he/she should speak<br>

now.<br>

<br>

Shalom (Regards),  Mati<br>

           Bidi Architect<br>

           Globalization Center Of Competency - Bidirectional Scripts<br>

           IBM Israel<br>

           Phone: +972 2 5888802    Fax: +972 2 5870333    Mobile: +972 52 2554160<br>

<br>

<br>

<br>

<br>

Kenneth Whistler &lt;<a href="mailto:kenw@sybase.com">kenw@sybase.com</a>&gt;<br>

Sent by: <a href="mailto:idna-update-bounces@alvestrand.no">idna-update-bounces@alvestrand.no</a><br>

22/07/2009 05:07<br>

Please respond to<br>

Kenneth Whistler &lt;<a href="mailto:kenw@sybase.com">kenw@sybase.com</a>&gt;<br>

<br>

<br>

To<br>

<a href="mailto:patrik@frobbit.se">patrik@frobbit.se</a><br>

cc<br>

<a href="mailto:idna-update@alvestrand.no">idna-update@alvestrand.no</a>, <a href="mailto:kenw@sybase.com">kenw@sybase.com</a><br>

Subject<br>

tables-06b.txt: A.8 Gershayim<br>

<div><div></div><div class="h5"><br>

<br>

<br>

<br>

<br>

<br>

Patrik,<br>

<br>

With my general concerns about the pseudo-code<br>

out of the way, I&#39;ll now take up the issue of<br>

how to express the rule set for A.8. HEBREW PUNCTUATION<br>

GERSHAYIM.<br>

<br>

Currently, the relevant parts of the Appendix state:<br>

<br>

Overview:<br>

   The script of the preceding character and the subsequent<br>

   character, if any, MUST be Hebrew.<br>

...<br>

Rule Set:<br>

   False;<br>

   If Script(Before(cp)) .eq.  Hebrew And<br>

      LastChar .eq. cp Then True;<br>

      If Script(Before(cp)) .eq.  Hebrew And<br>

         Script(After(cp)) .eq.  Hebrew Then True;<br>

<br>

First let&#39;s consider what the appropriate context for<br>

the gershayim is in ordinary Hebrew text usage.<br>

<br>

The gershayim are used to indicate that a word is to<br>

be read as an acronym, rather than as a regular word.<br>

Its position in the acronym is between the next-to-last<br>

and the last letters of the non-inflected form of the<br>

acronym. What that means is that it will be preceded<br>

by one or more letters, and will be followed by at<br>

least one letter (and possibly more, if the acronym is<br>

inflected). But it shouldn&#39;t occur at the beginning or<br>

end of a word.<br>

<br>

The gershayim are also used to mark numerical usage of<br>

Hebrew letters, but only in the case where a number is<br>

represented by two or more Hebrew numerals. So again,<br>

in that case, it would be internal to the numeral,<br>

and not at the beginning or end.<br>

<br>

Then there is a usage to indicate transliteration of<br>

a foreign word -- but again the position is word-internal,<br>

between the next-to-last and the last character of the<br>

word.<br>

<br>

From this summary, it would seem that in *normal* usage,<br>

gershayim should always occur internal to a word. If<br>

Mati agrees with that general characterization, then I<br>

believe the context we need to summarize in the Overview<br>

is more constrained:<br>

<br>

Overview:<br>

   The script of the preceding character and the subsequent<br>

   character MUST be Hebrew.<br>

<br>

And I think *more* constrained is good in this case, as<br>

internal to a Hebrew word is much less likely to cause<br>

either confusion with quotation marks or any bidi quirks.<br>

<br>

If we agree on that more constrained statement of the<br>

intended context, then the Rule Set itself can be<br>

simplified to:<br>

<br>

Rule Set:<br>

   False;<br>

   If Script(Before(cp)) .eq. Hebrew And<br>

      Script(After(cp)) .eq. Hebrew Then True;<br>

<br>

Note that with my restatement of the pseudo-code, the<br>

edge cases of gershayim at the beginning or end of a label<br>

will automatically be excluded, because Before(cp)<br>

would evaluate to Undefined at the start of a label<br>

and After(cp) would evaluate to Undefined at the end of<br>

a label.<br>
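
<br>

As a rough illustration only, the simplified rule set could be sketched in Python as follows.<br>
The names is_hebrew and gershayim_allowed are hypothetical, and the character-name<br>
test is merely an approximation of Script(cp) .eq. Hebrew, since the Python standard<br>
library does not expose the Unicode Script property. Before(cp) and After(cp) are<br>
modeled as None where the pseudo-code would evaluate to Undefined.<br>

```python
import unicodedata

GERSHAYIM = "\u05F4"  # HEBREW PUNCTUATION GERSHAYIM


def is_hebrew(ch):
    """Approximate Script(ch) .eq. Hebrew via the character name (assumption)."""
    return ch is not None and unicodedata.name(ch, "").startswith("HEBREW ")


def gershayim_allowed(label, i):
    """Apply the simplified A.8 rule to the gershayim at index i of label."""
    # Before(cp) is Undefined (None here) at the start of the label.
    before = label[i - 1] if i > 0 else None
    # After(cp) is Undefined (None here) at the end of the label.
    after = label[i + 1] if i + 1 < len(label) else None
    # Default is False; True only when both neighbours are Hebrew.
    return is_hebrew(before) and is_hebrew(after)
```

With this sketch, a gershayim in label-internal position between two Hebrew letters<br>
passes, while one at either edge of the label, or next to a non-Hebrew character, fails.<br>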

<br>

I believe this restatement and simplification of A.8<br>

would be of service to the IDNA2008 users.<br>

<br>

--Ken<br>

<br>

_______________________________________________<br>

Idna-update mailing list<br>

<a href="mailto:Idna-update@alvestrand.no">Idna-update@alvestrand.no</a><br>

<a href="http://www.alvestrand.no/mailman/listinfo/idna-update" target="_blank">http://www.alvestrand.no/mailman/listinfo/idna-update</a><br>

</div></div></blockquote></div><br>