tables-06b.txt: A.8 Gershayim

Patrik Fältström patrik at frobbit.se
Fri Jul 24 17:58:37 CEST 2009


I have chosen this in 06c.

    paf

On 23 jul 2009, at 16.28, Mark Davis ⌛ wrote:

> I think the simplest course of action is just require the character
> before to be Hebrew; that is a sufficient limitation on usage.
>
> Mark
>
>
> On Thu, Jul 23, 2009 at 02:41, Matitiahu Allouche  
> <matial at il.ibm.com> wrote:
>
>> I totally agree with Ken's analysis of Gershayim usage, and with his
>> simplified pseudo-code.
>>
>> However, I seem to remember somebody mentioning using Gershayim at the
>> boundary between preceding Hebrew letters and succeeding letters from
>> another script.  Personally, I see no need for this, and such a label
>> would probably be disallowed anyway by the rules for Bidi domain names.
>> Still, if anybody thinks there is such a use case, he/she should speak
>> now.
>>
>> Shalom (Regards),  Mati
>>          Bidi Architect
>>          Globalization Center Of Competency - Bidirectional Scripts
>>          IBM Israel
>>          Phone: +972 2 5888802    Fax: +972 2 5870333
>>          Mobile: +972 52 2554160
>>
>>
>>
>>
>> Kenneth Whistler <kenw at sybase.com>
>> Sent by: idna-update-bounces at alvestrand.no
>> 22/07/2009 05:07
>> Please respond to: Kenneth Whistler <kenw at sybase.com>
>> To: patrik at frobbit.se
>> cc: idna-update at alvestrand.no, kenw at sybase.com
>> Subject: tables-06b.txt: A.8 Gershayim
>>
>> Patrik,
>>
>> With my general concerns about the pseudo-code
>> out of the way, I'll now take up the issue of
>> how to express the rule set for A.8. HEBREW PUNCTUATION
>> GERSHAYIM.
>>
>> Currently, the relevant parts of the Appendix state:
>>
>> Overview:
>>  The script of the preceding character and the subsequent
>>  character, if any, MUST be Hebrew.
>> ...
>> Rule Set:
>>  False;
>>  If Script(Before(cp)) .eq.  Hebrew And
>>     LastChar .eq. cp Then True;
>>     If Script(Before(cp)) .eq.  Hebrew And
>>        Script(After(cp)) .eq.  Hebrew Then True;
>>
>> First, let's consider what the appropriate context for
>> the gershayim is in ordinary Hebrew text usage.
>>
>> The gershayim is used to indicate that a word is to
>> be read as an acronym, rather than as a regular word.
>> Its position in the acronym is between the next-to-last
>> and the last letters of the non-inflected form of the
>> acronym. What that means is that it will be preceded
>> by one or more letters, and will be followed by at
>> least one letter (and possibly more, if the acronym is
>> inflected). But it shouldn't occur at the beginning or
>> end of a word.
>>
>> The gershayim is also used to mark numerical usage of
>> Hebrew letters, but only in the case where a number is
>> represented by two or more Hebrew numerals. So again,
>> in that case, it would be internal to the numeral,
>> and not at the beginning or end.
>>
>> Then there is a usage to indicate transliteration of
>> a foreign word -- but again the position is word-internal,
>> between the next-to-last and the last character of the
>> word.
>>
>> From this summary, it would seem that in *normal* usage,
>> gershayim should always occur internal to a word. If
>> Mati agrees with that general characterization, then I
>> believe the context we need to summarize in the Overview
>> is more constrained:
>>
>> Overview:
>>  The script of the preceding character and the subsequent
>>  character MUST be Hebrew.
>>
>> And I think *more* constrained is good in this case, as
>> internal to a Hebrew word is much less likely to cause
>> either confusion with quotation marks or any bidi quirks.
>>
>> If we agree on that more constrained statement of the
>> intended context, then the Rule Set itself can be
>> simplified to:
>>
>> Rule Set:
>>  False;
>>  If Script(Before(cp)) .eq. Hebrew And
>>     Script(After(cp)) .eq. Hebrew Then True;
>>
>> Note that with my restatement of the pseudo-code, the
>> edge cases of gershayim at the beginning or end of a label
>> will automatically be excluded, because Before(cp)
>> would evaluate to Undefined at the start of a label
>> and After(cp) would evaluate to Undefined at the end of
>> a label.
>>
>> I believe this restatement and simplification of A.8
>> would be of service to the IDNA2008 users.
>>
>> --Ken
>>
>> _______________________________________________
>> Idna-update mailing list
>> Idna-update at alvestrand.no
>> http://www.alvestrand.no/mailman/listinfo/idna-update
>>

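For readers following the thread, the two candidate rules discussed
above (Ken's simplified rule set, and Mark's suggestion to check only
the preceding character) can be sketched in Python. This is only an
illustration under stated assumptions, not the normative pseudo-code.
The helper names (is_hebrew, before, after, rule_both_sides,
rule_before_only) are hypothetical. Script(cp) is approximated by
testing whether the Unicode character name starts with "HEBREW ",
because the Python standard library does not expose the Script
property. None stands in for Undefined.

```python
import unicodedata

GERSHAYIM = "\u05F4"  # HEBREW PUNCTUATION GERSHAYIM


def is_hebrew(ch):
    """Approximate 'Script(ch) .eq. Hebrew' (assumption: name prefix test).

    None plays the role of Undefined and is never Hebrew.
    """
    return ch is not None and unicodedata.name(ch, "").startswith("HEBREW ")


def before(label, i):
    """Character preceding position i, or None at the start of the label."""
    return label[i - 1] if i > 0 else None


def after(label, i):
    """Character following position i, or None at the end of the label."""
    return label[i + 1] if i + 1 < len(label) else None


def rule_both_sides(label, i):
    """Simplified rule set: both neighbours of cp must be Hebrew."""
    return is_hebrew(before(label, i)) and is_hebrew(after(label, i))


def rule_before_only(label, i):
    """Variant that only requires the preceding character to be Hebrew."""
    return is_hebrew(before(label, i))


if __name__ == "__main__":
    samples = {
        "internal": "\u05E6\u05D4" + GERSHAYIM + "\u05DC",
        "final": "\u05D0" + GERSHAYIM,
        "initial": GERSHAYIM + "\u05D0",
    }
    for kind, label in samples.items():
        i = label.index(GERSHAYIM)
        print(kind, rule_both_sides(label, i), rule_before_only(label, i))
```

With these definitions, a label-internal gershayim passes both rules,
a label-final one passes only the before-only rule, and a
label-initial one passes neither, because before() yields None there.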