Here comes the Yiddish

Thu, 5 Dec 2002 13:13:11 -0600

On 12/04/2002 04:07:19 AM Michael Everson wrote:

>Latin is not a common orthography for Yiddish. It is an exceptional
>orthography, surely.

Which is why we *could* interpret "yi" to imply the Hebrew-script writing
system. I.e., Sean's request could be reduced to just one tag, "yi-Latn".
(See my IUC 21 paper, section 5, "Default Values and Implicit Tagging".)
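
To make the default-value idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not anything defined by RFC 3066: the table DEFAULT_SCRIPT and the function effective_script are hypothetical names, the script codes follow ISO 15924 ("Hebr", "Latn", "Cyrl"), and treating any four-letter subtag as a script code is a simplification.

```python
# Hypothetical table of implied scripts; the single entry reflects the
# suggestion that a bare "yi" could imply the Hebrew-script writing system.
DEFAULT_SCRIPT = {"yi": "Hebr"}


def effective_script(tag):
    """Return the script for a tag such as "yi" or "yi-Latn".

    An explicit four-letter subtag wins; otherwise fall back to the
    assumed default for the language, or None when no default is known.
    """
    parts = tag.split("-")
    lang = parts[0].lower()
    for sub in parts[1:]:
        # Simplifying assumption: a four-letter alphabetic subtag is a script code.
        if len(sub) == 4 and sub.isalpha():
            return sub.title()
    return DEFAULT_SCRIPT.get(lang)


if __name__ == "__main__":
    for tag in ("yi", "yi-Latn", "az", "az-Cyrl"):
        print(tag, "->", effective_script(tag))
```

With a table like this, only the exceptional case ("yi-Latn") needs an explicit tag, while a language with no single dominant script (such as "az") has no default and must always be tagged explicitly.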

>Language codes should not cover script variants of this kind. Script
>codes should do that. Lang=Az. Script=Cyrl or Script=Latn.

Things that are truly language ID should not cover script variants. (So,
ISO 639 needs to stay clear of this.) The question is whether RFC 3066
really is about "language" identification. The way I see it, it is already
dealing not only with the notion of language but also with other derivative
notions that are distinct from yet related to language.

>No, but it would make the script code standard pointless and I am not
>going to do that. I believe it would be wrong.

Not at all. Just as ISO 639 and ISO 3166 are both still needed even though
RFC 3066 refers to both, ISO 15924 will also be needed even if RFC 3066
refers to it.

>To my mind this comes right out of the blue, to test the RFC.
>Bringing in the German orthography argument is spurious, because, as
>I have said, there is a business case for distinguishing the
>orthographies, and the script codes cannot be used for script tagging
>because there is no script "1996". If German were written in
>Cyrillic, we would not add a language code de-Cyrl for it. We would
>use both language tagging and script tagging.

The only problem with that, Michael, is that you've got these three
notions, Language, Writing System and Orthography, that are inter-related,
and they are related in a particular way: Orthography is derivative of
Writing System, and Writing System is derivative of Language. There is a
need for infrastructure that can support all three kinds of distinctions.
If you try to divide them into just two quantities that get tagged, it
seems to me that grouping years for orthography reforms together with
language ignores the actual ontological relationships and is likely to
lead to problems.

>>And I see at least one good reason for allowing it:  It fills a
>>need, easily, simply -- and even somewhat intuitively!
>
>Your simple solution is to use language tagging and script tagging.
>Yiddish in Latin is not a different language from Yiddish in Hebrew.

However we do it, we need to be thinking about the whole problem: tagging
for language-related categories. We cannot solve any one part of the
problem without considering the whole.

- Peter

---------------------------------------------------------------------------
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485