Here comes the Yiddish
Thu, 5 Dec 2002 11:52:27 -0600

On 12/02/2002 05:46:18 PM Michael Everson wrote:

>Saying en-scouse is different from en is one thing. Orthographic
>differences are not language codes. lang=yi-Hebr vs lang=yi-Latn is
>not a language distinction. It's lang=yi, script=Hebr or
>script=Latin. Es is di zelbike Sprakh.
>I think we have to reject these requests for language codes on these
>grounds. If we need an RFC on script coding somebody should take that

Michael, I think the reality is that RFC 3066 tags are already assumed to
be used to indicate more than just language. Good grief, we've got
well-known authors writing books on i18n telling people these things can be
used to distinguish locales! (*That* is definitely wrong.) Consider the
number of texts tagged with en-US vs. en-GB: it seems to me most of these
tags are basically distinguishing orthographic differences. They are
certainly not distinguishing between languages, and I really don't think
most instances are intended to differentiate dialects either.
You've also got to look at all of the interoperation already implemented:
there are implementations in which RFC 3066 tags are being associated with
proprietary identifiers that are distinguishing writing systems and

If you recall my presentation from IUC 21 in Dublin, we need to deal with
distinct ontological notions related to language; key among these are
writing systems and orthographies. It seems to make sense to me that ISO
639 should be used to identify languages, and that RFC 3066 (or its
successors) is the appropriate place to handle creating identifiers for
writing systems and orthographies. (But non-linguistic issues in "locales"
are definitely out of scope.)

The fact that Yiddish is Yiddish no matter which script it is written in
doesn't change the fact that there needs to be a way to identify the two
writing systems. As I said above, RFC 3066 tags are already being used in
this way, and I think it is appropriate for us to do this.
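
To make the distinction concrete, here is a minimal sketch in Python of how
a consumer might pull a tag like yi-Hebr apart. The names split_tag,
same_language and TagParts are hypothetical, made up for this illustration.
It also assumes that a four-letter second subtag names a script, which is
the convention being requested in this thread and not something RFC 3066
itself defines; RFC 3066 only says that a two-letter second subtag is an
ISO 3166 country code.

```python
from typing import NamedTuple, Optional


class TagParts(NamedTuple):
    """Illustrative breakdown of a tag; not an official data model."""
    language: str
    script: Optional[str]
    region: Optional[str]


def split_tag(tag: str) -> TagParts:
    """Split a tag such as 'yi-Hebr' or 'en-GB' into its parts.

    Assumption: a four-letter second subtag is a script code (the
    convention proposed in this thread). A two-letter second subtag is
    treated as a country code, as RFC 3066 specifies.
    """
    subtags = tag.split("-")
    language = subtags[0].lower()
    script = None
    region = None
    if len(subtags) > 1:
        second = subtags[1]
        if len(second) == 4 and second.isalpha():
            script = second.title()   # e.g. 'hebr' -> 'Hebr'
        elif len(second) == 2 and second.isalpha():
            region = second.upper()   # e.g. 'gb' -> 'GB'
    return TagParts(language, script, region)


def same_language(a: str, b: str) -> bool:
    """True when two tags share the same primary language subtag."""
    return split_tag(a).language == split_tag(b).language
```

Under these assumptions, same_language("yi-Hebr", "yi-Latn") is true while
the script fields differ, which is exactly the situation described above:
one language, two writing systems, and a tag that can tell them apart.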

