Here comes the Yiddish

Wed, 4 Dec 2002 11:07:47 -0000

> >I do not want us to go down the road of making a "language code" for
> >Portuguese written in the Arabic script.
>
> The mere fact that a formalism is potentially overly powerful isn't a
> problem in and of itself.  The fact that one can construct a AA-BB code
> for "Afrihili as spoken on Pitcairn Island" isn't a good argument against
> AA-BB codes, in the same way that one could conceivably express
> "Portuguese written in the Arabic script" isn't an argument against
> expressing orthographic variance at the level of different scripts.

Actually I think that example is a good argument against encoding
orthographic differences in RFC 3066. I can use afh-PN and everyone will know
it is "Afrihili as spoken on Pitcairn Island" (although some optimised
implementations may just label it "unsupported language", at least they'll
be accurate in that assumption). We don't need to register it; it comes
"for free" with the aa-BB and aaa-BB conventions.
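
To make "for free" concrete, here is a minimal sketch in Python. It checks
only the shape of a tag (a 2- or 3-letter language code, optionally followed
by a 2-letter country code); it does not check that the codes actually appear
in ISO 639 or ISO 3166. The names FREE_SHAPE and comes_for_free are my own,
purely for illustration.

```python
import re

# Shape of tags usable without registration under the aa / aaa / aa-BB /
# aaa-BB conventions: a 2- or 3-letter language code, optionally followed
# by a 2-letter country code. Only the shape is checked here, not whether
# the codes are really assigned.
FREE_SHAPE = re.compile(r"[A-Za-z]{2,3}(-[A-Za-z]{2})?")


def comes_for_free(tag: str) -> bool:
    """Return True if the tag fits the language or language-country shape."""
    return FREE_SHAPE.fullmatch(tag) is not None


if __name__ == "__main__":
    for tag in ("afh", "afh-PN", "pt-BR", "pt-Arab"):
        print(tag, comes_for_free(tag))
```

Under this check afh-PN passes, while something shaped like pt-Arab (a
made-up tag, used only to show a 4-letter second subtag) does not, which is
exactly the gap I am complaining about.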

Now I don't need to elaborate here as to why we do this (that would be
teaching one's grandmother to suck eggs). My point is that I think the same
applies to orthographic differences. If we are to enable the encoding of
orthography (and I think we should, though whether within or without the
language code I'm not sure) we need to do so in a manner that similarly
gives us many language/orthography combinations "for free".

The problem isn't that 'one could conceivably express "Portuguese written in
the Arabic script"'. The problem is that currently, without registration,
one cannot. Yiddish in Hebrew, Yiddish in Roman, Portuguese in Arabic and
Old Irish in Ogham (one I'd personally like to see) should become encodable
in one fell swoop.
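
As a rough sketch of what "one fell swoop" could look like, assume (and this
is only an assumption, not anything RFC 3066 provides) that a script could be
named by its 4-letter ISO 15924 code and simply appended to the language
code. The tag shape, the two small tables and the describe function below are
all hypothetical, chosen just to show that the combinations fall out of two
independent lists rather than out of per-tag registrations.

```python
from itertools import product

# Illustrative subsets only; the real code lists are far longer.
LANGUAGES = {"yi": "Yiddish", "pt": "Portuguese", "sga": "Old Irish"}
SCRIPTS = {"Hebr": "Hebrew", "Latn": "Latin", "Arab": "Arabic", "Ogam": "Ogham"}

# Hypothetical language-script tags: every pairing exists as soon as both
# halves exist, with nothing registered per combination.
ALL_TAGS = [f"{lang}-{script}" for lang, script in product(LANGUAGES, SCRIPTS)]


def describe(tag: str) -> str:
    """Turn a hypothetical language-script tag into a readable label."""
    lang, script = tag.split("-")
    return f"{LANGUAGES[lang]} in {SCRIPTS[script]}"


if __name__ == "__main__":
    for tag in ALL_TAGS:
        print(tag, "=", describe(tag))
```

Three languages and four scripts give twelve combinations, sensible or not,
and nobody has to register any of them individually.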