Here comes the Yiddish

Martin Duerst duerst@w3.org
Thu, 12 Dec 2002 02:11:47 +0900


At 10:07 02/12/04 +0000, Michael Everson wrote:
>At 19:14 -0700 2002-12-03, Sean M. Burke wrote:
>>At 20:49 2002-12-03 +0000, Michael Everson wrote:
>>>Script codes are intended to be an attribute of a script tag, and for 
>>>the specific needs of modern spell-checkers was the deciding and 
>>>practical business case. There is no script code for 1996. That was a 
>>>language reform.

1996 is a year, not a language reform. And there is of course
an ISO standard for dates, including just years :-)


>>Okay, so language codes /can/ cover orthographies, excellent!  I want 
>>language codes for Yiddish in the two common orthographies for it.  What 
>>codes do you suggest?
>
>The case of German is different, because there are no 
>"orthography-within-a-script" codes.

Well, there are no such codes, but conceptually, orthographies depend
on scripts. I don't know of a case where using a different script
would not influence the orthography used.

So what would happen if Sean came along and asked for a few tags
for different orthographies for Yiddish in Hebrew, and different
orthographies for Yiddish in Latin? Would you want to have
conventions such that e.g. yi-1900 and yi-1950 are Hebrew,
and yi-1880 and yi-1920 are Latin? It would be much clearer
and much more workable (fallbacks,...) if the script code
was in there, wouldn't it?
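
To make the fallback point concrete, here is a minimal sketch in
Python. It assumes fallback works by dropping one trailing subtag at
a time, and that tags are compared case-insensitively. The tags
yi-Latn-1920, yi-Latn and yi-Hebr are hypothetical examples, not
registered tags, and the function names are made up for illustration.

```python
def fallback_chain(tag):
    """Return the tag followed by each shorter prefix, one subtag dropped at a time."""
    parts = tag.split("-")
    return ["-".join(parts[:i]) for i in range(len(parts), 0, -1)]


def best_match(requested, available):
    """Pick the most specific available tag that is a prefix of the request."""
    # Compare case-insensitively, but return the tag as the caller spelled it.
    by_lower = {tag.lower(): tag for tag in available}
    for candidate in fallback_chain(requested.lower()):
        if candidate in by_lower:
            return by_lower[candidate]
    return None


# Hypothetical tags: with the script subtag present, falling back from an
# orthography tag lands on content in the same script.
available = ["yi", "yi-Hebr", "yi-Latn"]
print(best_match("yi-Latn-1920", available))
# Without the script subtag, the only fallback is the bare language tag,
# which says nothing about the script.
print(best_match("yi-1920", available))
```

With year-only tags such as yi-1920, the script would have to be
looked up in a separate table of conventions before any sensible
fallback could be chosen.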


>Latin is not a common orthography for Yiddish. It is an exceptional 
>orthography, surely. I have a book of Yiddish jokes written in Latin. I 
>would not consider it standard.

Does RFC 3066 require that something be 'standard' in order to give
it a code? Would you be okay with a registration application that
said that Latin is not used that much for Yiddish?


>Language codes should not cover script variants of this kind. Script codes 
>should do that. Lang=Az. Script=Cyrl or Script=Latn.

Language codes should not cover countries, either. Lang=en, country=us.
RFC 3066 codes are identifiers that combine codes from several different
standards with 'made-up' codes.
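
As a rough illustration of that mix, the sketch below labels where the
first two subtags of a tag come from, following my reading of the
subtag rules in RFC 3066. The function name and the label strings are
made up for this example, and it does not check that a code is
actually assigned in the standard it points to.

```python
def describe(tag):
    """Label the source of the first two subtags of an RFC 3066-style tag."""
    subtags = tag.lower().split("-")
    primary = subtags[0]
    if primary == "i":
        return [(primary, "IANA registration")]
    if primary == "x":
        return [(primary, "private use")]
    if len(primary) == 2:
        result = [(primary, "ISO 639-1 language code")]
    elif len(primary) == 3:
        result = [(primary, "ISO 639-2 language code")]
    else:
        return [(primary, "not defined by RFC 3066")]

    if len(subtags) > 1:
        second = subtags[1]
        if len(second) == 2:
            result.append((second, "ISO 3166 country code"))
        elif 3 <= len(second) <= 8:
            # Not from any ISO standard: registered with IANA case by case.
            result.append((second, "IANA-registered subtag"))
        else:
            result.append((second, "not allowed as second subtag"))
    return result


print(describe("en-US"))
print(describe("de-1996"))
print(describe("i-klingon"))
```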


Regards,    Martin.