New variant subtags for Serbian language

Sun Nov 17 12:58:32 CET 2013

== Serbian standard language ==

Ekavian and Iyekavian should exist as BCP 47 subtags, since marking
language variants is the most common use of the standard and Serbian
has two official standards.

Some time ago I talked with Michael about four additional subtags,
which would make our standardization life much easier.

Besides the different pronunciation (and spelling) of the old vowel
Jat, the Serbian language has two standard scripts: Cyrillic and Latin.
As these combinations are not rarely used variants (on the contrary,
they are commonly used), it would be good to have shorter tags for them:

* ec => Ekavian Cyrillic (thus, sr-ec instead of sr-ekavn-cyrl)
* el => Ekavian Latin (...)
* jc => Iyekavian Cyrillic (...)
* jl => Iyekavian Latin (...)
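
The mapping above can be sketched as a small lookup table. This is only an illustration: the four short codes and their meanings come from the list, while the names SHORT_VARIANTS and describe_short_tag are hypothetical and not part of any registry or library. The expanded long forms are deliberately left out, since only one of them is spelled out above.

```python
# Hypothetical lookup table; only the four short codes and their
# meanings are taken from the list above.
SHORT_VARIANTS = {
    "ec": ("Ekavian", "Cyrillic"),
    "el": ("Ekavian", "Latin"),
    "jc": ("Iyekavian", "Cyrillic"),
    "jl": ("Iyekavian", "Latin"),
}


def describe_short_tag(tag):
    """Split a tag such as 'sr-ec' into (language, pronunciation, script)."""
    # Compare case-insensitively and split on the first hyphen only.
    language, _, short = tag.lower().partition("-")
    if short not in SHORT_VARIANTS:
        raise ValueError(f"unknown short variant in tag: {tag!r}")
    pronunciation, script = SHORT_VARIANTS[short]
    return language, pronunciation, script
```

For example, describe_short_tag("sr-ec") yields the language "sr" together with "Ekavian" and "Cyrillic", and any short code outside the four above raises ValueError.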

Wikipedia is already using those tags: cf. http://sr.wikipedia.org/sr-el/

In any case, note that the tags for Ekavian and Iyekavian should stay
*before* the tags for Cyrillic and Latin. You can speak Ekavian or
Iyekavian without writing it.

== Other standards and official languages ==

* Croatian and Bosnian are Iyekavian and Latin. The Bosnian standard
allows Cyrillic as well. (Bosnian and Serbian Iyekavian differ in ~50
words, and most of those Serbian words are correct in Bosnian, but not
vice versa.) From the point of view of computational linguistics, it
would be good to have a place to record that the structures of these
particular languages are the same.

* The Montenegrin official language is still under development. If it
means the language used on official pages of Montenegrin government
institutions, it is Serbian Iyekavian with two different words
("sjutra" instead of "sutra" ["tomorrow"] and "medjed" instead of
"medved" ["bear"]). If it means the standard proposed by the Doclean
Academy of Sciences and Arts, then it is the language system most
distant from all the other standard languages (it has more phonemes, it
isn't neo-Shtokavian). Thus, I'd leave this issue until Montenegrins
make their own decisions. In both variants, Montenegrin can be written
in Cyrillic and Latin, though Latin is preferred.

== Linguistic situation ==

A few years ago, I presented the linguistic situation here. Here it is again:

* Language systems spoken on the territories of Serbia, Croatia,
Bosnia and Herzegovina and Montenegro (could be called "Serbo-Croatian
in wider sense"):
** Chakavian (should get ISO 639-3 code, has ISO 639-6 code)
** Kaykavian (should get ISO 639-3 code, has ISO 639-6 code)
** Torlakian (should get ISO 639-3 code, has ISO 639-6 code)
** Shtokavian (should get ISO 639-3 code, has ISO 639-6 code)
*** Old Shtokavian dialects
**** Zeta-South Sanjak dialect: basis for Doclean Montenegrin.
**** ...
*** New Shtokavian dialects or neo-Shtokavian; could be called
"Serbo-Croatian in narrower sense".
**** Ikavian dialects of Western Herzegovina
**** Iyekavian dialects of Eastern Herzegovina. This is the basic
dialect for all of the standard languages (except the Doclean variant of
Montenegrin).
**** Ekavian dialects of Northern [proper] Serbia and Vojvodina. Those
dialects influenced the Serbian Ekavian standard, though the Serbian
Ekavian standard is mostly an Ekavian variant of the Eastern Herzegovina
dialect.

On Sun, Nov 17, 2013 at 11:53 AM, Michael Everson <everson at evertype.com> wrote:
> If we are going to do this, I would rather it be done comprehensively for all the dialects of South Slavic. This is a rat-hole of hair-splitting. I’m asking Miloš Rančić to have a look because he has had to clarify this stuff many times for Wikimedia.
>
> On 16 Nov 2013, at 21:03, Goran Rakic <grakic at devbase.net> wrote:
>
>> On Sat, 16 November 2013 18:46, Doug Ewell wrote:
>>>
>>> Even observing the Prefix field, a user could still combine multiple
>>> variants that are mutually exclusive...
>>>
>>> Section 2.2.5 points out that one should not write "de-1901-1996", which
>>> is very much analogous to "sr-ekavn-ijekavn"... the guiding principle in
>>> BCP 47 is "tag content wisely," as described in detail in Section 4.1.
>>
>> Ok, thanks for clarifications.
>>
>>
>>> Incidentally, these requests need to be separated into two registration
>>> forms, one for each proposed variant.
>>
>> I will do the copy-paste and send two new registration forms for each
>> proposed variant subtag tomorrow.
>>
>>
>> Kind regards,
>> Goran Rakic
>
> Michael Everson * http://www.evertype.com/
>