FW: New variant subtags for Serbian language

Wed Nov 20 02:01:20 CET 2013

Doug Ewell scripsit:

> Serbian content should be tagged as Serbian (with a variant subtag
> for dialect, iff appropriate), Croatian content should be tagged as
> Croatian, Bosnian content as Bosnian, and so forth.
>
> It seems strange that one would want to tag content as being
> specifically in the Chakavian dialect (or Ijekavian or whatever)
> but not as a more specific language than "Serbo-Croatian."

Here's the best analogy I can come up with.  Suppose that for
nationalistic reasons, the English language were split into "British"
and "American", two standard languages, very similar to each other.
(For the purposes of this analogy, I'm pretending all the other anglophone
countries don't exist.)  Then the question might arise:  What language
do the people in the northern part of Great Britain speak?

They are British people, and they are citizens of Great Britain, so
one idea would be to classify their language as a dialect of British.
But the fact is, it's far more different from either Standard British
or Standard American than either is from the other.  What's more, the
language has some prestige, because the present capital of Great Britain
is located there.  (Okay, stretching the analogy here.)  So Britons who
move to the capital have some incentive to adopt local forms of speech,
though it's alien enough that they can't do it very well, and neither
can Americans.

That's pretty much the story of Standard Croatian (British), Standard
Serbian (American), and Kajkavian (Scots).  Like Scots, Kajkavian is
spoken in the northern part of one country only; like Scots, it's much
more different from the Croatian and Serbian standards than they are
from each other; like Scots, it was heavily influenced by a nearby
language that is *not* part of the continuum but is distantly related
(Norse for Scots, Slovene for Kajkavian); like Scots, it was once a
literary language and is now undergoing a revival as such.

For these and perhaps other reasons, the ISO RA has decided to treat
Scots as a separate language, even though it is unquestionably part of
the English dialect continuum.  They have *not* seen fit to do so with
Kajkavian, Chakavian, Torlakian, or palaeo-Shtokavian, though all of
these are in the same general situation as Scots.  So people who want to
tag resources in these language varieties (older documents, extremely new
documents, or speech recordings) can either lobby the RA for a change, or
they can lobby us for subtags.  If we assign subtags "kajkavsk", "chakavsk",
"torlak", and "palaeo", then the question is, what other subtags are they
to be used with?

One choice is to attach them to the generic tag "sh", since they are
all varieties of "Serbo-Croat in the wider sense".  Another is to attach
them to the tags that name the national languages of the countries they
are spoken in, which is feasible for Kajkavian and Chakavian (Croatia)
and Torlakian (Serbia), but not so much for palaeo-Shtokavian, varieties
of which are spoken in all four countries.  There is a similar problem for
non-standard varieties of neo-Shtokavian, aka "Serbo-Croat in the narrower
sense", the variety that directly encompasses the standard languages.
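
To make the two choices concrete, here is a rough Python sketch of the tags
each one would yield.  The four variant subtags are only the proposals named
above, not registered subtags, and the function and table names are made up
for illustration.  The empty entry for "palaeo" is my own way of showing that
no single national prefix fits it; the syntax check uses the variant shape
from RFC 5646 (5 to 8 alphanumerics, or 4 characters starting with a digit).

```python
import re

# Variant subtags proposed in this thread; none is assumed to be registered.
PROPOSED_VARIANTS = ["kajkavsk", "chakavsk", "torlak", "palaeo"]

# RFC 5646 variant syntax: 5-8 alphanumerics, or a digit followed by 3 alphanumerics.
VARIANT_RE = re.compile(r"^(?:[0-9a-z]{5,8}|[0-9][0-9a-z]{3})$")

# Choice 2: the national-language prefix for each variety, per the paragraph above.
# "palaeo" gets no entry because its varieties are spoken in all four countries.
NATIONAL_PREFIXES = {
    "kajkavsk": ["hr"],
    "chakavsk": ["hr"],
    "torlak": ["sr"],
    "palaeo": [],
}


def tag_generic(variant):
    """Choice 1: attach the variant to the generic tag "sh"."""
    return f"sh-{variant}"


def tags_national(variant):
    """Choice 2: attach the variant to the national-language tag(s), if any."""
    return [f"{prefix}-{variant}" for prefix in NATIONAL_PREFIXES[variant]]


if __name__ == "__main__":
    for v in PROPOSED_VARIANTS:
        print(v, bool(VARIANT_RE.match(v)), tag_generic(v), tags_national(v))
```

Under choice 1 every variety gets exactly one tag; under choice 2 the list
for "palaeo" comes back empty, which is the gap described above.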

If, as Michael desires, we are to have a comprehensive set of subtags
for the whole dialect continuum, we must solve the problem of how the
subtags are to be employed, along one of the two lines above or along
some third line that I have not thought of.

-- 
John Cowan                                   cowan at ccil.org
        "You need a change: try Canada"  "You need a change: try China"
                --fortune cookies opened by a couple that I know