Serbo-Croatian continuum: the top level

Mon Mar 3 13:02:55 CET 2014

On 2 Mar 2014, at 19:39, John Cowan <cowan at mercury.ccil.org> wrote:

> (1a) Create variant subtags and attach them to the appropriate primary-language subtags for the national standard languages.  Thus, Kajkavian would be tagged as a variant of "hr".  The difficulties here are twofold:  Kajkavian is much more different from Standard Croatian than the latter is from Standard Serbian or Standard Bosnian,

So? If a text is in Kajkavian, then the subtag will apply. Otherwise it will not. 

> and it is not clear what to do about neo-Shtokavian, which is spread across all the relevant countries.  Indeed, the three standard languages are all sub-subvarieties of neo-Shtokavian.

Why do we have to “do” anything about “neo-Shtokavian”? 

> (1b) Create variant subtags and attach them directly to the macrolanguage
> subtag 'sh', which covers the whole SCC.  This was my earlier proposal,
> and is linguistically correct as far as it goes, but tends to undermine
> the notion of a macrolanguage as a group of _languages_, by effectively
> coordinating languages with varieties.

Well, I don’t know how useful that notion is, really. 

> (1c) We can use our extraordinary powers under Section 2.2.1 subsection
> 5 of RFC 5646 and create our own primary language tags.  The RFC says
> "an attempt to register any new proposed primary language MUST be made
> to the ISO 639 registration authority".  Technically, this would only
> authorize the creation of a tag for Kajkavian, but I think we can take
> it as read that the RA would reject the others on the same grounds.

Why would this be tempting? If we registered a three-letter code, the RA could assign those three letters to something else later. 

This brings to mind another question, though. Do we have the power to create our own primary script tags?

> The disadvantages are that BCP 47 primary language tags would no longer
> automatically be ISO 639 code elements, and that the new language tags,
> though substantively encompassed by 'sh', would not formally be so (though
> there seems to be no explicit prohibition on adding Macrolanguage: fields
> to such entries).  Despite these points, I currently favor this solution.

I think I would have to be extraordinarily impressed by the usefulness of this solution to approve it. At present I’m a long way from that.

> (2) The constraints on list-created primary-language subtags and
> on variant subtags are the same: 5 to 8 characters.  The worst case
> is that we need five tags, for Kajkavian, Chakavian, neo-Shtokavian,
> palaeo-Shtokavian, and Torlakian.  We already have the subtags "ekavsk"
> and "ijekavsk", but following this slavishly would give us "nshtokavsk",
> which is too long.

Why do we need five? 
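
For reference, the length point in (2) can be checked mechanically. What follows is only a minimal sketch, assuming the ABNF in RFC 5646 Section 2.1 (a variant is 5 to 8 alphanumerics, or a digit followed by 3 alphanumerics; a registered primary language subtag is 5 to 8 letters). The function names are hypothetical, and the candidates are just the three spellings mentioned in the quoted text.

```python
import re

# Assumed from the RFC 5646 ABNF: variant = 5*8alphanum / (DIGIT 3alphanum)
VARIANT_RE = re.compile(r"(?:[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3})")

# Assumed from the RFC 5646 ABNF: registered primary language = 5*8ALPHA
REGISTERED_LANGUAGE_RE = re.compile(r"[A-Za-z]{5,8}")


def fits_variant(subtag: str) -> bool:
    """Return True if the string has the shape of a variant subtag."""
    return VARIANT_RE.fullmatch(subtag) is not None


def fits_registered_language(subtag: str) -> bool:
    """Return True if the string has the shape of a 5-8 letter primary language subtag."""
    return REGISTERED_LANGUAGE_RE.fullmatch(subtag) is not None


if __name__ == "__main__":
    # Spellings taken from the quoted message; nothing here is a proposal.
    for candidate in ["ekavsk", "ijekavsk", "nshtokavsk"]:
        print(candidate, len(candidate),
              fits_variant(candidate), fits_registered_language(candidate))
```

This checks shape only. It says nothing about whether a subtag is registered, or whether it ought to be.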

> (3) Finally, there remains the question of just which entities to
> tag.  The first three listed above are beyond doubt.  We could merge
> neo-Shtokavian and palaeo-Shtokavian into a single entity if we had to,
> though they are quite different.

Where is this written up?

> Unless I get pushback on this (and I expect and hope to do so), I'll
> propose these five subtags as primary-language subtags sometime next week.

Don’t, not yet.

Michael Everson * http://www.evertype.com/