Criteria for languages

Peter Constable petercon at
Wed Dec 9 08:58:36 CET 2009

The basic intent of macrolanguages in ISO 639-3 (and of extlang subtags in BCP 47; the original idea for extended language subtags was related to the origination of macrolanguages) can be understood from the Chinese case, a prototypical example: zh / zho had been in use for some time, and there was clear precedent for using it not only for Mandarin content but also for Cantonese, Wu and other varieties. (In fact, there were registrations under RFC 1766 that explicitly connected several of these to zh.) At the same time, "Chinese" had been perceived as an individual language, not a collection; yet Cantonese etc. are clearly and reasonably deemed distinct languages.

Now, applying this to a potential case (framed with certain assumptions): suppose XXX is used for some developed language Lx, and Ly is some less-developed language closely related to Lx. If Ly is being newly coded as YYY, then:

- If there is evidence indicating that XXX has been used reasonably widely (by some measure) for Ly, then it may make sense to rescope XXX as a macrolanguage entry that encompasses YYY (and an additional new entry XXX' to denote the developed language Lx proper, exclusive of Ly).

- In the absence of such evidence, it probably makes most sense to deem XXX as denoting only Lx, in which case the scope of XXX is unchanged.

Now, if the first case applies, and XXX becomes deemed a macrolanguage, then (and only then) do we have a potential option to register YYY (and XXX') as extlangs. (Though John has, I think, been arguing that this is a hypothetical option only, and not a real option under an assumed interpretation of RFC 5646.)
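
The two cases above can be written out as a small decision sketch. This is only an illustration of the reasoning in this message, not a procedure defined by ISO 639-3 or RFC 5646; the names `Outcome` and `rescope` and the boolean `widely_used_for_ly` (standing in for "evidence of reasonably wide use, by some measure") are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Outcome:
    """Hypothetical record of what happens to XXX when Ly is coded as YYY."""
    xxx_scope: str                                   # "macrolanguage" or "individual"
    new_entries: list = field(default_factory=list)  # codes newly added
    extlang_candidates: list = field(default_factory=list)


def rescope(xxx: str, yyy: str, widely_used_for_ly: bool) -> Outcome:
    """Apply the two cases from the message (illustrative only)."""
    if widely_used_for_ly:
        # Case 1: XXX is rescoped as a macrolanguage encompassing YYY,
        # and a new entry XXX' denotes Lx proper (exclusive of Ly).
        xxx_proper = xxx + "'"
        return Outcome(
            xxx_scope="macrolanguage",
            new_entries=[yyy, xxx_proper],
            # Only in this case is extlang registration even a potential option.
            extlang_candidates=[yyy, xxx_proper],
        )
    # Case 2: no such evidence, so XXX keeps denoting only Lx.
    return Outcome(xxx_scope="individual", new_entries=[yyy])


print(rescope("XXX", "YYY", widely_used_for_ly=True))
print(rescope("XXX", "YYY", widely_used_for_ly=False))
```

Whether the extlang candidates in the first case are a real option or only a hypothetical one is exactly the open question noted above; the sketch records them as candidates and nothing more.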


From: ietf-languages-bounces at On Behalf Of CE Whitehead
Sent: Tuesday, December 08, 2009 4:57 PM
To: ietf-languages at
Subject: Criteria for languages

Hi, I realize that Latgalian/Latvian and Samogitian/Lithuanian are a bit different from the Walliser German and Walser German issue, in that there's no real standard that is close to Walliser German. The only similarity is that, in each case, having a macrolanguage might make it easier to match older content with content that has been recently created and tagged with the new subtags (assuming these go through)?

I'm less sure about the purpose of making any of these new subtags extension languages however.

In any case, my understanding has been that, should Latvian be made a macrolanguage and Latgalian an extension language, there would still be a Latgalian language subtag as well, and that [ltg] would thus be the best choice of subtag, although [lv-ltg] (is that right: the macrolanguage subtag would be lv, not lav?) would be well-formed.

So there is one best way to tag documents if the tag is created, regardless of whether Latgalian also becomes an extension language.

I hope that's right.  From the discussion I think not.
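
The understanding described above (one preferred tag, with the extlang form also well-formed) can be illustrated with a short sketch. It follows the way RFC 5646 treats existing extlangs such as yue under zh, where the preferred form is the bare language subtag. The `ltg` entry is hypothetical (it only applies if Latgalian were actually registered as an extlang with prefix `lv`), and `EXTLANG_PREFIX` and `preferred_form` are made-up names, not a real library API.

```python
# Extlang subtag -> required prefix (the macrolanguage's subtag).
# "yue" under "zh" reflects the existing Chinese case;
# "ltg" under "lv" is HYPOTHETICAL, for illustration only.
EXTLANG_PREFIX = {
    "yue": "zh",
    "ltg": "lv",
}


def preferred_form(tag: str) -> str:
    """Drop a macrolanguage prefix when it is followed by its extlang.

    Simplified sketch: handles only a single extlang in second position
    and leaves all other subtags untouched.
    """
    subtags = tag.split("-")
    if len(subtags) >= 2:
        prefix, candidate = subtags[0].lower(), subtags[1].lower()
        if EXTLANG_PREFIX.get(candidate) == prefix:
            # The extlang form is well-formed, but the bare subtag is preferred.
            return "-".join([candidate] + subtags[2:])
    return tag


for t in ["lv-ltg", "ltg", "lv", "zh-yue-HK"]:
    print(t, "->", preferred_form(t))
```

On the lv/lav question: BCP 47 uses the two-letter ISO 639-1 code where one exists, so the Latvian subtag is lv rather than lav.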


C. E. Whitehead

cewcathar at