Criteria for languages

Peter Constable petercon at
Thu Dec 10 02:33:42 CET 2009

You're trying to think of why it would be useful to use extended language subtags* in some general sense. I was simply trying to account for how they arose and the pre-existing practice that led to them - which didn't necessarily arise because it made particular sense.

There were long discussions as 4646bis was worked on debating ways in which extlang subtags might be / are / are not useful, and there was no clear consensus on that topic. We'll see how eager people are to take that up again now.


From: CE Whitehead [mailto:cewcathar at]
Sent: Wednesday, December 09, 2009 11:34 AM
To: Peter Constable; ietf-languages at
Subject: RE: Criteria for languages

Hi, Peter,

thanks for your history of the Chinese case.

However, what confuses me is why some languages are made extension languages.

It makes sense to me to make a language an extension language of a macrolanguage if all the extension languages under that macrolanguage are generally written more or less identically (as is the case for the Chinese languages written in Chinese script; similarly in Arabic, standard Arabic is the only written form).  Then it would make sense to allow users to tag these languages using both the macrolanguage code and the extension-language code, whereas if the languages appear different in writing, I don't see any reason to tag them with anything but their own unique code, ever.

I don't know if this idea is off-base or not.  If it is not, then I would not think that either the Latvian/Latgalian pair or the Lithuanian/Samogitian pair would need to be registered as extension languages once the new codes are in place (assuming these codes are approved).


C. E. Whitehead
cewcathar at

From: petercon at
To: cewcathar at; ietf-languages at
Subject: RE: Criteria for languages
Date: Wed, 9 Dec 2009 07:58:36 +0000

> The basic intent of macrolanguages in ISO 639-3 (and extlang subtags in BCP 47 - the original idea for extended language subtags was related to the origination of macrolanguages) can be understood from the Chinese cases - a prototypical example: zh / zho had been in use for some time, and there was clear precedent for using that for not only Mandarin content but also for Cantonese, Wu and other varieties. (In fact, there were registrations under RFC 1766 that explicitly connected several of these to zh.) At the same time, "Chinese" had been perceived as an individual language, not a collection; yet clearly Cantonese etc. are reasonably deemed distinct languages.

> Now, applying this to a potential case (framed with certain assumptions): suppose XXX is used for some developed language Lx, and Ly is some less-developed language closely related to Lx. If Ly is being newly coded as YYY, then:

> - If there is evidence indicating that XXX has been used reasonably widely (by some measure) for Ly, then it may make sense to rescope XXX as a macrolanguage entry that encompasses YYY (and an additional new entry XXX' to denote the developed language Lx proper - exclusive of Ly)
> - In the absence of such evidence, it probably makes most sense to deem XXX as denoting only Lx, in which case the scope of XXX is unchanged

> Now, if the first case applies and XXX is deemed a macrolanguage, then (and only then) do we have a potential option to register YYY (and XXX') as an extlang. (Though John has, I think, been arguing that this is a hypothetical option only, and not a real option under an assumed interpretation of RFC 5646.)

> Peter
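
The two cases in the quoted message can be sketched as a small decision function. This is only an illustration of the reasoning as stated there: the names `CodingOutcome` and `decide_coding` are hypothetical, the "evidence of reasonably wide use" is reduced to a single boolean supplied by the caller, and the extlang flag marks a potential option only, since the message notes it may not be a real option under RFC 5646.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodingOutcome:
    """Hypothetical summary of what happens when Ly is newly coded as YYY."""
    xxx_scope: str               # "macrolanguage" or "individual"
    new_entry_for_lx_proper: bool  # whether an XXX' entry is added for Lx alone
    extlang_potentially_open: bool  # whether registering YYY as an extlang is even conceivable


def decide_coding(xxx_widely_used_for_ly: bool) -> CodingOutcome:
    """Apply the two cases from the message.

    xxx_widely_used_for_ly: assumed to be the caller's judgment of whether
    XXX has been used reasonably widely (by some measure) for Ly.
    """
    if xxx_widely_used_for_ly:
        # Case 1: rescope XXX as a macrolanguage encompassing YYY,
        # and add XXX' for the developed language Lx proper.
        return CodingOutcome("macrolanguage", True, True)
    # Case 2: XXX keeps denoting only Lx; its scope is unchanged,
    # so no extlang option arises.
    return CodingOutcome("individual", False, False)


if __name__ == "__main__":
    print(decide_coding(True))
    print(decide_coding(False))
```

The point of the sketch is the dependency: the extlang question only comes up after XXX has been rescoped as a macrolanguage, never before.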
