Criteria for languages

CE Whitehead cewcathar at
Wed Dec 9 20:33:30 CET 2009

Hi, Peter, 


Thanks for your history of the Chinese case.


However, what confuses me is why some languages are made extension languages at all.


It makes sense to me to make a language an extension language of a macrolanguage if all the extension languages under that macrolanguage are generally written more or less identically (as is the case for the Chinese languages written in Chinese script; similarly, for Arabic, Standard Arabic is the only written form).  Then it would make sense to allow users to tag these languages using both the macrolanguage code and the extension-language code.

If, on the other hand, the languages appear different in writing, I don't see any reason to tag them with anything but their own unique code, ever.


I don't know if this idea is off-base or not.  If it is not, then I would not think that either the Latvian/Latgalian pair or the Lithuanian/Samogitian pair would need to be registered as extension languages once the new codes are in place (assuming these codes will be approved).
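
To make the two tagging forms concrete, here is a minimal Python sketch for the Chinese case. The table is a small hand-written excerpt (yue, cmn and wuu are registered as extlang subtags with prefix zh); the function name tag_forms and the placeholder code "yyy" are hypothetical, used only for illustration, and this is not a full BCP 47 implementation.

```python
# Hand-written excerpt for illustration: extlang subtag -> its macrolanguage prefix.
# Only the Chinese entries mentioned in this thread are listed.
EXTLANG_PREFIX = {"yue": "zh", "cmn": "zh", "wuu": "zh"}


def tag_forms(language: str) -> list[str]:
    """Return the tag forms available for a language code.

    Every language can be tagged with its own code; a language that is
    also registered as an extlang can additionally be tagged as
    "<macrolanguage>-<extlang>".
    """
    forms = [language]
    prefix = EXTLANG_PREFIX.get(language)
    if prefix is not None:
        forms.append(f"{prefix}-{language}")
    return forms


print(tag_forms("yue"))  # Cantonese: own code plus the zh-prefixed form
print(tag_forms("yyy"))  # hypothetical code with no extlang registration
```

Under the idea above, a language whose written form differs from its relatives would behave like the "yyy" placeholder: it would have only its own code.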




C. E. Whitehead

cewcathar at
From: petercon at
To: cewcathar at; ietf-languages at
Subject: RE: Criteria for languages
Date: Wed, 9 Dec 2009 07:58:36 +0000

> The basic intent of macrolanguages in ISO 639-3 (and extlang subtags in BCP 47 – the original idea for extended language subtags was related to the origination of macrolanguages) can be understood from the Chinese cases – a prototypical example: zh / zho had been in use for some time, and there was clear precedent for using that for not only Mandarin content but also for Cantonese, Wu and other varieties. (In fact, there were registrations under RFC 1766 that explicitly connected several of these to zh.) At the same time, “Chinese” had been perceived as an individual language, not a collection; yet clearly Cantonese etc. are reasonably deemed distinct languages.
> Now, applying to some potential case (framing with certain assumptions): Suppose XXX is used for some developed language Lx and Ly is some less-developed language closely related to Lx; if Ly is being newly coded as YYY, then:
> - If there is evidence indicating that XXX has been used reasonably widely (by some measure) for Ly, then it may make sense to rescope XXX as a macrolanguage entry that encompasses YYY (and an additional new entry XXX’ to denote the developed language Lx proper – exclusive of Ly)
> - In the absence of such evidence, it probably makes most sense to deem XXX as denoting only Lx, in which case the scope of XXX is unchanged
> Now, if the first case applies, and XXX becomes deemed a macrolanguage, then (and only then) do we here have a potential option to register YYY (and XXX’) as an extlang. (Though John has, I think, been arguing that this is a hypothetical option only, and not a real option under an assumed interpretation of RFC 5646.)
> Peter
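
The two cases in Peter's message can be sketched as a small decision function. This is only an illustration of the reasoning as quoted; the names Outcome and rescope are hypothetical, and XXX, XXX' and YYY are the placeholders from the message, not real codes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    xxx_scope: str          # "macrolanguage" or "individual"
    new_entries: tuple      # entries that get newly coded
    extlang_possible: bool  # is registering extlangs even a potential option?


def rescope(xxx_widely_used_for_ly: bool) -> Outcome:
    """Apply the two cases described in the quoted message."""
    if xxx_widely_used_for_ly:
        # Case 1: XXX becomes a macrolanguage encompassing YYY, and a new
        # entry XXX' denotes Lx proper. Only here is extlang an option.
        return Outcome("macrolanguage", ("YYY", "XXX'"), True)
    # Case 2: XXX keeps denoting only Lx; its scope is unchanged.
    return Outcome("individual", ("YYY",), False)


print(rescope(True))
print(rescope(False))
```

Note that extlang_possible marks only a potential option; as the message says, whether it is a real option under RFC 5646 is itself disputed.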