New item in ISO 639-2 - Zaza

John Cowan cowan at
Thu Aug 24 20:55:15 CEST 2006

Mark Davis scripsit:

> I think you may have gotten the numbering wrong. You say:

Oops, quite right.  "Post in haste, repent at leisure."

> Under #2, all of a sudden 'ams', 'kzg', 'ryn', 'tkn', 'okn', 'ryu', 'xug',
> 'yox', 'mvi', 'rys', and 'yoi', and everything using them as the language
> subtag, would be deprecated; a big change.

True, and a very good reason why we should stop such things from happening
at the RA/JAC level.

> Under #1, we just add 'rkn' as a code. We would add information to the
> registry that it has become a macrolanguage for 'ams', 'kzg', 'ryn', 'tkn',
> 'okn', 'ryu', 'xug', 'yox', 'mvi', 'rys', and 'yoi'; it is then up to
> people implementing fallback algorithms to add that information, but it
> doesn't cause any subtags to be deprecated.

Currently we have no such facility in the registry and no provision for
it in the matching draft.
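
To make concrete what #1 would ask of implementers, here is a rough
sketch, not anything the registry or the matching draft defines.  The
table MACROLANGUAGE_OF and the function fallback_with_table are
hypothetical names invented for illustration; the codes are the ones
from the example above.  The point is that the matcher has to carry
extra data beyond the tag itself.

```python
# Hypothetical table: the registry has no such field today, so an
# implementer would have to supply this mapping out of band.
MACROLANGUAGE_OF = {
    code: "rkn"
    for code in ("ams", "kzg", "ryn", "tkn", "okn", "ryu",
                 "xug", "yox", "mvi", "rys", "yoi")
}


def fallback_with_table(tag, available):
    """Return the first supported tag for `tag`, or None.

    Tries the tag and its shorter prefixes first, then (as an assumed
    refinement) the macrolanguage of the primary language subtag.
    """
    supported = {a.lower(): a for a in available}
    subtags = tag.lower().split("-")
    # Plain truncation: "ryu-JP" -> "ryu-jp", "ryu"
    candidates = ["-".join(subtags[:n]) for n in range(len(subtags), 0, -1)]
    # Table-driven step: only possible if the mapping is available.
    macro = MACROLANGUAGE_OF.get(subtags[0])
    if macro is not None:
        candidates.append(macro)
    for candidate in candidates:
        if candidate in supported:
            return supported[candidate]
    return None
```

Without the table, a request for "ryu-JP" against content tagged "rkn"
simply finds nothing.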

> I still don't understand something. Why should the existence of a
> macrolanguage that contains X make the difference between whether "X-...."
> is valid or not? As I understand it, the difference between cmn and hak, for
> example, is no less than the difference between de and nl; why should we say
> that cmn-CN and hak-CN must have the prefix zh-, where we don't require a
> prefix for de and nl?

History.  The tag "zh" has been around a long time, and used to represent
documents (using that term broadly) in all kinds of Sinitic languages.
Furthermore, we already have an existing pattern of use in zh-cmn,
zh-yue, zh-this, zh-that.

What's more, people actually do (and this is critical) refer to the
Sinitic languages as a single language, "Chinese".  Nobody has done
that for German/Dutch since the 17th century; certainly 639-2 is not
likely to introduce such a single-language code element.  (639-2 or
639-5 might conceivably include a language-collective code element for
"Non-Ingvaeonic West Germanic languages", but that wouldn't affect us;
we have no reason to take language-collection subtags into account except
for those provided in 639-2.)

So it's not just that languages are related that causes a macrolanguage
code element to be created, nor even that the cluster of languages
are treated in some contexts as a single language.  It is further a
requirement, at least in practice, that 639-2 has an individual-language
code for it.  The main benefit of solution #2 is that it provides fallback
under the basic matching algorithm itself, not under refinements of it that
depend on having the registry available at matching time.
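
As a minimal sketch of that benefit, assuming "basic matching" means
nothing more than dropping trailing subtags until something matches
(the draft has further details, such as the handling of singletons,
that are omitted here), and with function names of my own invention:

```python
def truncation_candidates(tag):
    """Return the tag followed by each shorter prefix, longest first."""
    subtags = tag.split("-")
    return ["-".join(subtags[:n]) for n in range(len(subtags), 0, -1)]


def basic_lookup(tag, available):
    """Return the first supported prefix of `tag`, or None.

    Uses only the tag itself; no registry data is consulted.
    Comparison is case-insensitive.
    """
    supported = {a.lower(): a for a in available}
    for candidate in truncation_candidates(tag):
        hit = supported.get(candidate.lower())
        if hit is not None:
            return hit
    return None
```

With content tagged "zh", a request for "zh-cmn-CN" truncates to
"zh-cmn" and then to "zh" and matches; a request for "cmn-CN" truncates
only to "cmn" and finds nothing unless the matcher knows something more.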

                Si hoc legere scis, nimium eruditionis habes.
