Language for taxonomic names, redux
doug at ewellic.org
Thu Feb 23 18:17:21 CET 2017
Andy Mabbett wrote:
>> “Which language are taxonomic names "divisions or variations within"?
>> As I noted all those years ago, they're not Latin, and not English.”
>> Then you’re not looking for a subtag. That won’t fly.
> The subtag suggestion was Doug Ewell's; I asked, initially (in 2003)
> for "a language code for marking up the taxonomic names of living
> things", and in my first post on this thread, for "suggestions as to
> how [how to indicate the language of these names] might finally be
Subtags are what we do here.
If you have some linguistic content that you want to tag, and there
isn't currently a (non–private-use) way to tag it, your choices for
1. Create a primary language subtag.
This requires asking ISO 639-3/RA first, waiting for them to say no, and
then persuading the Reviewer and group to overrule the RA and declare
that they were wrong, this really is a language. For one thing,
essentially nobody wants to go this route, and for another, there seems
to be scant agreement that this is a language.
2. Create a variant subtag.
This means the content is in some variant form of an existing language.
In our case, the tag would indicate that the content is a form of Latin or
English or Xaasongaxango or something. The registration itself specifies
a Prefix, which is the language (plus any other necessary subtags) that
this variant can be a form of. There can be more than one Prefix (see
'baku1926'), meaning the variant could apply to any of them, or none
(see 'fonipa'), meaning it could apply to any language at all. That last
case has to be employed carefully; it can't be a way of avoiding the
"what language is this" discussion.
3. Use an existing "special" tag.
There are language subtags like 'zxx' for "No linguistic content/not
applicable" or 'mis' for "Uncoded languages" that can be used for
tagging miscellaneous stuff. These don't provide any real information at
a glance, and come with tons of caveats that basically say "Don't do
4. Create an extension.
This usually means you are trying to tag something that is outside the
realm of language per se, such as collation or translation information.
It's arguable that snippets of specialized terminology fall into this
category, but creating an extension is an expensive process: you have to
write an Internet-Draft and shepherd it through IETF ("submit" is not
the half of it) until it is approved as an RFC.
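
To make the Prefix rule from option 2 concrete, here is a minimal Python sketch. The table is a hand-written, illustrative subset and not a copy of the registry (the real 'baku1926' record lists more Prefixes than shown), and the names VARIANT_PREFIXES and variant_fits are made up for this example. It checks only whether a variant matches its registered Prefixes, not whether the tag is otherwise well-formed.

```python
# Illustrative subset of variant registrations (assumption: hand-copied for
# this example; consult the Language Subtag Registry for the real records).
VARIANT_PREFIXES = {
    "baku1926": ["az", "tt"],  # more than one Prefix (subset shown)
    "fonipa": [],              # no Prefix: can apply to any language
}


def variant_fits(tag: str, variant: str) -> bool:
    """Return True if `variant` matches one of its Prefixes for `tag`.

    `tag` is the language tag the variant would be appended to,
    e.g. "az" or "az-Latn". A variant with no Prefix fits any tag.
    """
    prefixes = VARIANT_PREFIXES[variant]
    if not prefixes:
        return True
    tag = tag.lower()
    # A Prefix matches if it is the whole tag or a leading run of subtags.
    return any(tag == p.lower() or tag.startswith(p.lower() + "-")
               for p in prefixes)


print(variant_fits("az", "baku1926"))  # matches a listed Prefix
print(variant_fits("la", "baku1926"))  # 'la' is not a listed Prefix
print(variant_fits("la", "fonipa"))    # no Prefix, so anything goes
```

The empty list for 'fonipa' is what makes the "any language at all" case so permissive, which is why it has to be used carefully.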
Of course, there's also the private-use option. All of the content we
have been talking about could be tagged today as "x-taxon" with no
effort from us. But spell checkers and TTS engines won't honor it.
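
For anyone who wants to at least recognize such tags in their own tooling, here is a rough sketch of pulling out the private-use part. It assumes the input is already a well-formed BCP 47 tag, in which case the first single-letter 'x' subtag starts the private-use sequence. The function name is made up for this example, and "la-x-taxon" is a hypothetical tag used only to show the non-initial case.

```python
def private_use_subtags(tag: str) -> list[str]:
    """Return the subtags after the 'x' singleton, or [] if there is none.

    Assumes `tag` is well-formed; no validation is done here.
    """
    subtags = tag.lower().split("-")
    if "x" not in subtags:
        return []
    return subtags[subtags.index("x") + 1:]


for tag in ("x-taxon", "la-x-taxon", "zxx", "mis"):
    print(tag, private_use_subtags(tag))
```

Software that has no private agreement about what "taxon" means will see only an opaque private-use subtag, which is the limitation described above.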
BCP 47 as we know it today didn't exist in 2003. Back then, the request
would have been for a whole tag, such as "la-somethng". One thing we
have never been able to do is create a new two- or three-letter language
tag or subtag on our own; we have always respected ISO 639's claim to
Doug Ewell | Thornton, CO, US | ewellic.org