Language for taxonomic names, redux

Luc Pardon lucp at
Wed Mar 1 18:00:36 CET 2017

On 28-02-17 18:57, Peter Constable wrote:
>> it should be clear already that Latin words are not English words
> But it is not clear whether words should be considered Latin or English words borrowed from Latin. E.g., several dictionaries (e.g., Merriam-Webster, Cambridge, Oxford) consider "homo sapiens" to be English. 

   As a matter of fact, "homo sapiens" is also in the Dutch "red book"
(the official 2005 spelling wordlist). It even lists the plural
("homines sapientes"). Same for "homo erectus" (homines erecti) and
"homo ludens" (homines ludentes).

  So all of these are - at least officially - Dutch words.

  On the other hand, "Homo neanderthalensis" is _not_ in the list, and
therefore it is _not_ a Dutch word (but "neanderthaler" is, of course).

> I don't know of a clear basis for drawing a line between code
> switching for technical terminology (switch to Latin for taxonomic
> labels) or borrowings (these taxonomic labels have been in use in
> technical contexts for so long that they get taught without distinction
> from English-origin technical terms).

  I take it that the above is not meant as a "significant objection"
against the registration of a subtag for taxonomic names?

  Because if it is, I'd have to ask - with Michael - how you propose to
handle "Camellia sinensis" and the many thousands of other terms that I
do not expect to be in any (non-specialized) dictionary of the English
language.

  Besides, I'd venture that your problem (of "not having a clear basis
for drawing a line") is caused by the fact that English does not have a
single authoritative body to standardize the spelling, and therefore one
has to consult several dictionaries. I would agree that this is "vague".

  However, other languages, such as Dutch, _do_ have official wordlists
and thus there _is_ a clear basis for drawing the line between foreign
words and borrowings.

  The fact that this is not possible in English must not weigh on the
decision to register (or not) a subtag that is intended for use by
scientists worldwide.

  Also, the "problem" is not limited to tagging. A taxonomic name has
capitalization rules that are different from a borrowing. How is an
English spell checker supposed to "draw the line"?

  Even so, as to tagging "homo sapiens", I'd agree with Michael that it
depends on the context, even in Dutch.

  * In a Dutch anthropological journal, I'd expect it to be tagged as
"Latin", along with any other taxonomic names in there. Besides, I'd
expect it to have an initial capital, and that would be flagged as an
error by a spell checker if it wasn't tagged (or it may try to be smart
and insert the "forgotten" full stop after the preceding word...).

  * In a Dutch non-scientific text, I'd _probably_ not tag it (and apply
the standard capitalization rules, i.e. all lowercase when in mid-sentence).

  From an accessibility viewpoint, that would be perfectly acceptable.
That is, because "homo sapiens" is in the list, I would expect a
Dutch-speaking screen reader to speak "homo sapiens" in an
understandable way, even when it is not tagged.

  But again, taxonomic words that are manifestly _not_ part of the
"surrounding" language (be it Dutch or English or whatever), must be
able to be tagged as such, so that the screen reader knows to switch
into (pig) Latin mode.

  Imagine what would happen if "homo sapiens" was _not_ in the Dutch
wordlist, and if it was not tagged as Latin either.

  Any sighted Dutch speaker would recognize it as Latin-like anyway, and
pronounce it as "sa-pi-ens", with the stress on the first syllable (the
way our Latin teacher would have pronounced it).

  A screenreader would not have such insight, and therefore it would
apply the regular pronunciation rules for Dutch (as it would do for any
word that is not found in its database, such as proper names). Therefore
it would blithely utter "sa-piens", where the -ie- is pronounced as a
longer version of a single -i- (sounding more or less like an English
chicken - the word, I mean) and it also would put the stress on the last
syllable.

  It would take a lot of creativity to reverse-engineer that strange,
unfamiliar sound back into the original word.

  To prevent the screen reader from doing that, one would either have to
write it as "homo sapiëns" (thereby confusing the sighted reader), or
one could tag it with la-taxon (thereby telling the TTS that it has to
talk like a Roman).
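
  The fallback behaviour described above can be sketched in a few lines
of Python. This is only an illustration under stated assumptions: the
wordlist contents, the function name choose_pronunciation and the
returned labels are hypothetical, not taken from any real screen reader,
and "la-taxon" is simply the tag discussed in this thread.

```python
from typing import Optional, Tuple

# Hypothetical excerpt of an official wordlist (illustration only).
DUTCH_WORDLIST = {"homo sapiens", "homo erectus", "homo ludens", "neanderthaler"}


def choose_pronunciation(
    term: str, lang_tag: Optional[str] = None, surrounding: str = "nl"
) -> Tuple[str, str]:
    """Return (language, basis) a text-to-speech engine might use for a term.

    Assumed model: an explicit tag wins; otherwise a term found in the
    wordlist gets its stored pronunciation; otherwise the engine falls
    back to the letter-to-sound rules of the surrounding language.
    """
    if lang_tag:
        # Keep only the primary language subtag, e.g. "la" from "la-taxon".
        return (lang_tag.split("-")[0].lower(), "tag")
    if term.lower() in DUTCH_WORDLIST:
        return (surrounding, "wordlist")
    return (surrounding, "letter-to-sound rules")


# A tagged taxonomic name is read as Latin.
print(choose_pronunciation("Camellia sinensis", "la-taxon"))
# An untagged name outside the wordlist gets the Dutch fallback rules.
print(choose_pronunciation("Homo neanderthalensis"))
```

  The third branch is the case that produces the unrecognisable
"sa-piens"; the tag is what moves a term out of that branch.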

  I am using "homo sapiens" as an example, hoping that the above
description is clear, even to non-Dutch speakers, but the same reasoning
would apply to any and all taxonomic names that are not in the official
wordlist.

  I also hope that it is obvious that this is not acceptable, and that
the tag is definitely needed to provide guidance to screen readers.

