Language for taxonomic names, redux

Wed Mar 1 15:30:23 CET 2017

On 01-03-17 02:24, Andrew Cunningham wrote:
> Regarding, accessibility considerations:
> 
> It's more of a question for WAI
> 
> it would seem accessibility would benefit if:
> 
> 1) the tag exclusively identified what taxonomic system was used, and
> 2) the tag would be sufficient for software to know what pronunciation
> rules there are
> 
> la-taxon could be too vague to have any accessibility benefit.
> 
> It globally more than one pronunciation exists for such taxonomic terms
> and if other taxonomies need to be supported. It would seem more
> beneficial to do the hard yakka and propose an extension mechanism that
> could identify which taxonomy system is being used. And what
> pronunciation system should be used.
> 

I am not sure why you are saying this, and you also state:

> But I am not sure such a [s]ystem is needed. You would need to consult accessibility experts.

I most definitely would _not_ consider myself an accessibility expert,
and indeed, I have often thought that it would be helpful if I could get
some members of the W3C WAI list to come over here and participate.

In the meantime, let me give it a shot.

To establish common ground first, and since Wikipedia is so frequently
referred to on this list, I'll quote from their page on "Web
accessibility" (which is what we're discussing here).

See: https://en.wikipedia.org/wiki/Web_accessibility

It says:

> Web accessibility refers to the inclusive practice of removing barriers that prevent interaction with, or access to websites, by people with disabilities. When sites are correctly designed, developed and edited, all users have equal access to information and functionality

Amen to that.

Now, your concerns. You suggest that "la-taxon could be too vague to
have any accessibility benefit", and that this may be the case:

> [if] globally more than one pronunciation exists for such taxonomic terms

The same can be said for many other (real) languages, and yet we tag
them with a "one size fits all" tag.

For example, there is more than one pronunciation for English, but we
tag them all with "en". That is because we're tagging texts, i.e. the
written form of the language. Where the writing differs (and matters),
the tag can be made more specific (en-US vs. en-GB, or en-GB-oxendict),
until it is precise enough for a spell checker to do its job - i.e.
checking the written form.

The pronunciation issues come in when we're talking (pun intended) about
text-to-speech engines.

A TTS converts the written form into sound, for the benefit of the user
that cannot access the written text directly. Conceptually, it's exactly
the same as a human who's reading a text aloud over the phone, or
reading from a story book to a young child that hasn't learned to read
yet. Here too, the listener cannot access the written text directly, and
reading it out to them removes the barrier.

Given that there are different regional accents in many languages, the
sound that is produced from any given text in that language will vary
with the different speakers. Therefore, the user of a TTS engine can
select the accent that is most familiar to him. That way, he can
concentrate on _what_ is being said, instead of having to "decipher" the
sounds first. That is part and parcel of "removing the barrier". In
fact, an unfamiliar accent could actually _add_ an extra barrier to the
spoken form, a barrier that the written form does not have.

Therefore, the developer of a TTS language module can (and typically
does) provide several different "voices", e.g. a Scottish voice, a Texan
voice, an Irish voice, ...

To be able to do that, the developer must be familiar with these regional
variations, i.e. he must know how a Scottish person would read an
English text aloud over the phone or to her child, and try to mimic that.

Now, if taxonomic terms are indeed pronounced differently by native
speakers of different languages, that is conceptually no different from
regional accents within any given language. Therefore, it's up to the
developer of the TTS language module(s) - and not the tagger - to
provide the proper sounds. To do that, the developer may want to enlist
the help of an English botanist, or a French zoologist, or a German
anthropologist - unless he's a "taxonomist" himself, of course.

And if accents should matter, he would need a Texan botanist, and an
Irish one, and a Scottish one, and so on. On the other hand, if accents
don't matter, he can back out one level and make do with a "plain"
English botanist, and the same "English taxonomist sounds" can serve for
all his English "voices".

If, on the other hand, it should turn out that these words are
pronounced in the same way by taxonomists all over the world - as has
been claimed - that's even better, because then he can back out one more
level, and the same "la-taxon" module could serve for _all_ language
modules.
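
The "backing out one level" idea can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the
registry, the voice names and the function names are all hypothetical,
and no real TTS engine is claimed to work this way.

```python
# Hypothetical registry of pronunciation modules for "la-taxon" text.
# Key: (tag of the text, voice chosen by the listener); None means
# "any voice". Here nobody has made a Scottish botanist yet.
MODULES = {
    ("la-taxon", "en"): "plain English botanist",
    ("la-taxon", None): "worldwide taxonomist",
}


def voice_fallbacks(voice):
    """Yield the voice, then ever shorter prefixes, then None."""
    parts = voice.split("-")
    for n in range(len(parts), 0, -1):
        yield "-".join(parts[:n])
    yield None


def pick_module(text_tag, voice, modules=MODULES):
    """Return the most specific module available, or None."""
    for candidate in voice_fallbacks(voice):
        if (text_tag, candidate) in modules:
            return modules[(text_tag, candidate)]
    return None


print(pick_module("la-taxon", "en-scotland"))
print(pick_module("la-taxon", "fr"))
```

The tagger only ever supplies "la-taxon"; the second argument is the
listener's own choice, so adding a more specific module later does not
require retagging any text.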

Now, if there are indeed differences among botanists worldwide, it is
not unlikely that they (the differences, not the botanists) could be
handled by providing generic rules for the taxonomic names, e.g. where
to put the stress, and then apply the rules for the pronunciation of
vowels in each language (and/or accent) on top of that. I'd bet that
this simple exercise would get you pretty close.

For example, in "Camellia sinensis", the sound of the first -i- in
"sinensis" may vary, but I'd expect the stress to be on the -e- in any
language.

This is just a guess, based on the limited set of languages that I
myself am somewhat familiar with. But if my guess is wrong, that does
not make the concept invalid. It would simply take more work to "create"
a "cockney botanist", that's all.
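
To make the "generic stress rule" idea concrete, here is a rough
Python sketch. It assumes a simplified version of the classical Latin
rule (stress the second-to-last syllable if its vowel is followed by
two or more consonants, otherwise the one before it), treats every
vowel letter as its own syllable, and ignores vowel length and
diphthongs. The function names are made up for this illustration; a
real module would need a taxonomist to check and refine the rule.

```python
VOWELS = set("aeiouy")


def stressed_vowel_index(word):
    """Return the position of the vowel that carries the stress."""
    w = word.lower()
    nuclei = [i for i, ch in enumerate(w) if ch in VOWELS]
    if not nuclei:
        raise ValueError("no vowel in %r" % word)
    if len(nuclei) <= 2:
        # One or two syllables: stress the first one.
        return nuclei[0]
    penult, last = nuclei[-2], nuclei[-1]
    # Count the consonants between the last two vowels.
    if last - penult - 1 >= 2:
        return penult
    return nuclei[-3]


def mark_stress(word):
    """Lower-case the word and upper-case only the stressed vowel."""
    w = word.lower()
    i = stressed_vowel_index(w)
    return w[:i] + w[i].upper() + w[i + 1:]


for name in "Camellia sinensis".split():
    print(mark_stress(name))
```

The per-language (or per-accent) vowel sounds would then be applied on
top of the stress position; that part is deliberately left out here.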

But none of this affects the usability of the "la-taxon" tag for
facilitating access through a TTS engine.

In other words: the tagger can make do with a world-wide "la-taxon" tag
to "push" the TTS engine in the right direction, and the user of the TTS
can "pull" it towards him. This is the same as with regional accents,
where the text writer (or tagger) and the listener also have to
"cooperate" in telling the TTS what to do.

Just in case you might be interested in the inner workings of a TTS
engine and/or the issues of providing language support, the "eSpeak"
speech synthesizer website would be helpful. This is an open source
project, and - as is usual in such cases - it has instructions for
people that want to add support for additional languages and/or voices.

   http://espeak.sourceforge.net/

And more specifically, the "Languages" page at:

   http://espeak.sourceforge.net/languages.html

In particular, section 3.4 of that page gives an idea of the issues that
a developer (in this case a blind native English speaker) has to deal
with. You will notice that, for some languages, he has had help from
native speakers, and for others, he's asking for such help.

I should point out that other TTS engines have other ways of generating
sounds, so the eSpeak website is not to be taken as the "ultimate
reference guide to TTS". But it is helpful in understanding the issues
nonetheless.

As an aside, and now that I mentioned this project anyway, allow me to
elaborate on a point I tried to make yesterday, i.e. that it is not
helpful at all to say that "we must not register la-taxon because nobody
would ever add la-taxon support to a TTS engine".

As I said, that may be true for a commercial vendor. But the eSpeak TTS
engine is open source, and as you can see, there are clear instructions
on how to contribute. So all it takes is _one_ taxonomist with the
dedication to spend a weekend (or two) on it. His name may or may not be
Nobody, but he would have added support for la-taxon to a TTS engine
regardless, thereby invalidating the above prophecy.

And because the eSpeak TTS engine is used by - for example - the (free)
NVDA screen reader (https://www.nvaccess.org/), all blind scientists (in
fields that deal with taxonomies) would benefit from his efforts.

Furthermore, there are other open source projects out there, and some
other "Nobody" may add taxon support to them as well, thereby possibly
building on the knowledge that would, by then, have been incorporated
into the eSpeak project.

As "taxon support" expands in the open source world, blind scientists
may switch away in droves from their commercial screen reader. Then, the
IT departments of the institutions they're working at (universities
etc.) may follow suit and install (say) NVDA as standard software,
cancelling any license agreement they may have with the commercial
vendor (if only to prevent their help desks from having to support two
different screen readers).

That in turn may make the commercial vendor sit up and take notice, and may
motivate him to add taxon support himself (maybe even by "borrowing"
from eSpeak...).

Of course, that vendor must then try and lure back the lost customers,
and he may in fact curse us for having approved "la-taxon" in the first
place, but that should not concern us.

Secondly, you also suggest that "la-taxon" may be too vague:

> if other taxonomies need to be supported

That, I'd say, has more to do with the _meaning_ of the taxonomic name
than with accessibility.

Yes, I do agree that these names would be a barrier for me and not for a
botanist, but removing that kind of barrier is not a goal of web
accessibility design. Having "access to information" does not mean that
the information also makes sense. Think of web pages that are
written in a foreign language that you don't understand at all. The
_content_ of the page is accessible, the _meaning_ of the content is not.

Furthermore, that particular barrier would _not_ be removed by being
more precise about the particular taxonomy that is used by the author of
the text. I mean, if I don't know what a "Camellia sinensis" is, it
would not be helpful for me to know that this particular ULO ("unidentified
living object") is named according to the taxonomy as introduced by
Robert Sweet in 1818 (ain't Wikipedia wonderful?).

Therefore, one la-taxon tag can rule them all - at least all taxonomies
that are derived somehow from Latin. There is no need for the tag to be
more precise.

There would be a need to be more precise - although not for _web_
accessibility purposes - if a single name would have different meanings
in different taxonomies, but I seem to understand that this is not the
case, and this seems to be by design. So, again, one tag will do.

As an aside, the "la-taxon" tag _could_ in fact be used to remove the
"lack of knowledge" barrier as well.

It could, for example, be used by pre-publication processing software to
expand:

   <span lang="la-taxon">Camellia sinensis</span>

into:

  <span lang="la-taxon" title="A species of evergreen shrub or small tree whose leaves and leaf buds are used to produce tea">
      <a href="https://en.wikipedia.org/wiki/Camellia_sinensis">Camellia sinensis</a>
  </span>

The "title" could then be used to show the explanation to the reader on
demand, e.g. when the (sighted) user hovers over these words with the
mouse, and the "anchor" would provide a clickable hyperlink to the
Wikipedia article. Screen reader software knows how to deal with these
as well - and in fact they make much more use of the "title" attribute
than a visual browser.

Please don't say that "nobody would ever do that" because I happen to
have written an application that does just that - not for taxonomic
names but for other technical terms that may not be familiar to all of
the intended readership.

All it takes is a database with the words that you want to be "dressed
up" like this and a few lines of code for the processing, plus the
appropriate cues in the text to trigger the preprocessor into action. In
this case, that trigger would be the "la-taxon" language tag.

Note that I'm not saying that - in the case of taxonomic names - it
would be trivial to _fill_ such a database with appropriate content. I
am only saying that, from a technical point of view, it is a piece of
cake to provide the plumbing.
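
As a minimal sketch of that plumbing (not the application mentioned
above), the following Python fragment expands tagged names from a
small lookup table. The table, the function name and the regular
expression are assumptions made for this illustration; it expects
well-formed input in which the element is closed with </span>, and it
leaves unknown names untouched.

```python
import html
import re

# Hypothetical "database": taxonomic name -> (explanation, link).
TAXA = {
    "Camellia sinensis": (
        "A species of evergreen shrub or small tree whose leaves and "
        "leaf buds are used to produce tea",
        "https://en.wikipedia.org/wiki/Camellia_sinensis",
    ),
}

# The trigger: a span carrying exactly the la-taxon language tag.
SPAN = re.compile(r'<span lang="la-taxon">([^<]+)</span>')


def dress_up(markup, taxa=TAXA):
    """Add a title and a hyperlink to every known la-taxon span."""
    def expand(match):
        name = match.group(1)
        # Normalise whitespace so line-wrapped names still match.
        entry = taxa.get(" ".join(name.split()))
        if entry is None:
            return match.group(0)  # unknown name: leave as it is
        title, url = entry
        return '<span lang="la-taxon" title="%s"><a href="%s">%s</a></span>' % (
            html.escape(title, quote=True),
            html.escape(url, quote=True),
            name,
        )
    return SPAN.sub(expand, markup)


print(dress_up('Tea comes from <span lang="la-taxon">Camellia sinensis</span>.'))
```

A regular expression is enough for a sketch; a production
preprocessor would more likely use a proper HTML parser.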

The database that Caoimhín is building (see his message of yesterday)
does not look like a piece of cake either, yet he's doing it.

I hope this was somehow helpful.

Luc