Suggestion: Tag or Sub-tag for Scientific names
john at mitre.org
Mon Feb 3 11:24:18 CET 2003
> From: "Jon Hanna" <jon at spin.ie>
> 1. The binominals are part of an agreed-on ontology and hence are to
> some degree comparable to identifiers like URIs, ISBNs or UUIDs.
> 2. The binominals are internationalised by use of a language other
> than that of any given commentator.
> The second is where we may or may not have a case for using a language
> tag. As such we have 3 options:
> 1. Do not mark the binominal as being of a different language to any
> other text (i.e. "troglodytes troglodytes is very small" is English in
> its entirety and "troglodytes troglodytes est tres petit" is, no doubt
> flawed, French in its entirety).
> 2. Use "la" for the binominal.
> 3. Use "la-sci" for the binominal.
> Solution 1 lacks all information on the use. How is a system to know
> how to pronounce the term? How is a user to know how to trace the
> term? "Troglodytes troglodytes" is not English.
The same is true of URLs and your other identifier examples.
"http://www.troglodytes.com/ is a great site" may not be "English in
its entirety", but either way we don't need a =language= tag for URIs.
As for pronunciation, I assume that tagging as la-sci will not help
with "Brachypelma albopilosum" (the mixed-Greek example) or such
neologisms as "Polemistus chewbacca". Nor will it help with spelling
correction. It's also worth noting that several developing W3C
standards provide markup for pronunciation and other related
processing, for example, http://www.w3.org/TR/speech-synthesis/.
These code-switching and neologistic examples are excellent evidence, I
think, that these kinds of phrases are not, in fact, a single language,
and so they should be tagged in idiosyncratic ways. Some of them are
indeed lang="la", some are other languages, or a mix of several, or "no
language", as Martin suggests. Some should use other kinds of markup
to indicate whatever is necessary about pronunciation, spelling, etc.
It would be nice to offer a one-stop solution in the form of a new
language tag, but I just don't think it makes sense.
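To make the case-by-case idea concrete, here is a rough Python sketch of per-phrase tagging. Everything in it is an assumption for illustration only: the PHRASE_LANG table, the helper names, and especially the tag choices ("la" for the Latin name, the ISO 639-2 code "mul" for the mixed-Greek one, and an empty lang value standing in for "no language"). It is not a proposal for how any of these names ought to be tagged.

```python
from html import escape

# Hypothetical per-phrase decisions. The tag values are illustrative
# assumptions, not recommendations from the list discussion.
PHRASE_LANG = {
    "Troglodytes troglodytes": "la",   # treated as plain Latin
    "Brachypelma albopilosum": "mul",  # mixed Greek/Latin: "multiple languages"
    "Polemistus chewbacca": "",        # neologism: empty value, no language stated
}


def tag_phrase(phrase, default_lang="en"):
    """Wrap one phrase in a span that carries its own lang attribute."""
    lang = PHRASE_LANG.get(phrase, default_lang)
    return '<span lang="{}">{}</span>'.format(escape(lang, quote=True), escape(phrase))


def tag_sentence(sentence):
    """Tag every known phrase in a sentence; the rest of the text is left alone."""
    out = escape(sentence)
    for phrase in PHRASE_LANG:
        out = out.replace(escape(phrase), tag_phrase(phrase))
    return out


print(tag_sentence("Troglodytes troglodytes is very small"))
```

The point of the lookup table is that each phrase gets its own decision; no single new tag is applied across the board. Pronunciation or spelling hints would still need separate markup.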
- John D. Burger
The MITRE Corporation