Request: Language Code "de-DE-1996"
Thu, 25 Apr 2002 23:01:30 -0100
On 25 Apr 2002 at 14:43, Peter_Constable@sil.org wrote:
> If I understand you, it seems that de-1901/de-1996 is overall the more
> important distinction than de-DE/de-CH/etc. -- is that what you're intending
> to convey? If so, that perhaps suggests de-1901-DE/de-1901-CH/etc over
It's hard to say which distinction is more important; importance depends on
For software processing texts, search engines, the legal profession, programmers,
journalists, and bibliophiles, the de-1901/de-1996 distinction is more important, IMHO.
For many people writing their private webpages, and for product and service descriptions
in the fields of foods, cosmetics, gastronomy and travel, the de-DE/de-CH/de-AT
distinction is more important. Again IMHO.
Since I belong more to the first category, my personal preference is the first variant.
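To illustrate why the subtag order matters to software, here is a minimal
sketch. It assumes that tags are selected by simple prefix matching on whole
subtags, the way RFC 3066 language ranges work. The function name `matches`
and the tag lists are made up for this example and are not registered tags
or an existing API.

```python
def matches(language_range: str, tag: str) -> bool:
    """Return True if the range equals the tag or is a prefix of it
    that ends on a subtag boundary (case-insensitive)."""
    r = language_range.lower()
    t = tag.lower()
    return t == r or t.startswith(r + "-")


# Hypothetical tag sets for the two orderings under discussion.
variant_first = ["de-1901-DE", "de-1901-CH", "de-1996-DE", "de-1996-CH"]
region_first = ["de-DE-1901", "de-CH-1901", "de-DE-1996", "de-CH-1996"]

# With the orthography first, one range selects all old-spelling texts.
old_spelling = [t for t in variant_first if matches("de-1901", t)]

# With the region first, the same range selects nothing; a prefix range
# can only select by country here.
old_spelling_region_first = [t for t in region_first if matches("de-1901", t)]
germany_only = [t for t in region_first if matches("de-DE", t)]
```

Whichever distinction comes first is the one a plain prefix match can select
on; the other needs extra parsing of the tag.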
In a more general scope, we are discussing *technical* tagging. What are we
tagging for? Who or what will use these tags, and what for?
My thoughts on these questions:
People are more fault-tolerant and flexible than machines, including software.
For human readers, this tagging is a hint, not a necessity; but not all software can
afford to apply a separate heuristic to detect which language and variant is given.
Thus, I tend towards preferring the first variant.
When matching words, the chances you will encounter unknown words are pretty
high - proper names, for example, are not contained within most
dictionaries. So if the languages differ in vocabulary, this will apply as well.
But the orthography relates to the way known words are written. When the orthography
changes, direct matching will miss many words it would otherwise have matched.
Not recognising a word you could have recognised is worse, in most contexts I can
think of, than not recognising a word you don't know anyway.
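The matching argument can be sketched as follows. This is only an
illustration under stated assumptions: the word list, the old-to-new spelling
table and the function `recognise` are hypothetical stand-ins, not a real
dictionary or library. The point is that a tag naming the orthography lets
software pick the right spelling table without a detection heuristic.

```python
# Hypothetical mini-dictionary, stored in the 1996 orthography.
DICTIONARY_1996 = {"dass", "muss", "fluss", "schifffahrt"}

# Hypothetical table mapping 1901 spellings to their 1996 forms.
OLD_TO_NEW = {
    "daß": "dass",
    "muß": "muss",
    "fluß": "fluss",
    "schiffahrt": "schifffahrt",
}


def recognise(word: str, tag: str) -> bool:
    """Look the word up directly; if the tag carries the 1901 subtag,
    also try the 1996 spelling of the word."""
    w = word.lower()
    if w in DICTIONARY_1996:
        return True
    if "1901" in tag.lower().split("-"):
        return OLD_TO_NEW.get(w) in DICTIONARY_1996
    return False


# Untagged old-spelling text: a known word is missed by direct matching.
missed = recognise("daß", "de")

# Same word with the orthography tagged: it is recognised.
found = recognise("daß", "de-1901")

# A proper name stays unknown either way, which is the lesser problem.
unknown_name = recognise("Constable", "de-1901")
```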
metabit * software and networks * heterogenous,distributed,generative
Fon:(+49)228/242488-0 * Fax: (+49)228/242488-7
address: Kurfürsten-11 * D-53115 Bonn * Germany