Alemannic & Swiss German
Gerard Meijssen
gerardm at wiktionaryz.org
Wed Dec 6 11:13:32 CET 2006
Hoi,
The OmegaT software (a CAT tool) currently uses Java to indicate
languages. As a consequence it is next to useless when languages that
are in the long tail are to be translated. According to Mark's
presentation Google only recognises some 100 languages. ISO-639-3
recognises some 7603. The Wikimedia Foundation supports 250.
When you only work inside what the Standard supports there is no
apparent problem. The problem starts when a Standard does not support a
language.
Thanks,
Gerard
Mark Davis wrote:
> > In a presentation by Google it
> > was suggested that the coding of content with language codes is so
> > unreliable that it is practically useless.
>
> I suspect that this was an impression left by my presentation at the
> Unicode conference. It is true that for web pages, the language
> tagging is pretty minimal (about 15%) and too often incorrect to be
> relied upon. However, that is far from saying that BCP 47 (RFC 4646)
> is useless. It provides a stable, unambiguous identification system
> for communicating language information between software components.
> Even with web pages, once the language of a web page is heuristically
> determined (and any existing tag can help to break ties), the language
> tag is used internally to communicate with any process that needs to
> deal with that page. And there are many other uses of language tags --
> communicating the user's choice of UI language is an obvious one.
>
> The key issue for web pages in particular is that their producers
> don't immediately see much value in accurate tagging, because the
> consequences of omission are not immediately apparent, and at this
> point at least, not that bad.
>
> Mark
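
To make the tie-breaking point above concrete, here is a minimal sketch, not a description of how Google or anyone else actually does it. The function name pick_language, the score dictionary and the margin threshold are all hypothetical; the only idea taken from the message is that a heuristic guess comes first and an existing tag is consulted only when the guess is close.

```python
def pick_language(scores, declared_tag=None, margin=0.05):
    """Pick a language tag from heuristic scores (hypothetical helper).

    scores: dict mapping language tags to detector confidence (assumed input).
    declared_tag: the tag found on the page, if any; used only to break ties.
    margin: how close two scores must be to count as a tie (assumed value).
    """
    if not scores:
        # Nothing detected: the declared tag is all we have.
        return declared_tag

    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    best_tag, best_score = ranked[0]
    if declared_tag is None:
        return best_tag

    # Candidates within `margin` of the top score are treated as tied.
    tied = [tag for tag, score in ranked if best_score - score <= margin]

    # Compare on the primary language subtag only, e.g. "gsw" in "gsw-CH".
    declared_primary = declared_tag.lower().split("-")[0]
    for tag in tied:
        if tag.lower().split("-")[0] == declared_primary:
            return tag

    # The declared tag does not match any close candidate: trust the heuristic.
    return best_tag


# A close call between German and Swiss German is settled by the page's tag;
# a clear heuristic result is not overridden by it.
close_call = pick_language({"de": 0.48, "gsw": 0.46}, declared_tag="gsw-CH")
clear_call = pick_language({"de": 0.90, "gsw": 0.30}, declared_tag="gsw")
```

In this sketch an incorrect tag can do little harm, since it is ignored whenever the heuristic result is clear, while a correct tag still helps in the ambiguous cases.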
More information about the Ietf-languages
mailing list