The limit of language codes
Gerard Meijssen
gerardm at wiktionaryz.org
Thu Feb 15 21:17:10 CET 2007
Hoi,
The modern languages have a big advantage: they are the languages that
are spoken today. It is therefore relatively easy to treat them with
bold strokes. When you start drilling down, you find linguistic
entities that are considered "dialects", and you find different
orthographies. All well and good.
For me, languages are living things, and new words make their appearance
continuously. This happens completely apart from how we would like to
mark language usage using meta tags. As words make their appearance,
they do differentiate the language. When we want to mark them with
metadata, the language would still be "Dutch", i.e. nl, but the metadata
for a movie or a documentary would still need to include the moment when
that particular recording was created.
In OmegaWiki, we need to tag linguistic entities. For English, we have
decided that when a word is spelled the same in contemporary en-GB and
en-US, we record it only as en. This is satisfactory for us. Many words
are in use for only a limited time; who still thinks and talks of the
Internet as the "digital super highway" nowadays? For a dictionary, you
identify the dates when they made their appearance and when they were
last seen. When you want to divide a language into time slots, it is
really arbitrary where you draw the lines. Italian is a constructed
language; this is also true for German. Orthographies are a relatively
recent invention, and consequently it is not really feasible to create
spell checkers for material from before a certain age, an age that
differs per language...
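The English rule described above (record a spelling once as en when it
is shared, otherwise under the regional tag) could be sketched roughly
as follows. This is only an illustration, not OmegaWiki's actual code:
the function name, the lexicon layout and the sample words are all
assumptions, and the regional tags are written with the registered
subtags en-GB and en-US.

```python
# Hypothetical sketch; names and data are illustrative assumptions,
# not taken from OmegaWiki.
def tags_for_spelling(spelling, by_variety, primary="en"):
    """Return the tags under which a spelling would be recorded.

    by_variety maps a regional tag (e.g. "en-GB") to the set of
    spellings used in that variety. All varieties are assumed to
    share the same primary language subtag.
    """
    used_in = sorted(tag for tag, words in by_variety.items()
                     if spelling in words)
    # Spelled the same in every variety: record it once, as "en".
    if used_in and len(used_in) == len(by_variety):
        return [primary]
    # Otherwise keep the regional tag(s) where the spelling occurs.
    return used_in


lexicon = {
    "en-GB": {"colour", "water"},
    "en-US": {"color", "water"},
}

print(tags_for_spelling("water", lexicon))
print(tags_for_spelling("colour", lexicon))
```

With this sample data the shared spelling is recorded under the bare
language tag, while the variant spellings keep their regional tag.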
The notion of having tags for historical languages makes sense when
these languages are dead. Tagging any other way is at best imprecise. So
please do create a gazillion new tags for historical "languages"; I am
not sure that they are worth the paper they are written on. I am also
afraid that they detract from what we have to achieve first: the correct
tagging of contemporary material. With only 15% of the material on the
Internet tagged, there is plenty of convincing that we need to do.
Convincing that using our tags /is/ relevant.
Thanks,
Gerard