The limit of language codes

Fri Feb 16 15:31:45 CET 2007

On 2/15/07, Gerard Meijssen <gerardm at wiktionaryz.org> wrote:
> When you want to divide a
> language in time slots, it is really arbitrary where you create the
> lines.

Just like dialects.

> The notion of having tags for historical languages makes sense when
> these languages are dead. Tagging any other way is at best imprecise.

I don't see why drawing a line between Old English and Middle English
would be any more or any less complex if English were dead. The same
goes for Middle English and Modern English, since all the complexity
lies in a relatively small set of documents from around 1500.

> I am also
> afraid that they detract from what we have to achieve first: the correct
> tagging of contemporary material. With only 15% of the material on the
> Internet tagged, there is plenty of convincing that we need to do.
> Convincing that using our tags /is/ relevant.

What we have to do first? This is not a missionary group. My primary
goal is to create a set of language tags usable for Project Gutenberg;
having them supported by XML and by other people is a great help
with that. A large percentage of the books I personally do for
Project Gutenberg date from before 1700, whether in modern editions or
original facsimiles. That's what I'm most concerned about tagging. The
use of these tags by other organizations and standards using language
tagging is very convenient, because having one standard makes things
easier for me. As to whether webpages are tagged, that's Google's
problem; I couldn't care less. I'm not here to achieve that, and I
suspect many others aren't either.