Unilingua

Sat Sep 17 05:21:18 CEST 2005

On 9/16/05, Tex Texin <tex at xencraft.com> wrote:
> Even with the small number of codes we have today, I have difficulty
> determining which code properly describes a document. There are no
> guidelines or rules or ways to determine whether a document is one branch of
> a language versus another, except with the crudest of guesses. Various
> experts make pronouncements about Japanese being ja and not ja-jp, or latn
> not being required for en, since en is not generally represented in another
> script, but only an expert knows all of the possibilities and which
> circumstances never (or nearly never) occur, and which ones require
> additional descriptors or not. Given that is the case, I really don't need a
> more refined set of language choices.

I don't see your point here. Whether or not Unilingua is encoded (not
that I think it should be), it neither hurts nor helps your problems.
This new tag, like Boontling, should tag a limited and rather
unambiguous set of documents. Adding new languages may increase the
number of tags in the system, but it rarely makes the system more
complex.

Your problems could appear in a system that supported just en and ja,
Latn and Hrkt, and us and jp. If you think there need to be more
standard rules about when to use script and country entities, that
could be a thread, but it would be a completely different thread from
this one.