Variants of Japanese (was: Re: Unilingua)

Doug Ewell dewell at
Mon Sep 19 01:22:22 CEST 2005

Tex Texin <tex at xencraft dot com> wrote:

> We will not have reliably precise tags in a system where you use one
> label while you are ignorant of variations and then once you become
> aware you use a more precise tag.

But this, as you say, is not only a matter for experts, but a matter on
which even the top experts will disagree and may change their minds over
time.

> Laypeople will not know of (nor research) many variants. Members of
> this list will do a better job, but may still be unaware of variants
> for some languages. You really need an expert to say with confidence
> there is only one language without variation in existence.

I see this as an argument for sophisticated fallback matching, but not
as an argument against tagging of regional (or other) variants.  The
alternative is to disallow tagging of variants of all types, and simply
use ISO 639-*, and I doubt that is what you are proposing.
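
To make "fallback matching" concrete, here is a minimal sketch of the
simplest form I have in mind: strip subtags from the right until
something matches.  The function names are made up for illustration,
and this is plain truncation only, not a full implementation of any
particular matching specification.

```python
def fallback_chain(tag):
    """Return the tag followed by progressively shorter prefixes.

    Example: "fr-CA" yields ["fr-CA", "fr"].
    """
    subtags = tag.split("-")
    return ["-".join(subtags[:i]) for i in range(len(subtags), 0, -1)]


def lookup(requested, available):
    """Return the first available tag on the fallback chain, or None.

    Hypothetical helper: compares case-insensitively and returns the
    tag as it is spelled in `available`.
    """
    by_lower = {t.lower(): t for t in available}
    for candidate in fallback_chain(requested.lower()):
        if candidate in by_lower:
            return by_lower[candidate]
    return None
```

With this, a request for fr-CA against content tagged only "fr" still
finds the "fr" content, so a more precise request costs nothing when
the content is tagged less precisely.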

> A lot of people don't realize Canadian French is different from
> France's French.
> Or for that matter that Canadian English is different from American
> and British English.

And in some cases, perhaps even most, the difference doesn't matter.  Is
this message written in en-US or en-CA?  Which words or usage indicate
one or the other?  Is there anything specific to "Southern California
usage" in this message, and if so, would it be appropriate to tag this
message as en-socal if such a tag existed?

> It makes more sense to me to recommend for tagging purposes that
> people be consistent and use region always, to reflect as closely as
> possible the author's language and/or the intended audience, and for
> matching purposes to be as least restrictive as needed. So tag ja-JP,
> but match on ja (or "ja-JP, ja").

Exactly.  If there is a problem with correct but inconsistent tagging,
solve the problem at the matching end, not by making everyone use the
exact same tags -- 'cause that ain't gonna happen.
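
As a rough sketch of what "tag ja-JP, but match on ja" could look like
at the matching end, assuming a simple prefix match on whole subtags
(the names `matches` and `select` are hypothetical, not from any
library):

```python
def matches(language_range, tag):
    """True if `tag` equals the range or extends it by whole subtags.

    "ja" matches "ja" and "ja-JP"; "ja-JP" does not match plain "ja".
    """
    r = language_range.lower()
    t = tag.lower()
    return t == r or t.startswith(r + "-")


def select(ranges, tagged_items):
    """Keep the items whose tag matches at least one requested range.

    `tagged_items` is a list of (tag, item) pairs.
    """
    return [
        item
        for tag, item in tagged_items
        if any(matches(r, tag) for r in ranges)
    ]
```

A request list of "ja-JP, ja" then picks up documents tagged either
ja or ja-JP, so authors can tag as precisely as they know how without
breaking anyone's search.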

Doug Ewell
Fullerton, California

