Adding variant subtag 'erzgeb' for Erzgebirgisch
doug at ewellic.org
Tue Aug 11 04:22:00 CEST 2009
"Phillips, Addison" <addison at amazon dot com> wrote:
>> It would not be appropriate to give two prefixes to a variant meant
>> to denote only one dialect, merely to point out that we disagree
>> about what the base language is, or to try to provide perfect
>> linguistic derivations. ... We must pick one.
> This is not true. What the rules prohibit are registrations that
> completely alter the validity of tags or which narrow the range of
> acceptable tags. There is no language in 4646 or in 4646bis requiring
> that we must pick only one.
There's nothing that requires it syntactically. But to add both "sxu"
and "vmf" as Prefix fields, either now or later, would imply that there are:
(a) an Upper Saxon dialect of Erzgebirgisch ("sxu-erzgeb") and
(b) a Mainfränkisch dialect of Erzgebirgisch ("vmf-erzgeb")
which are close enough that the same meaning of "Erzgebirgisch" applies
to both (as opposed to, say, "western"), but not so close that they are
actually the same dialect. Randy kind of alluded to this.
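To make the point concrete, here is a minimal sketch of how a validator might check a variant against its registered Prefix fields. The record contents below are assumptions for illustration only: the 'erzgeb' entry shows the hypothetical two-prefix registration under discussion, the 'baku1926' entry lists only the prefixes mentioned in this thread, and the function name is made up.

```python
# Hypothetical, hand-written registry excerpt. Not the real registry contents.
VARIANT_PREFIXES = {
    "erzgeb": ["sxu", "vmf"],        # the two-prefix registration being debated
    "baku1926": ["az", "kk", "tt"],  # only the prefixes named in this thread
}


def matches_registered_prefix(tag: str) -> bool:
    """Return True if the tag's final subtag is a known variant and the
    subtags before it start with one of that variant's Prefix values."""
    subtags = tag.lower().split("-")
    variant = subtags[-1]
    prefixes = VARIANT_PREFIXES.get(variant)
    if prefixes is None:
        return False
    before = "-".join(subtags[:-1])
    return any(before == p or before.startswith(p + "-") for p in prefixes)


print(matches_registered_prefix("sxu-erzgeb"))  # passes the prefix check
print(matches_registered_prefix("vmf-erzgeb"))  # also passes
print(matches_registered_prefix("ja-erzgeb"))   # does not
```

With two Prefix fields, both "sxu-erzgeb" and "vmf-erzgeb" pass this check, yet they remain two distinct tags; the check says nothing about whether they denote the same dialect.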
> Doug, you're correct that all of the current variant subtags that have
> multiple prefixes have a diverse set of prefixes because they denote
> things such as transcription. I think in the 4646bis era (in which we
> have some very fine grained primary language subtags, courtesy of
> 639-3), that we must be prepared for the possibility of registrations
> like this, though, in which different prefixes can be reasonably
> applied to a single variant.
I didn't specifically call out transcriptions, although there probably
aren't many other good examples of language variations that span across
languages but have essentially the same meaning in all of them. (Note
that 'baku1926' was intended as an everyday orthography, not really a
transcription.)
The tags "az-baku1926" and "kk-baku1926" and "tt-baku1926" all refer to
different languages, but the variant subtag 'baku1926' refers to the
same variation in all of them. Same goes for "en-fonipa" and
"fr-fonipa" and "tlh-fonipa". I don't think the same can be said for a
variant subtag that represents a dialect, in the sense I understand the
word, no matter how finely grained the 639-3 code elements are.
Hypothetically only, I suppose a variant like 'babytalk' could be
registered that would have the same meaning across different languages,
but not refer to a writing system. But even then, "sxu-babytalk" and
"vmf-babytalk" would not quite be identical, unless of course the
premise is that 'sxu' and 'vmf' themselves are identical -- in which
case we should be asking ISO 639-3 to merge the two, not talking about
>> By "change the tag" I assume you mean deprecate or remove the Prefix
>> field "sxu" and add the Prefix field "vmf". We cannot remove a
>> Prefix in this way, as it would destabilize the interpretation of
>> content already tagged "sxu-erzgeb". We can only broaden the
>> existing scope of what the Prefix field(s) denote(s) for any given
> This is entirely correct. You could add the prefix 'vmf', however, at
> a later date. You can also add a comment to the record recommending
> the use of 'vmf' over 'sxu' for this dialect. However, a prefix can
> never be wholly removed, as Doug notes.
But even then, "sxu-erzgeb" could not be made canonically equivalent to
"vmf-erzgeb" -- we can't add a Preferred-Value for a Prefix that points
to another Prefix -- so one would not match the other according to the
normal matching rules. It would require a specialized, non-BCP 47
matching engine. Failing that, Erzgebirgisch content tagged as
"sxu-erzgeb" would not be noticed by matching engines looking for
"vmf-erzgeb", and vice versa.
Language tags and subtags need to be linguistically defensible (so
"ja-erzgeb" would be a supremely bad idea), but ultimately their purpose
is identification of content, not scholarship. Any solution involving
two or more alternative tags for the same dialect will dilute the value
of all of them. As always, I suggest we pick one and be done with it.
Doug Ewell * Thornton, Colorado, USA * RFC 4645 * UTN #14