sgn-MT [et al.]: new RFC 3066 tag[s]

Doug Ewell dewell at adelphia.net
Mon Oct 4 04:34:22 CEST 2004


Michael Everson <everson at evertype dot com> wrote:

>> Indeed, registering tags like this means that consumers are forced
>> to look in the registry, even for a tag that conforms to the normal
>> generative mechanism, because the registration may have given the
>> tag a different meaning.
>
> That is what the registry is for. I know, all this revision is
> supposed to make it unnecessary but in this case any way

These are RFC 3066 registrations, of course.  RFC 3066 talks about
registering additional "information about a tag defined by this
document" (Section 3).  RFC 3066bis removes this wording, because the
3066bis registry is for subtags, not whole tags.  You're only supposed
to have to look in the 3066bis registry to see if a whole tag is
grandfathered from 3066, and that set is not supposed to grow.

> "Sign languages as used in the United States" does not refer to a
> language. It is an error. There is no such "language" as "Sign
> Language".

There is no such language as "Australian languages" or "Tupi languages,"
but those tags are allowed by RFC 3066 too.  Even "und" and "mul" are
allowed, though discouraged.

>> Why shouldn't these be registered under RFC 3066 as:
>>
>> sgn-mdl    for Maltese Sign Language
>> sgn-tss    for Taiwanese Sign Language
>> sgn-fse    for Finnish Sign Language
>> sgn-xml    for Malaysian Sign Language
>
> Because that doesn't make sense to the existing user community who
> has been using, and wishes to continue to use, the scheme as
> developed.

I wasn't aware that there was a sizable user community using RFC 3066 to
tag sign-language data for computer applications.  And I thought this
sgn-* extension scheme was developed by Michael, not an existing user
community.  My errors.

>> Such constructions are at least as much in keeping with the RFC 3066
>> registration procedure as the language-plus-country "informative"
>> registrations, possibly more so.  There is already a precedent for
>> registering tags composed of an ISO 639 code plus an invented
>> 3-letter subtag: no-bok, no-nyn, zh-gan, zh-min, zh-wuu, and zh-yue.
>
> "xml" isn't very informative. ;-P

I smiled too when I saw that "xml" was a proposed ISO 639-3 code.  But I
thought everyone had been saying that mnemonicity of subtags was
explicitly not a goal.  That's why we're OK with using "bla" for Siksika
and "mus" for Creek as it is.

>> The great benefit of this approach would be that these subtags (mdl,
>> tss, fse, xml) are already the codes listed in the ISO 639-3 draft
>> for these sign languages...
>
> Benefit to whom?

To implementors of RFC 3066bis and RFC 3066ter.  This is, after all, a
tagging standard for computerized applications.  Is it really now a
significant goal for the tags to be human-friendly?

>> I strongly support using "sgn-" together with the proposed ISO 639-3
>> codes to create any new sign-language tags.  I volunteer to help put
>> together the registration forms, if desired.
>
> Then you would want to deprecate "sgn-IE" in favour of "sgn-isg", and
> I guess you will later want to deprecate that in favour of "isg"?

The first, yes, I guess so.  I still don't know how one "deprecates" a
grandfathered tag.  I don't see any mechanism for doing so.

The second, I have no idea, because that depends on how Addison and Mark
intend for extended-language subtags to work.  I know the main idea is
to use them for individual languages that are part of what ISO 639-3
calls a "macrolanguage."  Sign languages are part of a "collective
language" group, which is different.  I don't know whether the intent
was to encode Dominican Sign Language as "sgn-doq" or as simply "doq"; I
assumed it was the former.

But if "isg" is to be an ISO standard code for Irish Sign Language, is
it really such a bad thing for it to be the standard language tag as
well?

> And what about Signed Spoken Languages? Those also MUST have country
> codes. "sgn-eng-IE" is different from "sgn-eng-US" and *very*
> different from "sgn-eng-GB". All of those are representations of
> spoken English.

I think there should be an RFC 3066bis variant subtag, "-signed", to
cover this case.  Michael and others have already stipulated that signed
spoken languages are more like signed manifestations of a spoken
language than like true sign languages.  Signed U.S. English would be
"en-US-signed".  That should be mnemonic enough.  Or is the user
community already using "sgn-eng-US"?

> I do not relish explaining the proposed changes to the user
> community, which helped develop this scheme. I would very much like
> the scheme there to be adopted as a special case for "sgn" in RFC
> 3066bis.

I see from "peeking ahead" on the Web-based archive (I'm a digest
subscriber) that Addison has pretty much agreed to support this
exception, so I won't argue further against it.  But it needs to be a
completely separate syntax, like the "privateuse" and "grandfathered"
productions in the ABNF.  It will not fit into the general RFC 3066bis
syntax.

Addison wrote:

> Now: the question is what form the exception takes in 3066bis. If we
> define the sgn-* tags to be a special case, do we require that they be
> registered as whole tags before use? Or do we allow the generative
> mechanism to define the tags and permit special registrations of an
> informative nature? IOW, is sgn-AQ a "valid" tag?

These absolutely have to be registered as whole tags.  If they could be
defined using the generative mechanism, we would just use the existing
generative mechanism and be through with it.  "sgn-AQ" could not be a
valid tag unless there were an identified "Antarctican Sign Language"
(which might come in handy when communicating at -40° through 80 mph
winds).

"sgn-US" is different from "sgn-US-MA" -- dramatically different, in
fact -- so it cannot be permissible to take the concept of
language-range as applied to normal tags and apply it to sign-language
tags.  You cannot add variant subtags, like "sgn-US-boont" or
"sgn-GB-scouse", since those would be separate languages.  And you
certainly could not have "sgn-IE-Latg" or "sgn-GB-oed" or "sgn-DE-1996".
Again, this exception needs to be treated just like the grandfathered
tags, except that the set is not closed.
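
To make the language-range concern concrete, here is a minimal Python
sketch of the prefix-matching rule from RFC 3066, Section 2.5: a range
matches a tag if the two are equal, or if the range is a prefix of the
tag and the next character is "-".  The function name range_matches is
invented for this illustration and is not taken from any library.

```python
def range_matches(language_range: str, tag: str) -> bool:
    """Prefix matching in the style of RFC 3066, Section 2.5.

    Illustrative helper only; the name is an assumption of this sketch.
    """
    r = language_range.lower()
    t = tag.lower()
    if r == "*":
        # The special range "*" matches any tag.
        return True
    # Exact match, or the range is a prefix followed immediately by "-".
    return t == r or t.startswith(r + "-")


# For ordinary tags the rule behaves as intended:
print(range_matches("en", "en-US"))

# Applied to sign-language tags, the same rule reports that the range
# "sgn-US" matches "sgn-US-MA", even though the two tags name
# dramatically different languages.
print(range_matches("sgn-US", "sgn-US-MA"))
```

Under this rule a request for "sgn-US" would be satisfied by content
tagged "sgn-US-MA", which is exactly the result that should not be
allowed for these tags.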

The ABNF might be modified to include something like the following:

Language-Tag = ...
                / signlanguage      ; sign-language registrations
...
signlanguage    = "sgn-" (2ALPHA *("-" (1*3alphanum)))

thus reflecting the scheme adopted by the user community.  The
individual subtags should not be assigned specific meanings by the RFC;
that is up to each specific registration.  (They don't always have the
same meaning in the proposed tags.)  The use of 2ALPHA here assumes that
this syntax would only be used for true sign languages, while the
"-signed" variant subtag (not the "sgn-eng-US" syntax) would be used for
signed spoken languages.
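
As a rough check on the production above, it can be transcribed into a
Python regular expression.  This is only a sketch: it assumes "alphanum"
means ASCII letters and digits, treats the tag as case-insensitive, and
uses a made-up REGISTERED set to stand in for the registry.  The helper
names are likewise invented for illustration.

```python
import re

# "sgn-" (2ALPHA *("-" (1*3alphanum))), transcribed as a regex.
# Assumption: alphanum = ASCII letters and digits.
SIGNLANGUAGE_RE = re.compile(
    r"sgn-[A-Za-z]{2}(?:-[A-Za-z0-9]{1,3})*", re.IGNORECASE
)

# Hypothetical registry contents, for illustration only.
REGISTERED = {"sgn-us", "sgn-us-ma", "sgn-ie", "sgn-mt"}


def is_wellformed_sign_tag(tag: str) -> bool:
    """True if the whole tag fits the proposed signlanguage production."""
    return SIGNLANGUAGE_RE.fullmatch(tag) is not None


def is_valid_sign_tag(tag: str) -> bool:
    """Well-formed AND registered as a whole tag."""
    return is_wellformed_sign_tag(tag) and tag.lower() in REGISTERED


for tag in ["sgn-US", "sgn-US-MA", "sgn-AQ", "sgn-GB-oed",
            "sgn-IE-Latg", "sgn-DE-1996", "sgn-eng-US"]:
    print(tag, is_wellformed_sign_tag(tag), is_valid_sign_tag(tag))
```

Note that "sgn-AQ" and "sgn-GB-oed" fit the syntax but fail the registry
lookup, which is the point of requiring whole-tag registration: the
production alone cannot tell a real sign-language tag from a
well-formed but meaningless one.  "sgn-IE-Latg" and "sgn-DE-1996" are
rejected by the syntax itself because their last subtag is longer than
three characters, and "sgn-eng-US" is rejected by the 2ALPHA
restriction.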

How's that?

-Doug Ewell
 Fullerton, California
 http://users.adelphia.net/~dewell/



