Adding variant subtags 'aluku' and 'nduyka' and 'pamaka' for dialects

Fri Aug 21 04:02:53 CEST 2009

Peter Constable <petercon at microsoft dot com> wrote:

> Do we know all the ways that the LSTR will get used?

Well, of course we don't.  I didn't know, for example, that the region 
subtags were expected to be used as a general reference to ISO 3166 
country codes, such that we would have to add 'EU' and the other 
exceptionally reserved codes just to support that use case.

> Will records from the registry, but not the corresponding details in 
> registration forms, get presented to users in some context where they 
> are left to interpret the record without the benefit of the forms? If 
> so, would potential risks warrant a small mitigating addition to the 
> comment field, or are the risks negligible?

I think maybe I didn't communicate clearly why I don't think this is a 
problem that needs to be solved.

There are NO subtags -- or grandfathered or redundant whole tags, for 
that matter -- anywhere in the Registry that represent two or more 
different entities.  I can't think why a user would assume, in the 
absence of direction, that variants should be the lone exception in this 
regard.  Even variants that have multiple Prefix fields (like 
'baku1926') or none (like 'fonipa') refer to the same basic concept, 
merely applied to different base languages.

>>> So, I think there's more potential for misinterpretation of the 
>>> intent for variants than for language, region or script IDs.
>>
>> I think this distinction is important and/or interesting to those of 
>> us on this list and LTRU, but I'm not sure the average user cares.
>
> Well, as well, there are a lot of average users who wouldn't care 
> enough to understand if the subtag was meant to be used for one thing 
> or two.

See above.

>> Type: language
>> Subtag: nl
>> Description: Dutch
>> Description: Flemish
>> Added: 2005-10-16
>> Suppress-Script: Latn
>> Comments: Dutch and Flemish are alternate names
>
> Perhaps a reasonable comparison, though there are differences in the 
> scenarios: not-well-known dialect names for a not-well-known language 
> versus well-known names for a well-known language within a 
> sociolinguistic milieu that has various complexities.

OK, if we want some not-well-known examples I can easily pull out 'bfe' 
(Betaf, Tena) or 'mhe' (Besisi, Mah Meri) or 'tbp' (Taworta, Diebroud) 
or 'Tglg' (Tagalog, Baybayin, Alibata).
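
A minimal sketch of the same point, assuming a simplified "Field: value" 
record with no folded continuation lines.  parse_record is a hypothetical 
helper for illustration, not part of any official Registry tooling.  It 
reads the 'nl' record quoted above and shows that the two Description 
fields hang off a single subtag:

```python
# Hypothetical helper for illustration only; real Registry records may
# fold long field bodies across lines, which this sketch does not handle.
def parse_record(text):
    """Parse one 'Field: value' record into {field: [values]}."""
    record = {}
    for line in text.strip().splitlines():
        # Split on the first colon only, so values may contain colons.
        name, _, value = line.partition(":")
        record.setdefault(name.strip(), []).append(value.strip())
    return record


# The 'nl' record as quoted earlier in this message.
NL_RECORD = """\
Type: language
Subtag: nl
Description: Dutch
Description: Flemish
Added: 2005-10-16
Suppress-Script: Latn
Comments: Dutch and Flemish are alternate names
"""

record = parse_record(NL_RECORD)
print(record["Subtag"])       # the single subtag in the record
print(record["Description"])  # every name listed for that one subtag
```

The record still carries exactly one Subtag value however many 
Description fields it has, which is the sense in which the names are 
alternates for one entity rather than separate entities.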

I can agree that the Spanish/Castilian identity is well-known to a large 
percentage of people who would use the Registry, and probably the 
Dutch/Flemish identity as well (though I know there are some who claim 
it's not as simple as that), while the Aluku/Boni identity is not at all 
well-known.  But I'm not at all sure where we would draw the line 
between those extremes.

If others agree that there is a genuine risk of Registry users thinking 
that a single variant can refer to two different language variations, 
I'll back off.

--
Doug Ewell  *  Thornton, Colorado, USA  *  RFC 4645  *  UTN #14
http://www.ewellic.org
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages