Last call: Latvian (and Bontok) extlang subtags

Leif Halvard Silli xn--mlform-iua at xn--mlform-iua.no
Tue Feb 9 12:53:35 CET 2010


From 'extlang on some macrolanguage subtags' to 'changing the scope of 
subtags into macrolanguage' ... and back again.

I agree: banning the last option would be simpler _for this group_ and 
simpler for _many_ taggers. 

*But* it would not be simpler for _all_ taggers. I am also not sure 
whether it would inspire more tagging.

For instance, I have heard from Doug the argument that one may use 
'no' when it is known to be Norwegian but not known to the tagger 
whether it is 'nn' or 'nb'. If, as a merely hypothetical example, 'no' 
did not exist, then we could get lots of wrong tagging.

I don't know anything about Bontoc, except where it - or its 
encompassed languages - is spoken (and in theory, a tagger might not 
know more than that either ...) But the result of - hypothetically - 
not having a macrolanguage subtag could become more errors in the 
tagging - regardless of whether that subtag _today_ (the existing usage 
argument) is used for just one - or for several - of the encompassed 
languages.

What _also_ seems important is how the speakers of - in this case - the 
Bontoc languages perceive 'Bontoc'. Do they - at least in some 
contexts - perceive it as a name for all the encompassed languages - 
even if they surely are able to distinguish between them? If they do, 
then even some native speakers could tend to use 'bnc' for any of the 
encompassed languages. E.g. the native speakers may not like to tag 
their language as something that can be perceived as "not-bnc".

Back to the issue of extlang subtags: On another list, I have 
documented that, even for the old tags 'no', 'nn' and 'nb', taggers 
find it natural to write 'no-nn' and 'no-nb'. (To anyone in fear: 
No, I don't propose that this should be legal.)

I am therefore of the view that, unless it can be documented that it 
would be counterproductive, extlang status should in these cases 
always be granted to the encompassed languages. And I do not count 
"implementations" (as in "operating systems" and "computer 
applications") as examples of where it could be counterproductive, as 
those who oppose macrolanguage scope and extlang status on behalf of 
such implementations seem to be a priori against it.

Leif Halvard Silli

"Martin J. Dürst", Tue, 09 Feb 2010 14:31:20 +0900:
> Very good to hear. Would definitely make life quite a bit simpler for us 
> and for many taggers.
> 
> Regards,   Martin.
> 
> On 2010/02/09 10:58, Peter Constable wrote:

>> On a related note, I'm preparing a doc from the Unicode Consortium 
>> to TC 37 covering a couple of areas of concern, one of them being 
>> changing the scope of existing coded language entities to 
>> macrolanguage: we are proposing the principle that this not be done in 
>> the future. The basic rationale is as follows:
>> 
>> A) it creates representation issues in that (i) we end up with 
>> multiple representations (e.g., Latgalian denoted by either "ltg" or 
>> "lav"), and (ii) we end up with ambiguous categories (e.g., "bnc" 
>> can be used for five distinct things); and
>> 
>> B) On the one hand, in a case like Bontoc (at the 
>> less-well-documented / less-developed end of the spectrum) there is 
>> little previous usage of the candidate ID, so a lower cost to simply 
>> adding the new entities with no macrolanguage mappings and simply 
>> recommending people use those; and on the other hand, in a case like 
>> "Latvian" (at the well-established end of the spectrum) in which 
>> there is a lot of existing usage, the vast majority of which will be 
>> for the "standard" variety, the change has too much associated risk 
>> for widespread implementations which now have to deal with the 
>> issues mentioned in (A).

