Last call: Latvian (and Bontok) extlang subtags
Leif Halvard Silli
xn--mlform-iua at xn--mlform-iua.no
Tue Feb 9 12:53:35 CET 2010
From 'extlang on some macrolanguage subtags' to 'changing the scope of
subtags into macrolanguage' ... and back again.
I agree: banning the last option would be simpler _for this group_ and
simpler for _many_ taggers.
*But* it would not be simpler for _all_ taggers. I am also not sure that
it would inspire more tagging.
For instance, I have heard from Doug the argument that one may use
'no' when a text is known to be Norwegian but the tagger does not know
whether it is 'nn' or 'nb'. If, as a merely hypothetical example, 'no'
did not exist, then we could get lots of wrong tagging.
I don't know anything about Bontoc, except where it - or its
encompassed languages - is spoken (and in theory, a tagger might not
know more than that either ...). But the result of - hypothetically -
not having a macrolanguage subtag could be more errors in the
tagging - regardless of whether that subtag _today_ (the existing usage
argument) is used for just one, or for several, of the encompassed
languages.
What _also_ seems important is how the speakers of - in this case - the
Bontoc languages perceive 'Bontoc'. Do they - at least in some
contexts - perceive it as a name of all the encompassed languages -
even if they surely are able to distinguish between them? If they do,
then even some native speakers could tend to use 'bnc' for any of the
encompassed languages. E.g. the native speakers may not like to tag
their language as something that can be perceived as "not-bnc".
Back to the issue of extlang subtags: On another list, I have
documented that, even for the old tags 'no', 'nn' and 'nb', taggers
find it natural to write 'no-nn' and 'no-nb'. (To anyone in fear:
No, I don't propose that this should be legal.)
I am therefore of the view that, unless it can be documented that it
would be counterproductive, extlang status should in these cases
always be granted to the encompassed languages. And I do not count
"implementations" (as in "operating systems" and "computer
applications") as examples of where it could be counterproductive, as
those who oppose macrolanguage scope and extlang status on behalf of
such implementations seem to be against it a priori.
Leif Halvard Silli
"Martin J. Dürst", Tue, 09 Feb 2010 14:31:20 +0900:
> Very good to hear. Would definitely make life quite a bit simpler for us
> and for many taggers.
> Regards, Martin.
> On 2010/02/09 10:58, Peter Constable wrote:
>> On a related note, I'm preparing a doc from the Unicode Consortium
>> to TC 37 covering a couple of areas of concern, one of them being
>> changing the scope of existing coded language entities to
>> macrolanguage: we are proposing a principle that this not be done in
>> the future. The basic rationale is as follows:
>> A) it creates representation issues in that (i) we end up with
>> multiple representations (e.g., Latgalian denoted by either "ltg" or
>> "lav"), and (ii) we end up with ambiguous categories (e.g., "bnc"
>> can be used for five distinct things); and
>> B) On the one hand, in a case like Bontoc (at the
>> less-well-documented / less-developed end of the spectrum) there is
>> little previous usage of the candidate ID, so a lower cost to simply
>> adding the new entities with no macrolanguage mappings and simply
>> recommending people use those; and on the other hand, in a case like
>> "Latvian" (at the well-established end of the spectrum) in which
>> there is a lot of existing usage, the vast majority of which will be
>> for the "standard" variety, the change has too much associated risk
>> for widespread implementations which now have to deal with the
>> issues mentioned in (A).