proposed ISO standard for language variations
dzo at bisharat.net
Tue May 10 17:32:45 CEST 2016
Thanks for sharing this, Peter.
At first glance this brings to mind the old ISO 639-6, although there
are differences. Any connections between the two we should know about?
Per Yury's comment on 'i+1-th level' subtags, I would add that it does
seem that the project (in the grand sense) of standardization as regards
languages focuses more on distinctions within languages (in the case of
the L2/16-131 proposal, "down to the language variety of an individual
speaker"), the broad utility of which is hard to see, outside perhaps of
description. Not to say such distinctions are not useful - they can be
of course, but where the need arises. So I tend to agree with Yury's
What is missing I think is systematic attention to the '(i-1)th level' -
or perhaps '(i-0.5)th level' - where mutual intelligibility, common
phonetics, similar structure, and shared vocabulary may, and in many
cases do, make linguistic boundaries fade. It is at this level
that communication happens but mostly outside the description of the
coding system. Yes, macrolanguages are in this space, but my
understanding of that category is that it was forced by the need to
accommodate certain established ISO 639-1/2 categories broader than what
were identified for ISO 639-3, and as such was never extended to other
logical candidates. An example of the latter is Kinyarwanda and Kirundi,
which are close enough that I am told that speakers of one understand
the other, and that a recent job announcement called for a translator of
"Kinyarwanda or Kirundi" (implying a functional equivalence of the two
for whatever their needs were).
Such "neighbor languages," to use the term sometimes applied to
Scandinavian languages, may represent a particular class of
opportunities for language technology, speech recognition, and
localization. Would it help to expand the subtag system such as
proposed in L2/16-131 to somehow account for these, i.e., to record that
such-and-such language is not an isolate, but rather interintelligible
(more or less fully, or partially) with a select set of other languages?
Or alternatively, would addition of macrolanguages to ISO 639, and
policies to support that process, be the road to take?
To return to the Kinyarwanda/Kirundi example, would the work of
L2/16-131 be facilitated in such cases by being able to scope out to a
level combining two (or more) separately encoded but very close
languages before identifying 'i+1-th level' linguistic varieties to
describe with subtags?
Finally, a note on "down to the language variety of an individual
speaker." I've long thought that a "pointillist" model might be
interesting for describing some kinds of social dynamics, and this
certainly would include language. Not practical for the topic addressed
by this list, but possibly useful for linguistic analysis, if one could
have data that detailed...
On 5/10/2016 7:37 AM, Yury Tarasievich wrote:
> On 09/05/16 21:13, Peter Constable wrote:
>> I believe they want to have a clear model for creating metadata
>> elements or identifiers for all kinds of language variations. Compare
>> that to our current use of variant subtags, which conflates any kinds
>> of distinction _other than_ script or national/super-national
>> regions. If we had requests for hundreds of variant subtags with many
>> having overlapping semantics, we'd have a bit of a mess to sort through.
> The concept itself -- a set of generic 'i+1-th level' subtags
> applicable to any 'i-th level' subtag -- is appealing.
> However, the implementation would inevitably produce a plethora of
> obscure and hardly usable elements, and the application would hit the
> obstacle of 'pure' types almost never existing.
>> Without speaking for or against the proposed model, I have had
>> numerous linguistics textbooks that did not present anything like this.
> I was thinking about 'any' university-level textbook of 'traditional
> Russian school' of linguistics.