[Ltru] How to handle macrolanguage when no code?

Sun Apr 12 18:01:14 CEST 2009

Thanks to all who replied on this question with suggestions, additional questions, and pointers.

I will try (to find the time) to get an answer from the BBC on their approach and intentions, and also to get some feedback from someone familiar with Kinyarwanda and Kirundi. I think this sort of situation could arise with a number of languages (per some past threads), and in such cases the idea of a clear-cut single language definition and/or audience for page content may not hold. More information on such situations will certainly become available as more web content in diverse languages is created.

As for requesting macrolanguage codes, that is another level, but obviously one to keep in mind. I think it is viable in many circumstances, but in others it may be difficult to make the case. The ad hoc way that ISO 639 evolved, however, means that there are similar cases of related tongues that are sometimes given a common code (interpreted after the fact as a macrolanguage) and sometimes not. I think that developments such as more web content in diverse languages, and efforts such as the locales sub-project of ANLoc (African Network for Localisation), have the potential to highlight such issues.

Thanks again and all the best.

Don

From: Peter Constable [mailto:petercon at microsoft.com] 
Sent: Wednesday, April 08, 2009 11:12 PM
To: Phillips, Addison; Don Osborn; 'LTRU Working Group'; 'IETF Languages Discussion'
Subject: RE: [Ltru] How to handle macrolanguage when no code?

If it is content in one linguistic variety and crafted to serve two audiences deemed in 639-3 to be distinct languages, then that strikes me as a potential macrolanguage scenario. 

One key question is how narrow a scope of content is needed and how much deliberate effort is needed to craft something like that. For instance, a document consisting of “Papa!” can serve many different audiences, but that is solely because the scope of content is so constrained, and for that reason the bar is not met for a macrolanguage. But if it’s easy for a content provider to come up with content that serves both, then that’s interesting.

Another key question is why that content is functional for both audiences. Is it because it is expressed in a variety that can really be considered common, or is it because it’s actually in language A and 90% of speakers of language B are functionally bilingual in A? Does the common-identity label reflect actual linguistic commonality, or is it a logistic tool used in the repository merely to reflect a dual tasking?

Some thoughts. Discuss it with the 639-3 RA.

Peter

From: ltru-bounces at ietf.org [mailto:ltru-bounces at ietf.org] On Behalf Of Phillips, Addison
Sent: Wednesday, April 08, 2009 5:53 PM
To: Don Osborn; 'LTRU Working Group'; 'IETF Languages Discussion'
Subject: Re: [Ltru] How to handle macrolanguage when no code?

HTML certainly allows you to declare that some content is applicable to more than one language audience. See:

   http://www.w3.org/TR/i18n-html-tech-lang/#ri20040728.121358444
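
As a rough illustration of the distinction drawn there (a sketch, not anything the BBC or the W3C note prescribes): the lang attribute names the single language the text is processed as, while the HTTP Content-Language header can carry a comma-separated list of audience languages. The helper name audience_metadata and the dictionary keys below are hypothetical, chosen only for this example.

```python
from typing import Dict, List


def audience_metadata(primary: str, audience: List[str]) -> Dict[str, str]:
    """Build the two declarations for a page aimed at several audiences.

    primary  -- the one language tag the text is actually written in
    audience -- every language community the page is meant to serve
    (Hypothetical helper; the names are assumptions for illustration.)
    """
    # Put the primary tag first, then the rest, dropping duplicates
    # while keeping the order in which tags were given.
    ordered: List[str] = []
    for tag in [primary] + audience:
        if tag not in ordered:
            ordered.append(tag)
    return {
        # The lang attribute takes exactly one language tag.
        "html_lang": 'lang="{}"'.format(primary),
        # Content-Language accepts a comma-separated list of tags.
        "http_header": "Content-Language: " + ", ".join(ordered),
    }


if __name__ == "__main__":
    for line in audience_metadata("rw", ["rw", "rn"]).values():
        print(line)
```

For the Great Lakes pages this would keep lang="rw" on the markup and add "Content-Language: rw, rn" as audience metadata. Whether Kirundi readers are really served by Kinyarwanda text is the linguistic question raised elsewhere in this thread, and the sketch does not settle it.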

Otherwise, John Cowan’s advice seems appropriate… ISO 639-3 or ISO 639-5 would be your next stop. Note that macrolanguages are sometimes problematical, so you might also consider a collection code instead.

Addison Phillips

Globalization Architect -- Lab126

Internationalization is not a feature.

It is an architecture.

From: ltru-bounces at ietf.org [mailto:ltru-bounces at ietf.org] On Behalf Of Don Osborn
Sent: Wednesday, April 08, 2009 4:40 PM
To: 'LTRU Working Group'; 'IETF Languages Discussion'
Subject: [Ltru] How to handle macrolanguage when no code?

In looking at the BBC website's offerings in African languages, one notes that they have grouped Kinyarwanda and Kirundi together under http://www.bbc.co.uk/greatlakes/ . This makes sense from a linguistic point of view since, as I understand it, the two languages are almost the same. When looking at the page source, one notes that they use lang="rw" (for Kinyarwanda). It may be that the pages I checked are properly Kinyarwanda and an expert would know that they are not Kirundi (rn), but it is in any event true that there is no code element to cover both languages.

I'm curious if there is any other recommended way to handle such a situation where web content may be deliberately and easily designed to cover more than one language as defined by ISO 639 when there is not currently any macrolanguage code for them. Could one for example define a whole page as having two languages? E.g., something like lang="rw, rn"?
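
To make the question concrete: a lang value is expected to be one language tag, so a list such as "rw, rn" would not be read as a tag at all. A minimal sketch of that check follows. The pattern is a deliberately simplified assumption, far looser than the real BCP 47 grammar, and the function name is hypothetical.

```python
import re

# Simplified shape of a language tag: a 2-8 letter primary subtag followed by
# optional hyphen-separated alphanumeric subtags. This is an assumption for
# illustration only and does not implement the full BCP 47 grammar.
_TAG_SHAPE = re.compile(r"[A-Za-z]{2,8}(-[A-Za-z0-9]{1,8})*")


def is_single_language_tag(value: str) -> bool:
    """Return True if value has the shape of exactly one language tag."""
    return _TAG_SHAPE.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("rw", "rn", "rw, rn"):
        print(repr(candidate), is_single_language_tag(candidate))
```

Here "rw" and "rn" each pass, while "rw, rn" fails because of the comma and space, which is why a two-language declaration would need some mechanism other than the lang attribute.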

Thanks in advance for any feedback.

Don
