"extension language"

Peter Constable petercon at microsoft.com
Fri Dec 11 23:33:12 CET 2009


A caveat regarding certain terminology: In BCP 47, "extended language" and "extension" are distinct concepts, and the only thing they have in common is that both are possible elements within a tag. The expression "extension language" is not used in BCP 47 and could create confusion. In our current threads, we are discussing extended language subtags ("extlangs"), not extensions.
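
To make the distinction concrete, here is a minimal sketch that separates the two kinds of element in a well-formed tag. It assumes the RFC 5646 layout: extlang subtags are up to three 3-letter subtags immediately following a 2- or 3-letter primary language subtag, while an extension is introduced by a single-character singleton other than "x". The function name and the sample tags are illustrative assumptions, not taken from this thread, and the sketch does no validation against the registry.

```python
def split_extlang_and_extensions(tag):
    """Return (extlang subtags, extensions) for a well-formed BCP 47 tag.

    Hypothetical helper for illustration only; it does not validate the tag.
    """
    subtags = tag.lower().split("-")
    primary, rest = subtags[0], subtags[1:]

    # Extlangs: up to three 3-letter alphabetic subtags directly after
    # a 2- or 3-letter primary language subtag.
    extlangs = []
    if 2 <= len(primary) <= 3:
        while rest and len(extlangs) < 3 and len(rest[0]) == 3 and rest[0].isalpha():
            extlangs.append(rest.pop(0))

    # Extensions: a singleton (one character, not "x") followed by
    # one or more longer subtags.
    extensions = {}
    i = 0
    while i < len(rest):
        subtag = rest[i]
        if len(subtag) == 1:
            if subtag == "x":
                break  # private use: everything after "x" is private
            j = i + 1
            while j < len(rest) and len(rest[j]) > 1:
                j += 1
            extensions[subtag] = rest[i + 1:j]
            i = j
        else:
            i += 1  # script, region, or variant subtag
    return extlangs, extensions


print(split_extlang_and_extensions("zh-yue-Hant-HK"))      # has an extlang, no extension
print(split_extlang_and_extensions("de-DE-u-co-phonebk"))  # has an extension, no extlang
```

In the first tag, "yue" is an extended language subtag; in the second, "u-co-phonebk" is an extension. Neither is an "extension language".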


Thanks

Peter


From: CE Whitehead [mailto:cewcathar at hotmail.com]
Sent: Friday, December 11, 2009 1:39 PM
To: Peter Constable; ietf-languages at iana.org
Subject: RE: Criteria for languages



Hi!

Thanks to Peter, Doug, John, and Joan Spanne for your replies.

The extension language issue seems a moot point anyway, according to Doug's reading of all this.





However, on a quick check, I note that [vro] was not made an extension language anyway, it seems (though I am basing this only on Richard Ishida's list of macrolanguages,

http://people.w3.org/rishida/utils/subtags/index.php?find=Estonian&submit=Find, because I am unable to download the registry on my computer).

Richard Ishida's search utility lists 220 extension languages

(http://people.w3.org/rishida/utils/subtags/index.php?list=5&submit=List);

I did not get an accurate count but well over 100 of these are sign languages  (what these have in common is they all use signs--I guess that's something of a common encoding system though the signs vary?);

the next largest group seems to be a family in the Austronesian language group with the prefix ms,

followed by the various Arabic languages (prefix [ar]), the various Chinese languages (prefix [zh]) (both the Chinese and the Arabic are cases where the written forms of these languages are generally nearly identical),

2 Indo-European languages from India with the prefix [kok], North and South Uzbek,

and Swahili.
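
For anyone who wants to reproduce this kind of tally from the registry itself rather than from the search utility, a minimal sketch follows. It assumes the registry's record-jar layout (records separated by "%%" lines, one "Field: value" pair per line) and runs on a small hand-written sample, not on a verbatim copy of the registry; the helper names are hypothetical.

```python
from collections import Counter

# Hand-written sample in the registry's record-jar style (illustrative only).
SAMPLE = """\
%%
Type: language
Subtag: vro
Description: Võro
Macrolanguage: et
%%
Type: extlang
Subtag: yue
Description: Yue Chinese
Prefix: zh
%%
Type: extlang
Subtag: cmn
Description: Mandarin Chinese
Prefix: zh
%%
Type: extlang
Subtag: arz
Description: Egyptian Arabic
Prefix: ar
%%
Type: extlang
Subtag: ase
Description: American Sign Language
Prefix: sgn
"""


def parse_records(text):
    """Split record-jar text into a list of {field: [values]} dicts."""
    records = []
    for chunk in text.split("%%"):
        record = {}
        for line in chunk.splitlines():
            name, sep, value = line.partition(":")
            if not sep or line[:1].isspace():
                continue  # skip blank lines and folded continuation lines
            record.setdefault(name.strip(), []).append(value.strip())
        if record:
            records.append(record)
    return records


def extlangs_by_prefix(records):
    """Count extlang records under each Prefix value."""
    counts = Counter()
    for record in records:
        if record.get("Type") == ["extlang"]:
            for prefix in record.get("Prefix", []):
                counts[prefix] += 1
    return counts


for prefix, count in extlangs_by_prefix(parse_records(SAMPLE)).most_common():
    print(prefix, count)
```

In this sample, [vro] is a "language" record that carries a Macrolanguage field but is not an extlang, so it is not counted under any prefix.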

Best,



C. E. Whitehead

cewcathar at hotmail.com


From: petercon at microsoft.com
Date: Thu, 10 Dec 2009 01:33:42 +0000
> You're trying to think of why it would be useful to use extended language subtags* in some general sense. I was simply trying to account for how they arose and the pre-existing practice that led to them - which didn't necessarily arise because it made particular sense.

> There were long discussions as 4646bis was worked on debating ways in which extlang subtags might be / are / are not useful, and there was no clear consensus on that topic. We'll see how eager people are to take that up again now.


> Peter

From: CE Whitehead [mailto:cewcathar at hotmail.com]
Sent: Wednesday, December 09, 2009 11:34 AM
To: Peter Constable; ietf-languages at iana.org
Subject: RE: Criteria for languages


Hi, Peter,

thanks for your history of the Chinese case.

However, what confuses me is why are some languages made extension languages?

It makes sense to me to make a language an extension language of a macrolanguage if all the extension languages under that macrolanguage are generally written more or less identically (as is the case for the Chinese languages written in Chinese script; similarly in Arabic, standard Arabic is the only written form).  Then it would make sense to allow users to tag these languages using both the macrolanguage code and the extension-language code;
whereas if the languages appear different in writing, I don't see any reason to tag them with anything but their own unique code, ever.

I don't know if this idea is off-base or not.  If my idea is not off-base, then I would not think that either the Latvian/Latgalian pair or the Lithuanian/Samogitian pair would need to be registered as extension languages once the new codes are in place (assuming these codes will be approved).

Best,

C. E. Whitehead
cewcathar at hotmail.com

From: petercon at microsoft.com
To: cewcathar at hotmail.com; ietf-languages at iana.org
Subject: RE: Criteria for languages
Date: Wed, 9 Dec 2009 07:58:36 +0000
> The basic intent of macrolanguages in ISO 639-3 (and extlang subtags in BCP 47 - the original idea for extended language subtags was related to the origination of macrolanguages) can be understood from the Chinese cases - a prototypical example: zh / zho had been in use for some time, and there was clear precedent for using that for not only Mandarin content but also for Cantonese, Wu and other varieties. (In fact, there were registrations under RFC 1766 that explicitly connected several of these to zh.) At the same time, "Chinese" had been perceived as an individual language, not a collection; yet clearly Cantonese etc. are reasonably deemed distinct languages.

> Now, applying to some potential case (framing with certain assumptions): Suppose XXX is used for some developed language Lx and Ly is some less-developed language closely related to Lx; if Ly is being newly coded as YYY, then:

> - If there is evidence indicating that XXX has been used reasonably widely (by some measure) for Ly, then it may make sense to rescope XXX as a macrolanguage entry that encompasses YYY (and an additional new entry XXX' to denote the developed language Lx proper - exclusive of Ly)
> - In the absence of such evidence, it probably makes most sense to deem XXX as denoting only Lx, in which case the scope of XXX is unchanged

> Now, if the first case applies, and XXX becomes deemed a macrolanguage, then (and only then) do we here have a potential option to register YYY (and XXX') as an extlang. (Though John has, I think, been arguing that this is a hypothetical option only, and not a real option under an assumed interpretation of RFC 5646.)


> Peter
