Formal request for "French (all periods)"

CE Whitehead cewcathar at
Tue Feb 13 21:58:47 CET 2007

Hi, John, all. I tend to support the use of a macrolanguage tag to include 
the various historical and other varieties of languages--when several 
varieties are mixed in a document/text in such a way that it becomes almost 
impossible to separate them for some purposes (such as in identifying the 
primary language of the user or document which can then in fact be 
identified as a macrolanguage; though people have suggested that it be 
identified as several languages there are cases where this is absolutely 
however, there are rules about what sort of subtags can be used with a 
macrolanguage tag; by macrolanguage subtag do we mean extended language 
subtag?  There are peculiar rules for the extended language subtag (but I do 
not see any peculiar rules about macrolanguage subtags otherwise; so it's my 
guess that the macrolanguage subtag, if not the same as an extended language 
subtag, could not be used with other primary language subtags such as fro, 
frm, and fr, the macrolanguage subtag would have to be used alone; but 
either a 2- or 3- letter subtag would be o.k.; someone correct me if I am 

The rest of my response is below, in between your text John.
>ISO 639-2 (and a fortiori ISO 639-3) contains identifiers for Old French
>(fro), Middle French (frm), and Modern French (fra), separated by time
>periods.  This subdivision works well for most documents, as the given
>dates correspond roughly to the timeline for major changes in the written
>form of the evolving French language.

You mean (fro), (frm), and (fr), which are indeed separated by time periods; 
fra is out of date I think.

Quoting from other people's emails on macrolanguages (Mark Davis, Doug 
Ewell, and ?):

"In particular, there is a key item that we need a response for, and that is 
the temporal scope of ISO language codes. For example, we have French broken 
up into fr, frm, and fro, plus various creoles. So  tagging old French data 
with fr is incorrect, and requesting a variant subtag to apply to fr to 
indicate French of say (1100-1200) is out of scope and will be legitimately 

(I must note that the writers have noted that French is about the only 
language that is broken up by date)

>However, sometimes the exact provenance of a document cannot be
>determined, while internal language features may indicate a language
>variety that bridges the Old/Middle or the Middle/Modern divides.
The macrolanguage tag is also useful for informing machines that the 
languages are mutually comprehensible; I must say Old French is readable 
with difficulty; I have to go over it slowly; my Old Provencal is generally 

And the macrolanguage tag will make it easier to identify Creoles and such 
too (such as Haitian Creole).

Here's what the other emails on macrolanguages said:
"It was proposed that ISO 639-3 should create macrolanguages encompassing 
early, middle, and modern stages of a single language, such as French."
However, the writer added,
"The working definition of an ISO 639-3 macrolanguage is described at and doesn't seem to have anything to 
do with historical changes in a language."

Here is
* * *
"In this example, it may appear that the single identifiers in ISO 639-1 and 
ISO 639-2 should be  designated as collective language identifiers. That is 
not assumed here. In various parts of the world, there are clusters of 
closely-related language varieties that, based on the criteria discussed 
above, can be considered distinct individual languages, yet in certain usage 
contexts a single language identity for all is needed. Typical situations in 
which this need can occur include the following:

"    *  There is one variety that is more developed and that tends to be 
used for wider communication by speakers of various closely-related 
languages; as a result, there is a perceived common linguistic identity 
across these languages. For instance, there are several distinct spoken 
Arabic languages, but Standard Arabic is generally used in business and 
media across all of these communities, and is also an important aspect of a 
shared ethno-religious unity. As a result, a perceived common linguistic 
identity exists.
"    * There is a common written form used for multiple closely-related 
languages. For instance, multiple Chinese languages share a common written 
"    * There is a transitional socio-linguistic situation in which 
sub-communities of a single language community are diverging, creating a 
need for some purposes to recognize distinct languages while, for other 
purposes, a single common identity is still valid. For instance, in some 
contexts it is necessary to make a distinction between Bosnian, Croatian and 
Serbian languages, yet there are other contexts in which these distinctions 
are not discernible in language resources that are in use."
* * *
I do not myself see that the last situation, the transitional 
socio-linguistic situation, might entirely exclude historical 

I personally, as I have said above, support the use of a macrolanguage tag to 
include the various historical varieties (but I note that Old French is so 
different from modern French that it is not quite really comprehensible to 
most speakers of the modern variety and I do kind of think the idea behind a 
macrolanguage is that the varieties of it are all comprehensible to speakers 
of any, but I know this is not always the case anyway);
but it seems there are rules about what sort of subtags can be used with a 
macrolanguage tag; by macrolanguage subtag do we mean extended language 
subtag?--about which there are peculiar rules (but I do not see any peculiar 
rules about macrolanguage subtags otherwise); quoting from RFC 4646 :

"2.2.2.  Extended Language Subtags

   The following rules apply to the extended language subtags:

   "1.  Three-letter subtags immediately following the primary subtag are
       reserved for future standardization, anticipating work that is
       currently under way on ISO 639.

   "2.  Extended language subtags MUST follow the primary subtag and
       precede any other subtags.

   "3.  There MAY be up to three extended language subtags.

   "4.  Extended language subtags MUST NOT be registered or used to form
       language tags.  Their syntax is described here so that
       implementations can be compatible with any future revision of
       this document that does provide for their registration.

   "Extended language subtag records, once they appear in the registry,
   MUST include exactly one 'Prefix' field indicating an appropriate
   language subtag or sequence of subtags that MUST always appear as a
   prefix to the extended language subtag.

   "Example: In a future revision or update of this document, the tag
   "zh-gan" (registered under RFC 3066) might become a valid non-
   grandfathered (that is, redundant) tag in which the subtag 'gan'
   might represent the Chinese dialect 'Gan'."
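
To make the syntax rules quoted above concrete, here is a rough Python sketch of how a parser might split off the primary language subtag and any extended language subtags. It is only an illustration of rules 1 to 3 as I read them: the function name split_language is made up, it handles only 2- and 3-letter primary subtags, and it does not validate whatever follows (script, region, variant). Per rule 4, RFC 4646 does not allow extended language subtags in actual tags, so this shows syntax only.

```python
import re

# Language portion only, following the RFC 4646 ABNF:
#   language = (2*3ALPHA [ extlang ]) / 4ALPHA / 5*8ALPHA
#   extlang  = *3("-" 3ALPHA)
# Assumption: only the 2*3ALPHA form of the primary subtag is handled here.
_LANGUAGE = re.compile(r"^([A-Za-z]{2,3})((?:-[A-Za-z]{3}){0,3})(?:-|$)")


def split_language(tag):
    """Return (primary, [extlang, ...]) for the leading part of a tag.

    Hypothetical helper: it does not check the rest of the tag and does
    not consult the subtag registry.
    """
    match = _LANGUAGE.match(tag)
    if match is None:
        raise ValueError("no 2- or 3-letter primary language subtag: %r" % tag)
    primary = match.group(1).lower()
    # Extended language subtags must immediately follow the primary subtag
    # (rule 2), and there may be at most three of them (rule 3).
    extlangs = [s.lower() for s in match.group(2).split("-") if s]
    return primary, extlangs


if __name__ == "__main__":
    for tag in ("fr", "fro-FR", "zh-gan", "zh-gan-CN"):
        print(tag, split_language(tag))
```

Under this reading, "zh-gan" splits into the primary subtag zh plus one extended language subtag gan, while "fro-FR" has no extended language subtag because a 2-letter subtag after the primary is a region.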

>Classification of such doubtful or ambiguous documents is enhanced
>by providing a macrolanguage code which encompasses French of all time
>periods.  I therefore propose a macrolanguage identifier that encompasses
>fro, frm, and fra.
>The filled-out forms are attached.  I suggest "frz" as the identifier.
>This is by way of a test case; if the RA approves it, I will submit
>similar forms for English, Dutch, (High) German, Greek, Gaelic,
??? where does Provencal/Occitan go?  I understand that it is very close to 
Catalan and Valencian; thus it might go with that group!!!

>and Turkish.
John, are you then going to submit macrolanguage tag proposals primarily to 
tag languages which have different varieties according to time???
I support something to accommodate the varieties of a language over time 
(for cases when there is a mixture of the various varieties that is being 
used) but worry about the problems we are going to get into with the RFC 
4646 syntax--my bet is no one wants to rewrite/redo either the RFC 4646  or 
the various subtags we are now just getting!

--C. E. Whitehead
cewcathar at
