Formal request for "French (all periods)"
CE Whitehead
cewcathar at hotmail.com
Tue Feb 13 21:58:47 CET 2007
Hi, John, all,
I tend to support the use of a macrolanguage tag to cover the various
historical and other varieties of a language when several varieties are
mixed in a document or text in such a way that it becomes almost impossible
to separate them for some purposes (such as identifying the primary
language of the user or document, which could then be identified as a
macrolanguage). People have suggested identifying such a document as being
in several languages, but there are cases where this is absolutely
impractical!
However, there are rules about what sort of subtags can be used with a
macrolanguage tag. By macrolanguage subtag, do we mean extended language
subtag? There are peculiar rules for the extended language subtag, but I do
not see any peculiar rules about macrolanguage subtags otherwise. So it's my
guess that the macrolanguage subtag, if it is not the same as an extended
language subtag, could not be combined with other primary language subtags
such as fro, frm, and fr; it would have to be used alone. Either a 2- or
3-letter subtag would be o.k. (Someone correct me if I am wrong.)
The rest of my response is below, in between your text, John.
>
>ISO 639-2 (and a fortiori ISO 639-3) contains identifiers for Old French
>(fro), Middle French (frm), and Modern French (fra), separated by time
>periods. This subdivision works well for most documents, as the given
>dates correspond roughly to the timeline for major changes in the written
>form of the evolving French language.
You mean (fro), (frm), and (fr), which are indeed separated by time periods;
fra is the ISO 639-2 code, but language tags use the two-letter fr instead,
I think.
Quoting from other people's emails on macrolanguages (Mark Davis, Doug
Ewell, and ?):
"In particular, there is a key item that we need a response for, and that is
the temporal scope of ISO language codes. For example, we have french broken
up into fr, frm, and fro, plus various creoles. So tagging old French data
with fr is incorrect, and requesting a variant subtag to apply to fr to
indicate French of say (1100-1200) is out of scope and will be legitimately
rejected."
(I should add that the writers noted that French is about the only
language that is broken up by date.)
>However, sometimes the exact provenance of a document cannot be
>determined, while internal language features may indicate a language
>variety that bridges the Old/Middle or the Middle/Modern divides.
>
The macrolanguage tag is also useful for informing machines that the
languages are mutually comprehensible. I must say, though, that Old French
is readable only with difficulty; I have to go over it slowly. My Old
Provencal is generally better.
The macrolanguage tag will also make it easier to identify Creoles and such
(for example, Haitian Creole).
Here's what the other emails on macrolanguages said:
"It was proposed that ISO 639-3 should create macrolanguages encompassing
early, middle, and modern stages of a single language, such as French."
However, the writer added,
"The working definition of an ISO 639-3 macrolanguage is described at
http://www.sil.org/iso639-3/scope.asp#M and doesn't seem to have anything to
do with historical changes in a language."
Here is http://www.sil.org/iso639-3/scope.asp#M:
* * *
"In this example, it may appear that the single identifiers in ISO 639-1 and
ISO 639-2 should be designated as collective language identifiers. That is
not assumed here. In various parts of the world, there are clusters of
closely-related language varieties that, based on the criteria discussed
above, can be considered distinct individual languages, yet in certain usage
contexts a single language identity for all is needed. Typical situations in
which this need can occur include the following:
" * There is one variety that is more developed and that tends to be
used for wider communication by speakers of various closely-related
languages; as a result, there is a perceived common linguistic identity
across these languages. For instance, there are several distinct spoken
Arabic languages, but Standard Arabic is generally used in business and
media across all of these communities, and is also an important aspect of a
shared ethno-religious unity. As a result, a perceived common linguistic
identity exists.
" * There is a common written form used for multiple closely-related
languages. For instance, multiple Chinese languages share a common written
form.
" * There is a transitional socio-linguistic situation in which
sub-communities of a single language community are diverging, creating a
need for some purposes to recognize distinct languages while, for other
purposes, a single common identity is still valid. For instance, in some
contexts it is necessary to make a distinction between Bosnian, Croatian and
Serbian languages, yet there are other contexts in which these distinctions
are not discernible in language resources that are in use."
* * *
I do not myself see that the last situation, the transitional
socio-linguistic situation, need entirely exclude historical change!
As I have said above, I personally support the use of a macrolanguage tag to
cover the various historical varieties. (I note, though, that Old French is
so different from modern French that it is not really comprehensible to most
speakers of the modern variety. I do kind of think the idea behind a
macrolanguage is that all its varieties are comprehensible to speakers of
any of them, but I know this is not always the case anyway.)
But it seems there are rules about what sort of subtags can be used with a
macrolanguage tag. By macrolanguage subtag, do we mean extended language
subtag, about which there are peculiar rules? (I do not see any peculiar
rules about macrolanguage subtags otherwise.) Quoting from RFC 4646:
"2.2.2. Extended Language Subtags
The following rules apply to the extended language subtags:
"1. Three-letter subtags immediately following the primary subtag are
reserved for future standardization, anticipating work that is
currently under way on ISO 639.
"2. Extended language subtags MUST follow the primary subtag and
precede any other subtags.
"3. There MAY be up to three extended language subtags.
"4. Extended language subtags MUST NOT be registered or used to form
language tags. Their syntax is described here so that
implementations can be compatible with any future revision of
this document that does provide for their registration.
"Extended language subtag records, once they appear in the registry,
MUST include exactly one 'Prefix' field indicating an appropriate
language subtag or sequence of subtags that MUST always appear as a
prefix to the extended language subtag.
"Example: In a future revision or update of this document, the tag
"zh-gan" (registered under RFC 3066) might become a valid non-
grandfathered (that is, redundant) tag in which the subtag 'gan'
might represent the Chinese dialect 'Gan'."
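To make the positional rules above concrete, here is a minimal Python sketch that splits a tag into its primary subtag, any subtags sitting in extended-language position, and the remainder. It is an assumption-laden illustration and not a validator: the function name split_language is hypothetical, nothing is looked up in the registry, and private-use tags and whole-tag grandfathered registrations (such as zh-gan) are not modeled.

```python
import re

# Syntax-only sketch of the RFC 4646 "language" production:
#   language = (2*3ALPHA [ extlang ]) / 4ALPHA / 5*8ALPHA
#   extlang  = *3("-" 3ALPHA)
_PRIMARY = re.compile(r"[A-Za-z]{2,8}")
_EXTLANG = re.compile(r"[A-Za-z]{3}")


def split_language(tag):
    """Return (primary, extlangs, rest) for a language tag.

    Hypothetical helper: only the positional rules are applied.
    """
    subtags = tag.split("-")
    primary = subtags[0]
    if not _PRIMARY.fullmatch(primary):
        raise ValueError("not a well-formed primary subtag: %r" % primary)
    extlangs = []
    rest = subtags[1:]
    # Rules 1-3: only a 2- or 3-letter primary subtag may be followed by
    # extended language subtags, and by at most three of them.
    if len(primary) <= 3:
        while rest and len(extlangs) < 3 and _EXTLANG.fullmatch(rest[0]):
            extlangs.append(rest.pop(0))
    return primary.lower(), [s.lower() for s in extlangs], rest


if __name__ == "__main__":
    for example in ("fr", "frz-fro", "fr-Latn-FR"):
        print(example, split_language(example))
```

Under this reading, a tag such as frz-fro would put fro in extended-language position, which rule 4 says must not be used to form tags, whereas frz, fro, frm, or fr used alone would each be an ordinary primary subtag.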
>Classification of such doubtful or ambiguous documents is enhanced
>by providing a macrolanguage code which encompasses French of all time
>periods. I therefore propose a macrolanguage identifier that encompasses
>fro, frm, and fra.
>
>The filled-out forms are attached. I suggest "frz" as the identifier.
>
>This is by way of a test case; if the RA approves it, I will submit
>similar forms for English, Dutch, (High) German, Greek, Gaelic,
>Provencal/Occitan
??? Where does Provencal/Occitan go? I understand that it is very close to
Catalan and Valencian, so it might go with that group!
>, and Turkish.
>
John, are you then going to submit macrolanguage tag proposals primarily to
tag languages that have different varieties over time?
I support something to accommodate the varieties of a language over time
(for cases when a mixture of the various varieties is being used), but I
worry about the problems we are going to get into with the RFC 4646
syntax. My bet is no one wants to rewrite or redo either RFC 4646 or the
various subtags we are only now getting!
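As a rough illustration of the use case, the following Python sketch picks a tag from an estimated date range and falls back to the macrolanguage identifier when the range straddles a period boundary. Everything here is an assumption for discussion: choose_tag is a hypothetical name, the boundary years are approximations read off the ISO 639-2 code names, and "frz" is only the identifier suggested in John's request, not a registered code.

```python
# Assumed, approximate period boundaries, read off the ISO 639-2 names
# "French, Old (842-ca.1400)" and "French, Middle (ca.1400-1600)".
PERIODS = [
    ("fro", 842, 1400),
    ("frm", 1400, 1600),
    ("fr", 1600, None),  # Modern French, open-ended
]
MACRO = "frz"  # suggested in the request; not a registered code


def choose_tag(earliest, latest):
    """Pick a tag for a document dated somewhere in [earliest, latest]."""
    if earliest > latest:
        raise ValueError("earliest must not be after latest")
    matches = [
        code
        for code, start, end in PERIODS
        if latest >= start and (end is None or earliest < end)
    ]
    if not matches:
        raise ValueError("date range predates all listed periods")
    # One period: use its own code. Several periods: the document cannot be
    # placed in a single period, so use the macrolanguage identifier.
    return matches[0] if len(matches) == 1 else MACRO


if __name__ == "__main__":
    print(choose_tag(1150, 1200))
    print(choose_tag(1380, 1420))
    print(choose_tag(1700, 1750))
```

The point of the sketch is only that the macrolanguage code would be a fallback for uncertain or mixed material, while securely dated documents keep the more specific fro, frm, or fr.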
--C. E. Whitehead
cewcathar at hotmail.com
More information about the Ietf-languages
mailing list