Stop requiring endonyms (Was: RFC 4645bis: making 'pes' and 'prs' extlangs /// Better use autonyms)

Mon Dec 15 04:19:02 CET 2008

From: ietf-languages-bounces at alvestrand.no [mailto:ietf-languages-bounces at alvestrand.no] On Behalf Of Lang Gérard

> It is also plainly understandable that the needs for RFC 4646
> could eventually have some retro-influence about the
> maintenance of ISO 639, but certainly not to the point of
> contradicting the spirit, the history or even the precise
> letter of the text of ISO 639, as it certainly is now the
> case in my opinion. For example, the fact that ISO 639/RA-JAC
> offered a statement about "freezing" ISO 639-1 in the
> interest of RFC 4646 (that is directly in contradiction with
> the letter of the text of this standard) is an example of
> what should certainly not be accepted.

I was not a participant in the JAC at the time that the question of "freezing" ISO 639-1 was discussed, so can't comment on that. I will say, though, that, on the one hand, I don't get the impression that the JAC is guaranteeing never to add new entries to 639-1, and on the other hand, there's nothing in the text of 639-1 that prevents the JAC from taking a very conservative position with regard to additions to 639-1.

> As is the fact that ISO 639-3 is not a bilingually english/french
> face-à-face presented standard (as are ISO 639, ISO 639-1, ISO
> 639-2 and ISO 639-5),

That has nothing whatsoever to do with BCP 47. Nothing at all! It was a consensus decision of ISO TC37/SC2 to have English and French text published separately, accommodating the specific request of certain member national bodies.

> that is a grave nuisance for comprehension, when the english text
> is very ambiguously written about the fundamental question of what
> do ISO 639-3 code elements represent (language names, as says the
> general title of ISO 639, or directly languages, that should not be
> the case),

As for whether the English text is ambiguous on that point, it decidedly is not: "The ultimate objects of identification are languages themselves; language names are the formal means by which the languages denoted by language identifiers are designated." (Clause 4.2)

As for whether or not code elements should represent languages directly, it is evidently your opinion that they should not, but the text of 639-3 was approved by SC2 consensus.

> 2-And, in fact, let me insist that "reference name" is not a
> general concept used by ISO 639...
> ISO 639 uses "original name" as major basis for its coding-
> representation scheme...
> Only ISO 639-3 uses "reference name"... as (not systematically
> if you carefully read clause 4.1, but certainly the letter of
> this is not to be taken "cum grano salis" !) major basis for
> its coding-representation scheme.

ISO 639-3 does not claim in any way that the code elements are derived from any name. Names are provided to indicate the semantics, not as a basis for deriving code elements.

> I am obliged to remark that... it is more than astonishing
> that the reference names provided by Ethnologue are almost
> always only english version (translation ?) of the name of
> the considered language and not derived from the genuine
> autonyms.

I believe Ethnologue lists names used in language descriptions. Many of those may be considered English-language names, but I believe more are Romanizations of indigenous names (sometimes autonyms, but sometimes names used in other local cultures). There was no better practical alternative for the initial publication of 639-3 than what was done. If anyone has suggestions for improvements to the names, they can certainly submit suggestions to the RA.

> 3-Considering specifically the ISO 639-3 code element "fas",
> whose "supposed" reference name (because the ISO 639-3/RA,
> whose table has a column whose title is only "language name",
> never answered when I asked to know if this "Language name"
> was effectively systematically identical with the
> corresponding "reference name")

Actually, if you look at the documentation for the downloadable data files, the format is documented as follows (some columns removed):

CREATE TABLE [ISO_639-3] (
   Id      char(3) NOT NULL,  -- The three-letter 639-3 identifier
   ...
   Ref_Name   varchar(150) NOT NULL,   -- Reference language name
   Comment    varchar(150) NULL)       -- Comment relating to one or more of the columns

The file contains the following entry:

fas     per     fas     fa      M       L       Persian

Here the ID column is "fas" and the Ref_Name column is "Persian".
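
To make that reading concrete, here is a minimal Python sketch that pulls the two columns out of the entry above. The function name parse_row is hypothetical. It assumes the fields are separated by a tab (or, as pasted here, by runs of spaces) and that Ref_Name is the seventh field, as in this entry; the columns in between are the ones removed from the table definition above and are not interpreted.

```python
import re

# The entry quoted above, copied as it appears in this message.
SAMPLE_ROW = "fas     per     fas     fa      M       L       Persian"

def parse_row(line):
    """Return the Id and Ref_Name of one row (hypothetical helper).

    Assumption: fields are separated by a single tab, or by runs of two
    or more spaces as in the pasted entry, and Ref_Name is the 7th field.
    """
    fields = re.split(r"\t| {2,}", line.strip())
    if len(fields) < 7:
        raise ValueError("expected at least 7 fields, got %d" % len(fields))
    return {"Id": fields[0], "Ref_Name": fields[6]}

row = parse_row(SAMPLE_ROW)
print(row["Id"], row["Ref_Name"])
```

Splitting on a single tab rather than on any whitespace keeps names that contain a space in one field.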

> is "Persian", there is no
> visible link between the chain "Persian" and the chain "fas"
> that "codes the representation of the ["reference"] name".
>
> So, the choice of such a "reference name" is rather curious,
> when "fas" inside ISO 639-3 is identifying the same language
> name as "fa" inside 639-1, whose "indigenous name" is "fârsy",

And which lists English language names "Farsi; Persian".

> that gives an evident visible link between this indigenous
> name and the code element that "codes the representation of
> the language ["indigenous"] name."

I'm not sure if I understand what the overall point is you're trying to make. It seems like you feel it is important that there be a "visible link" between the indigenous name and the corresponding code element -- in other words, that there be mnemonic similarity between a code element and an indigenous name. But ISO 639-3 never claims to provide that; in fact, it explicitly states that it does not do that. Perhaps you think that it *should* do that because that was how ISO 639 was historically conceived, but it was considered not feasible to do so on the scale of languages covered by 639-3, and that is what the SC2 consensus approved.

Peter