Results of Duplicate Busters Survey #2

Doug Ewell doug at
Sun Sep 7 08:37:16 CEST 2008

Frank Ellermann <nobody at xyzzy dot claranet dot de> wrote:

> IMO decreeing that RFC 4646 and older tags are obsoleted
> by whatever ISO 639-3 says is no option.  There will be
> applications limiting themselves to RFC 4646 languages,
> all ISO 639-1/2 warts included.  Interpreting such tags
> as defined in ISO 639-3 can result in unclear gibberish.

Let me see if I understand you correctly.

Under ISO 639-2, there is a code element 'ain' for a language called 
"Ainu."  (Of course, as Gérard Lang reminds us, ISO 639 assigns code 
elements to language *names*, but in language tagging we use them to 
represent languages.)

The Language Subtag Registry currently lists "Ainu" as the Description 
field for language subtag 'ain'.

Under ISO 639-3, there are two unrelated languages called "Ainu," one 
spoken in Japan and the other spoken in China.  The code element 'ain' 
is meant to refer to the same language as in ISO 639-2, as evidenced by 
the tables on the ISO 639-3 Web site, but it is associated with the name 
"Ainu (Japan)" instead of simply "Ainu" because of the presence of the 
Chinese Ainu.  The Chinese Ainu is not in ISO 639-2, probably because of 
the 50-document rule; it is represented by 'aib' in ISO 639-3.

In the LTRU project, I have proposed to amend the subtag 'ain' by adding 
the 639-3 name "Ainu (Japan)" and -- the controversial part -- by 
deleting the 639-2 name "Ainu."  My rationale is that continuing to list 
the name "Ainu" for only one of the two Ainus will create more 
potential confusion than if each Ainu is qualified by country, as ISO 
639-3 has done.
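
To make the proposed change concrete, here is a rough sketch in Python. 
It is only an illustration: the record is cut down to the three fields 
discussed here, and the helper names amend_descriptions and 
format_record are made up for this example, not part of any registry 
tooling.  It contrasts my proposal (replace "Ainu") with the purely 
additive alternative (keep both names).

```python
from copy import deepcopy

# Simplified stand-in for the registry record of 'ain'.
# Assumption: only Type, Subtag and Description are modelled here.
current_record = {
    "Type": "language",
    "Subtag": "ain",
    "Description": ["Ainu"],
}


def amend_descriptions(record, add, remove=()):
    """Return a copy of the record with Description values added/removed."""
    amended = deepcopy(record)
    descriptions = [d for d in amended["Description"] if d not in remove]
    for name in add:
        if name not in descriptions:
            descriptions.append(name)
    amended["Description"] = descriptions
    return amended


def format_record(record):
    """Render the record as 'Field: value' lines, one Description per line."""
    lines = [f"Type: {record['Type']}", f"Subtag: {record['Subtag']}"]
    lines += [f"Description: {d}" for d in record["Description"]]
    return "\n".join(lines)


# The proposal: add the 639-3 name and delete the 639-2 name.
proposed = amend_descriptions(
    current_record, add=["Ainu (Japan)"], remove=["Ainu"]
)

# The additive alternative: keep "Ainu" and add "Ainu (Japan)".
additive_only = amend_descriptions(current_record, add=["Ainu (Japan)"])

print(format_record(proposed))
print()
print(format_record(additive_only))
```

In both variants the subtag itself is untouched; only the list of 
Description values differs.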

There are 8 other pairs of languages like this.

If I understand you correctly, you are saying that the use of only the 
Description field "Ainu (Japan)" and not "Ainu" will "obsolete" the RFC 
4646 meaning of the tag, that adding the country qualifier "(Japan)" to 
the Description constitutes a reinterpretation of the tag, and that this 
reinterpretation "can result in unclear gibberish."  Is that correct?

Doug Ewell  *  Thornton, Colorado, USA  *  RFC 4645  *  UTN #14
