Duplicate Busters: Survey #1

Doug Ewell doug at ewellic.org
Sat Aug 2 03:23:28 CEST 2008

Frank Ellermann <nobody at xyzzy dot claranet dot de> wrote:

> In the edit history I found that somebody had the quite plausible but 
> wrong idea to rename Silesian to "Polish Silesian", and Lower Silesian 
> to "German Silesian".  It was immediately corrected by somebody 
> knowing what it is about.  But it is a good example how "just add some 
> nice qualifier" can miss the point and hit a rat-hole.
>
> BTW, one advantage of the Wikipedia list is that it has the native 
> names (as far as they are known, and editors agree on what it is, szl 
> is simple, stq is harder), see 
> <http://en.wikipedia.org/wiki/ISO_639:s>

The second paragraph above demonstrates how tempting it is to "improve" 
the names, but the first paragraph shows why it is such a bad idea.

I don't propose substantially changing any Description fields in the LSR 
that are derived from ISO.  Doing so would fall way too far onto the 
side of second-guessing ISO.  What I do propose is to add minor 
annotations like "(Papua New Guinea)" to distinguish two languages that 
have the same ISO name, and to get rid of needless duplicate names that 
add no information.
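
To make the idea concrete, here is a rough Python sketch of how identical 
Description values might be found and then given a minor qualifier.  It is 
only an illustration under stated assumptions: the record layout, the 
function names find_duplicates and annotate, and the sample data are all 
hypothetical (the subtags are private-use codes and the names are 
placeholders), and nothing here is taken from the actual registry or from 
any existing tool.

```python
from collections import defaultdict

# Placeholder records: "qaa"-"qac" are private-use subtags and the names are
# invented for illustration; none of this is real registry data.
RECORDS = [
    {"Subtag": "qaa", "Description": ["Example"]},
    {"Subtag": "qab", "Description": ["Example"]},
    {"Subtag": "qac", "Description": ["Other"]},
]


def find_duplicates(records):
    """Return each Description used by more than one subtag, with those subtags."""
    subtags_by_name = defaultdict(list)
    for record in records:
        for name in record["Description"]:
            subtags_by_name[name].append(record["Subtag"])
    return {n: tags for n, tags in subtags_by_name.items() if len(tags) > 1}


def annotate(records, qualifiers):
    """Append " (qualifier)" to duplicated names; leave everything else alone."""
    duplicated = find_duplicates(records)
    result = []
    for record in records:
        names = [
            f"{name} ({qualifiers[record['Subtag']]})"
            if name in duplicated and record["Subtag"] in qualifiers
            else name
            for name in record["Description"]
        ]
        result.append({"Subtag": record["Subtag"], "Description": names})
    return result
```

The qualifier is appended to the existing name rather than replacing it, 
so the ISO-derived text stays intact and names that are already unique 
are not touched.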

>> ISO 639-3 makes a distinction between the reference and non-reference 
>> names.  RFC 4646bis does not, although it will non-normatively place 
>> the ISO 639-3 reference name (if any) first within the record.
> The 4646bis proponents could decree that it is normative.

We could, and it has been proposed.  I don't see the value, though.

> Or why not invent a convention to flag "secondary" dupes, e.g., add 
> (*) to "secondary" dupes.  Including the few "dupe by macrolanguage" 
> and "dupe by deprecation" cases.

I don't see how this is clearer for end users, or anybody, than what I 
am proposing.

Doug Ewell  *  Thornton, Colorado, USA  *  RFC 4645  *  UTN #14
http://www.alvestrand.no/mailman/listinfo/ietf-languages
