Duplicate Busters: Survey #2

Frank Ellermann nobody at xyzzy.claranet.de
Fri Aug 1 17:27:45 CEST 2008


Doug Ewell wrote:
 
 [set 1]
> The goal is to pick one and discard the other.

For ASCII vs. non-ASCII "spelling" differences I'd
doubt that this is a good goal.

 [set 2]
> the description without comment is the ISO 639-1
> and/or -2 name.

IOW, that is the "relevant" name for all Internet protocols,
Web standards, etc. that use RFC 1766, 3066, or 4646
tags.  That covers quite a significant number of existing tags.

> Type: language
> Subtag: ms
> Description: Malay (macrolanguage)
> Description: Malay

If the 4646bis proponents invent some kind of scope
field indicating "macrolanguage" the longer name is
not strictly necessary.

If they'd invent a flag (*) they could even indicate
that this is not the main entry for Malay IFF there
will be a new "individual" Malay dupe.

I prefer the shorter description, assuming that the
"macrolanguage" info is preserved elsewhere in the
hypothetical registry.
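
To make the scope idea concrete, here is a minimal Python sketch.
It assumes a hypothetical "Scope" field and hypothetical helper
names (parse_record, move_scope_out_of_description); nothing in
this thread confirms that 4646bis will define such a field, so
treat it as an illustration only.

```python
def parse_record(text):
    """Parse one registry record into an ordered list of (field, value) pairs."""
    fields = []
    for line in text.strip().splitlines():
        name, _, value = line.partition(":")
        fields.append((name.strip(), value.strip()))
    return fields


def move_scope_out_of_description(fields):
    """Drop the ' (macrolanguage)' suffix from descriptions and record the
    information in a separate Scope field (an assumed field name)."""
    suffix = " (macrolanguage)"
    descriptions = [v for k, v in fields if k == "Description"]
    has_macro = any(d.endswith(suffix) for d in descriptions)

    result = []
    for key, value in fields:
        if key == "Description" and value.endswith(suffix):
            short = value[: -len(suffix)]
            if short in descriptions:
                # The short name is already listed, so the long one is a dupe.
                continue
            value = short
        result.append((key, value))

    if has_macro:
        # The "macrolanguage" info is preserved here instead of in the name.
        result.append(("Scope", "macrolanguage"))
    return result


record = """
Type: language
Subtag: ms
Description: Malay (macrolanguage)
Description: Malay
"""

cleaned = move_scope_out_of_description(parse_record(record))
for key, value in cleaned:
    print(f"{key}: {value}")
```

With the "ms" record quoted above, only the short description is
kept and the macrolanguage information moves to the assumed field.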
 
> Type: language
> Subtag: ain
> Description: Ainu (Japan)
> Description: Ainu

Here the longer description is better.  It might be
good to preserve the currently relevant name somehow,
how about a Comment?  I'll skip similar cases.

> Type: language
> Subtag: rup
> Description: Macedo Romanian
> Description: Macedo-Romanian

That is stupid.  Pick the currently registered name,
or convince ISO 639 to toss a coin.  Note what you
have done manually in 4645bis.

> Type: script
> Subtag: Ethi
> Description: Geʻez
> Description: Ge'ez

Keep both as you found them in the sources.
 
> Type: script
> Subtag: Hang
> Description: Hangul
> Description: Hangŭl
> Description: Hangeul
 
> Technically I should not be including Hangeul, which is
> a different transcription of the same Korean word, not
> a genuinely different name. Make your own judgment.

Keep all as you found them in the sources.

 Frank


