Results of Duplicate Busters Survey #2

Frank Ellermann nobody at xyzzy.claranet.de
Sun Sep 7 01:29:23 CEST 2008


Michael Everson wrote:
 
> Frank, please state your preference, so we can be done
> with this.

After reading the follow-up thread:  The UNIQUE concept
wrt descriptions is clear, and I support it.

But the form of language tags used for years in Internet
protocols is based on alpha2 OR alpha3 language codes as
found in ISO 639-1 or 639-2, plus the ISO 3166-1 country
codes (until RFC 3066).
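
For illustration only, the classic tag shape described above
can be approximated with a small Python check.  This is a
simplified sketch: the pattern and the function name
is_classic_tag are my assumptions, it only tests the shape
(it does not look codes up in ISO 639 or ISO 3166-1), and it
ignores "i-" and "x-" tags and longer subtags.

```python
import re

# Hypothetical, simplified pattern: a 2 or 3 letter language code,
# optionally followed by a 2 letter country code.
CLASSIC_TAG = re.compile(r"^[A-Za-z]{2,3}(-[A-Za-z]{2})?$")

def is_classic_tag(tag):
    """Return True if tag has the alpha2/alpha3[-country] shape."""
    return CLASSIC_TAG.match(tag) is not None
```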

RFC 4646 fixed an ugly problem with ISO 3166-1 stability
(using the UN codes as escape hatch), and added scripts
in a backwards compatible way (using the Suppress-Script
kludge).

What you are talking about is a *new* problem if or when
4646bis in essence replaces ISO 639-1/2 by ISO 639-3,
again in a backwards compatible way, hopefully.  But all
existing tags are based on ISO 639-1/2.

Therefore "original" descriptions should be preserved in
the registry somehow, wherever the difference is not an
obvious one like adding "(macrolanguage)".

One way to add any non-trivial "original" description is
just copying it.  If that would result in a conflict wrt
UNIQUE, another way is to note the "original" description
in a comment.
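
The two options above could be sketched roughly as follows.
Everything here is hypothetical: the record layout (a dict
with "Description" and "Comments" lists), the set of already
used descriptions, and the function name are assumptions for
illustration, not something RFC 4646 or 4646bis defines.

```python
def add_original_description(record, original, used_descriptions):
    """Preserve an "original" ISO 639-1/2 description in a record.

    record: dict with "Description" and "Comments" lists
            (hypothetical layout).
    used_descriptions: set of descriptions already present somewhere
            in the registry, used for the UNIQUE check.
    """
    if original in record["Description"]:
        # Nothing to do, the record already carries this description.
        return "already-present"
    if original not in used_descriptions:
        # No UNIQUE conflict: just copy the description.
        record["Description"].append(original)
        used_descriptions.add(original)
        return "copied"
    # Copying would violate UNIQUE: note it in a comment instead.
    record["Comments"].append('Original description: "%s"' % original)
    return "commented"
```

The return value only reports which path was taken; a real
registry tool would presumably need human review in the
"commented" case.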

IMO decreeing that RFC 4646 and older tags are obsoleted
by whatever ISO 639-3 says is not an option.  There will be
applications limiting themselves to RFC 4646 languages,
all ISO 639-1/2 warts included.  Interpreting such tags
as defined in ISO 639-3 can result in unclear gibberish.

The registry should offer a hint where that can happen.
Users should not be forced to compare source standards
to figure such oddities out.  The hint can be condensed
in a "(*)" or similar, if 4646bis defines "(*)" to be a
standard "there be dragons" indicator.  Users could then
look up the technical details on SIL or Wikipedia pages.
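
As a rough sketch of how such a hint could be attached when
a description is displayed: the marker string, the flag, and
the function name below are assumptions, and "(*)" would only
mean anything if 4646bis actually defined it.

```python
# Hypothetical marker; only meaningful if 4646bis defines it.
DRAGONS_MARKER = "(*)"

def display_description(description, differs_in_639_3):
    """Append the marker when the ISO 639-3 meaning differs.

    differs_in_639_3: True if interpreting the tag per ISO 639-3
    would differ from its ISO 639-1/2 meaning (how that is decided
    is outside this sketch).
    """
    if differs_in_639_3 and not description.endswith(DRAGONS_MARKER):
        return description + " " + DRAGONS_MARKER
    return description
```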

 Frank


