Retired 639-3 codes

Michael(tm) Smith mike at w3.org
Sat Dec 12 04:44:10 CET 2009


Kent Karlsson <kent.karlsson14 at comhem.se>, 2009-12-11 13:19 +0100:

> As for Wikipedia, the conformance to IETF language tags for
> Wikipedia "labels" is far from complete.

I think the Wikimedia Foundation Language Committee could benefit
from having a knowledgeable person in the i18n standards community
take some time to talk with them and give them some guidance in
this area.

  http://meta.wikimedia.org/wiki/Language_committee

I could try to help them more myself, but at this point it'd be
kind of a case of the blind leading the blind.

As far as how to contact them, they do seem to have a mailing list:

  https://lists.wikimedia.org/mailman/listinfo/langcom-l

...though I find that the archives are not public, and joining the
list requires moderator approval.

I did find this related ietf-languages thread from a couple years
back:

  http://www.alvestrand.no/pipermail/ietf-languages/2006-November/thread.html#5185

Thread initiated by Gerard Meijssen from Wiktionaryz.org. It seems
like the feedback he got from the list at that time was that they
should not be using "map-bms", etc., but for whatever reason it
seems like they didn't take the advice.

> For instance:
> simple, bat-sng, roa-tara, roa-rup, fiu-vro, map-bms, zh-classical,
> and cbk-zam aren't IANA language tags. Here I'm just picking those
> that stand out clearly.

They are aware of a number of those and they maintain a list here:

  http://meta.wikimedia.org/wiki/Language_code#Subdomains_that_do_not_conform_valid_ISO_639_language_code

And here:

  http://en.wiktionary.org/wiki/Wiktionary:Wikimedia_language_codes

...where it says, "However, not all languages are coded by ISO.
When new language editions of Wikimedia projects are introduced
and there are no ISO codes for them, WMF Language Committee
creates their own language codes." Which obviously is not a sound
policy for them to be using.

Anyway, on the http://wikipedia.org/ home page at least, instances
of any of those that were in "lang" attribute values on that page
were changed a couple of days ago to, e.g., "map-x-bms".

I know that's not the right solution for the long term, but they
were changed as a result of a discussion that I had on IRC with a
Wikimedia developer who was working on trying to make sure that
the Wikipedia home page would validate:

  http://krijnhoetmer.nl/irc-logs/whatwg/20091210#l-492

I suggested that the right solution would be for Wikimedia to go
through the registration procedure for registering actual new tags
for all the cases of existing Wikipedias that don't have
corresponding language subtags in the IANA registry, but I also
pointed out that using "map-x-bms", etc., for the lang values in
the meantime would prevent them from being reported as errors.
Maybe that's not even an appropriate solution for the short term,
but I didn't know what else to suggest to him -- and now in
reading through the archives, I see that's what Doug Ewell seems
to have suggested in his reply on the thread I cited earlier:

  http://www.alvestrand.no/pipermail/ietf-languages/2006-November/005199.html
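
To make the interim workaround concrete, here is a rough sketch (not
anything Wikimedia actually runs) of moving everything after the first
subtag of a nonconforming code into a private-use section, plus a
deliberately simplified syntax check. The function names are
hypothetical, and the pattern only covers the "language-x-private"
shape under discussion; it ignores script, region, variant, and
extension subtags and does not consult the IANA registry at all.

```python
import re

# Simplified shape: a 2-3 letter primary language subtag, then the
# private-use singleton "x", then one or more subtags of 1-8
# alphanumerics. This is an assumption-laden subset of the full
# language tag grammar, not a complete well-formedness check.
_LANG_PLUS_PRIVATE = re.compile(r"^[a-z]{2,3}-x(-[a-z0-9]{1,8})+$", re.IGNORECASE)


def to_private_use(code):
    """Hypothetical helper: turn 'map-bms' into 'map-x-bms'."""
    head, sep, rest = code.partition("-")
    if not sep or not rest:
        # Codes such as 'simple' have nothing to move behind "x".
        raise ValueError("no trailing subtag to move into private use: %r" % code)
    return "%s-x-%s" % (head, rest)


def has_lang_plus_private_shape(tag):
    """True if the tag matches the simplified language-x-private shape."""
    return bool(_LANG_PLUS_PRIVATE.match(tag))


if __name__ == "__main__":
    for code in ("map-bms", "roa-tara", "fiu-vro", "cbk-zam", "zh-classical"):
        tag = to_private_use(code)
        print(code, "->", tag, has_lang_plus_private_shape(tag))
```

Note that the mechanical rewrite is not enough on its own for
"zh-classical": "classical" is nine characters long, and private-use
subtags are limited to eight, so "zh-x-classical" still fails the
syntax check and would need a shorter private-use subtag.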

> Another example is that "arc" (639: Imperial Aramaic, used 700-300 BCE)
> is used by Wikipedia for Assyrian Neo-Aramaic ("aii" with macrolanguage
> "syr" in 639-3). I'm sure there are more oddities.

I'm sure there are too. Which is why I think it would help the
Wikimedia Foundation Language Committee a lot if someone with
more insight and experience than me were to take some time to give
them more guidance about this.

  --Mike

-- 
Michael(tm) Smith
http://people.w3.org/mike/

