deprecating www as language code

"Martin J. Dürst" duerst at it.aoyama.ac.jp
Fri Apr 8 10:59:50 CEST 2011


On 2011/04/08 16:03, Stephane Bortzmeyer wrote:
> On Thu, Apr 07, 2011 at 03:23:51PM -0500,
>   ISO639-3<iso639-3 at sil.org>  wrote
>   a message of 21 lines which said:
>
>> As more and more websites use the ISO 639 language codes, there are
>> some problems with certain codes when they are used in URLs. For
>> example, the code 'www', which is used for the Wawa language of
>> Cameroon, is problematic because when it is used as subdomain, it
>> obviously conflicts with the root domain.
>
> I understand the rationale (although the root cause is that HTTP,
> stupidly, does not use SRV records and instead relies on the 'www'
> convention),

Well, there is nothing in HTTP itself that adds the 'www', or that says 
that a domain name for a Web site has to start with www (many don't).

<technical excursion>
Some (if not most) browsers prefix a 'www' to a domain name if they 
can't find an HTTP server at the original domain name (this, as well as 
the default postfix(es), such as .com, are usually configurable). Also, 
many servers redirect a request to a 'naked' domain name to one with 
'www' prefixed (e.g. amazon.com -> www.amazon.com), but that's purely 
the server's choice.

Purely using a 'naked' domain name for Web traffic may be a bit more 
complex to set up depending on what else that naked domain is also 
supposed to serve, but in low volume situations, everything is on the 
same machine anyway, and in high volume situations, load balancers and 
stuff are needed anyway. So SRV records would be cleaner technically, 
but the main reason 'www' is so widespread is that it helps *people* 
understand it's a Web site.
</technical excursion>
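
To make the browser behaviour described in the excursion concrete, here is a minimal sketch of the list of host names a client might try in order. The function name candidate_hosts and its parameters are made up for this illustration, and the exact fallback order is an assumption; real browsers differ and make this configurable.

```python
def candidate_hosts(name, prefix="www", suffixes=(".com",)):
    """Return the host names a client might try, in order.

    Assumption for illustration only: the name as typed is tried first.
    A dotted name then gets the prefix added; a bare word gets both the
    prefix and each default suffix added.
    """
    candidates = [name]
    if "." in name:
        # Already looks like a domain name: only add the prefix if missing.
        if not name.startswith(prefix + "."):
            candidates.append(f"{prefix}.{name}")
    else:
        # Bare word: guess prefix plus each configured default suffix.
        for suffix in suffixes:
            candidates.append(f"{prefix}.{name}{suffix}")
    return candidates


if __name__ == "__main__":
    for typed in ("example.org", "example", "www.example.org"):
        print(typed, "->", candidate_hosts(typed))
```

Nothing in this sketch is part of HTTP; it is purely client-side guessing, which is the point made above.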


> but remember that, even if ISO 639 changes the code, RFC
> 5646 rules (section 3.4) will still make the code live forever (even
> if it is with "Deprecated: 2011-04-08" and "Preferred-Value: wwx").

Yes. Please also remember that there are Web sites that serve 
(more-or-less) parallel documents in different languages. In some 
setups, these languages are distinguished by an extension, e.g. .pl for 
Polish. It so happens that .pl also stands for Perl, a frequently used 
programming language on Web sites. Yet nobody (at least not that I know of) 
has yet asked for Polish to be assigned a different two-letter code. 
Server admins either use a different setup or use a different extension 
for one or the other language.

As for Wikipedia, claiming that the code 'www' for Wawa is a problem 
seems to just try to dump the problem on somebody else. There are at 
least two solutions for Wikipedia that don't require a code change:

1) Use www.wikipedia.org for Wawa, and wikipedia.org for the main site. 
On www.wikipedia.org, add a link saying that the main site is at 
wikipedia.org. The problem with this solution may be that there are tons 
of links out there to www.wikipedia.org (or some pages therein), which 
would change.

2) Choose a domain prefix for Wawa as Wikipedia pleases. E.g. wawa. The 
Wawa site would then be wawa.wikipedia.org. I guess it would be way 
easier to understand for most people. The problem with this is that the 
prefix<->language code correspondence may have been hardcoded on the 
Wikipedia servers, but that can be fixed rather easily by a decent 
programmer.
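
As a rough sketch of what solution 2 could look like in code: keep the default rule that the prefix equals the language code, and add a small override table for the few codes that clash. All names here (PREFIX_OVERRIDES, host_for_language, language_for_host) are hypothetical; I have no knowledge of how the Wikipedia servers actually do this.

```python
# Hypothetical override table: language codes whose "natural" prefix
# clashes with a well-known host name are mapped to another prefix.
PREFIX_OVERRIDES = {"www": "wawa"}

# Reverse lookup, so a host prefix can be mapped back to a language code.
CODE_FOR_PREFIX = {prefix: code for code, prefix in PREFIX_OVERRIDES.items()}


def host_for_language(code, base_domain="wikipedia.org"):
    """Return the host name serving the given language code."""
    code = code.lower()
    prefix = PREFIX_OVERRIDES.get(code, code)
    return f"{prefix}.{base_domain}"


def language_for_host(host, base_domain="wikipedia.org"):
    """Return the language code for a host, or None for the main site."""
    host = host.lower()
    if host in (base_domain, f"www.{base_domain}"):
        return None  # main site, not a language edition
    if not host.endswith("." + base_domain):
        raise ValueError(f"{host!r} is not under {base_domain!r}")
    prefix = host[: -len(base_domain) - 1]
    return CODE_FOR_PREFIX.get(prefix, prefix)


if __name__ == "__main__":
    print(host_for_language("www"))                 # Wawa
    print(host_for_language("pl"))                  # Polish
    print(language_for_host("wawa.wikipedia.org"))
```

With a table like this, www.wikipedia.org keeps pointing at the main site, and any later clash with another prefix is one more table entry rather than a request to change a language code.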

So my conclusion would be to tell Wikipedia to get their act together 
and choose something like wawa.wikipedia.org for Wawa.

This is highly preferable to changing language codes, because it avoids 
any followup requests (www is by far the most famous domain name prefix, 
but there are others in wide or not so wide use that may clash with 
language codes).


Regards,    Martin.

