deprecating www as language code
"Martin J. Dürst"
duerst at it.aoyama.ac.jp
Fri Apr 8 10:59:50 CEST 2011
On 2011/04/08 16:03, Stephane Bortzmeyer wrote:
> On Thu, Apr 07, 2011 at 03:23:51PM -0500,
> ISO639-3<iso639-3 at sil.org> wrote
> a message of 21 lines which said:
>
>> As more and more websites use the ISO 639 language codes, there are
>> some problems with certain codes when they are used in URLs. For
>> example, the code 'www', which is used for the Wawa language of
>> Cameroon, is problematic because when it is used as subdomain, it
>> obviously conflicts with the root domain.
>
> I understand the rationale (although the root cause is that HTTP,
> stupidly, does not use SRV records and instead rely on the 'www'
> convention),
Well, there is nothing in HTTP itself that adds the 'www', or that says
that a domain name for a Web site has to start with www (many don't).
<technical excursion>
Some (if not most) browsers prefix a 'www' to a domain name if they
can't find an HTTP server at the original domain name (this, as well as
the default postfix(es), such as .com, are usually configurable). Also,
many servers redirect a request to a 'naked' domain name to one with
'www' prefixed (e.g. amazon.com -> www.amazon.com), but that's purely
the server's choice.
Purely using a 'naked' domain name for Web traffic may be a bit more
complex to set up depending on what else that naked domain is also
supposed to serve, but in low volume situations, everything is on the
same machine anyway, and in high volume situations, load balancers and
stuff are needed anyway. So SRV records would be cleaner technically,
but the main reason 'www' is so widespread is that it helps *people*
understand it's a Web site.
</technical excursion>
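To illustrate the point that redirecting a 'naked' domain to its 'www' form is purely the server's choice, here is a minimal sketch. The function and class names are hypothetical, and it assumes a server that hosts a single site over plain HTTP; it is not how any particular site does it.

```python
from http.server import BaseHTTPRequestHandler
from typing import Optional


def www_redirect_location(host: str, path: str) -> Optional[str]:
    """Return the redirect URL for a naked host, or None if no redirect applies.

    Assumption: the server hosts exactly one site over plain HTTP, so any
    host name that does not already start with 'www.' is treated as naked.
    """
    hostname = host.split(":", 1)[0].lower()  # drop an optional ':port'
    if not hostname or hostname.startswith("www."):
        return None
    return f"http://www.{hostname}{path}"


class NakedToWwwHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: redirect naked-domain requests, serve the rest."""

    def do_GET(self):
        location = www_redirect_location(self.headers.get("Host", ""), self.path)
        if location is None:
            # The request already targets the 'www' host: serve content.
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.end_headers()
            self.wfile.write(b"served from the www host\n")
        else:
            # The server, not HTTP, decides to add the 'www' prefix.
            self.send_response(301)
            self.send_header("Location", location)
            self.end_headers()
```

Nothing in the protocol requires the redirect branch; dropping it would serve the site directly from the naked domain. The handler could be tried out locally with http.server.HTTPServer.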
> but remember that, even if ISO 639 changes the code, RFC
> 5646 rules (section 3.4) will still make the code live forever (even
> if it is with "Deprecated: 2011-04-08" and "Preferred-Value: wwx").
Yes. Please also remember that there are Web sites that serve
(more-or-less) parallel documents in different languages. In some
setups, these languages are distinguished by an extension, e.g. .pl for
Polish. It so happens that .pl also stands for Perl, a frequently used
programming language on Web sites. Yet nobody (at least nobody I know of)
has asked for Polish to be assigned a different two-letter code.
Server admins either use a different setup or use a different extension
for one or the other language.
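Coming back to the RFC 5646 point quoted above, here is a rough sketch of how software might follow a "Preferred-Value" field for a deprecated subtag. The registry text below is hypothetical: the 'www' record simply reuses the "Deprecated" and "Preferred-Value" values from the quoted example and is not an actual registry entry. The parser also ignores folded lines and repeated fields, which the real registry format allows.

```python
from typing import Dict, List

# Hypothetical records in the registry's record-jar layout. The 'www' record
# reuses the example values quoted above; it is NOT a real registry entry.
HYPOTHETICAL_REGISTRY = """\
%%
Type: language
Subtag: www
Description: Wawa
Deprecated: 2011-04-08
Preferred-Value: wwx
%%
Type: language
Subtag: pl
Description: Polish
"""


def parse_records(text: str) -> List[Dict[str, str]]:
    """Split '%%'-separated records into field-name -> value dictionaries.

    Simplification: folded continuation lines and repeated field names
    are not handled.
    """
    records: List[Dict[str, str]] = []
    current: Dict[str, str] = {}
    for line in text.splitlines():
        if line.strip() == "%%":
            if current:
                records.append(current)
            current = {}
        elif ":" in line:
            name, value = line.split(":", 1)
            current[name.strip()] = value.strip()
    if current:
        records.append(current)
    return records


def preferred_subtag(subtag: str, records: List[Dict[str, str]]) -> str:
    """Return the Preferred-Value for a subtag if one exists, else the subtag."""
    wanted = subtag.lower()
    for record in records:
        if record.get("Subtag") == wanted:
            return record.get("Preferred-Value", wanted)
    return wanted
```

The deprecated record stays in the data and keeps resolving, which is the sense in which the old code would live forever.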
As for Wikipedia, claiming that the code 'www' for Wawa is a problem
seems to be just an attempt to dump the problem on somebody else. There are at
least two solutions for Wikipedia that don't require a code change:
1) Use www.wikipedia.org for Wawa, and wikipedia.org for the main site.
On www.wikipedia.org, add a link saying that the main site is at
wikipedia.org. The problem with this solution may be that there are tons
of links out there to www.wikipedia.org (or some pages therein), which
would change.
2) Choose whatever domain prefix Wikipedia pleases for Wawa, e.g. 'wawa'. The
Wawa site would then be wawa.wikipedia.org. I guess it would be way
easier to understand for most people. The problem with this is that the
prefix<->language code correspondence may have been hardcoded on the
Wikipedia servers, but that can be fixed rather easily by a decent
programmer.
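As a rough sketch of what the fix for solution 2 could look like, the prefix<->language code correspondence can go through a small override table instead of being hardcoded as the identity. All names here are hypothetical, the reserved set other than 'www' is an assumption, and this is not how the Wikipedia servers are actually implemented.

```python
from typing import Optional

# Assumption: prefixes the site wants to keep for non-language purposes.
# Only 'www' comes from the discussion; the others are illustrative.
RESERVED_PREFIXES = {"www", "mail", "ftp"}

# Hypothetical override table: language code -> subdomain prefix.
PREFIX_OVERRIDES = {"www": "wawa"}

# Reverse lookup: subdomain prefix -> language code.
CODE_FOR_PREFIX = {prefix: code for code, prefix in PREFIX_OVERRIDES.items()}


def subdomain_for(language_code: str, base_domain: str = "wikipedia.org") -> str:
    """Map a language code to a host name, applying overrides for clashes."""
    code = language_code.lower()
    prefix = PREFIX_OVERRIDES.get(code, code)
    if prefix in RESERVED_PREFIXES:
        raise ValueError(f"language code {code!r} needs an override")
    return f"{prefix}.{base_domain}"


def language_for(host: str, base_domain: str = "wikipedia.org") -> Optional[str]:
    """Map a host name back to a language code, or None for non-language hosts."""
    host = host.lower()
    suffix = "." + base_domain
    if not host.endswith(suffix):
        return None  # naked domain or a foreign host
    prefix = host[: -len(suffix)]
    if prefix in CODE_FOR_PREFIX:
        return CODE_FOR_PREFIX[prefix]
    if prefix in RESERVED_PREFIXES:
        return None  # e.g. www.wikipedia.org stays the main site
    return prefix
```

With this, subdomain_for("www") gives wawa.wikipedia.org while www.wikipedia.org keeps its current meaning, and any later clash only needs another entry in the override table.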
So my conclusion would be to tell Wikipedia to get their act together
and choose something like wawa.wikipedia.org for Wawa.
This is highly preferable to changing language codes, because it avoids
any followup requests (www is by far the most famous domain name prefix,
but there are others in wide or not so wide use that may clash with
language codes).
Regards, Martin.