Alemannic & Swiss German

Mark Davis mark.davis at
Wed Dec 6 07:21:41 CET 2006

>  In a presentation of Google it
> was suggested that the coding of content with language codes is so
> unreliable that it is practically useless.

I suspect that this was an impression left by my presentation at the Unicode
conference. It is true that for web pages, the language tagging is pretty
minimal (about 15%) and too often incorrect to be relied upon. However, that
is far from saying that BCP 47 (RFC 4646) is useless. It provides a stable,
unambiguous identification system for communicating language information
between software components. Even with web pages, once the language of a web
page is heuristically determined (and any existing tag can help to break
ties), the language tag is used internally to communicate with any process
that needs to deal with that page. And there are many other uses of language
tags -- communicating the user's choice of UI language is an obvious one.
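
To illustrate the tie-breaking idea above, here is a minimal Python sketch.
It does not describe how any real system works: the function names, the
score format, and the 0.05 margin are all hypothetical, chosen only to show
a declared tag being consulted when the detector's top candidates are close.

```python
from typing import Mapping, Optional


def primary_subtag(tag: str) -> str:
    """Return the lowercased primary language subtag of a tag such as 'gsw-CH'."""
    return tag.replace("_", "-").split("-")[0].lower()


def choose_language(scores: Mapping[str, float],
                    declared: Optional[str] = None,
                    margin: float = 0.05) -> str:
    """Pick a language tag from detector scores (hypothetical format).

    The declared tag is only used to break near-ties: it wins when a
    candidate with the same primary subtag scores within `margin` of the
    best candidate. Otherwise the best-scoring candidate is returned.
    """
    if not scores:
        # Assumption: with no detector output, fall back to the declared
        # tag, or to "und" (undetermined) when there is none.
        return declared if declared else "und"

    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    best_tag, best_score = ranked[0]

    if declared:
        wanted = primary_subtag(declared)
        for tag, score in ranked:
            if best_score - score > margin:
                break  # remaining candidates are too far behind to count as a tie
            if primary_subtag(tag) == wanted:
                return tag

    return best_tag


# Made-up scores for a page the detector finds ambiguous.
scores = {"de": 0.48, "gsw": 0.46, "nl": 0.06}
print(choose_language(scores))                     # no declared tag
print(choose_language(scores, declared="gsw-CH"))  # declared tag breaks the near-tie
print(choose_language(scores, declared="nl"))      # declared tag too far behind to matter
```

Whatever tag comes out is then the value passed between components, which
is the role described for BCP 47 above.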

The key issue for web pages in particular is that their producers don't
immediately see much value in accurate tagging, because the consequences of
omission are not immediately apparent, and at this point at least, not that


On 12/5/06, Doug Ewell <dewell at> wrote:
> Gerard Meijssen <gerardm at wiktionaryz dot org> wrote:
> > From my perspective the RFC 4646 that is seemingly inevitable is
> > problematic because it only addresses how it wants to be backwards
> > compatible and is willing to sacrifice the easy understanding that a
> > single list would bring with a hybrid system. A hybrid system where it
> > is unclear to me how they want to link it to the ISO-639-3 content
> > with the argument that they are not willing to address it until the
> > standard is standard. From my perspective, when the ISO-639-3 is
> > finally ratified, this list of how to link needs to be there in a
> > finished form. By not having it at the time of ratification, it makes
> > the IANA codes less than credible. By not having a period where these
> > codes can be discussed, you will not have buy-in.
> I need help understanding this:
> 1.  The "backward compatibility" of RFC 4646 and 4646bis is a major
> design goal.  There are many systems that use RFC 3066 tags and breaking
> compatibility with it, in particular by replacing 2-letter subtags such
> as "nl" with the 3-letter equivalent "nld", is a non-starter.
> 2.  No RFC that references a draft standard or RFC can be approved and
> published.  The referenced standard must be an official standard.  That
> is one of the rules of the game.  ISO 639-3 is not yet an approved,
> official standard, therefore draft-4646bis is not yet eligible to become
> an RFC.  But we have certainly "addressed" ISO 639-3 and discussed it
> and its code elements in detail.  I don't see what the objection is.
> 3.  It should be very clear, from reading draft-4646bis, how it intends
> to incorporate the ISO 639-3 code elements.
> > Yes, you cannot have it both ways. :(  In a presentation of Google it
> > was suggested that the coding of content with language codes is so
> > unreliable that it is practically useless. This seems to suggest to me
> > that good marketing for the codes and clear benefits for using correct
> > codes is needed. To me this lack of the effectiveness of these codes
> > and the lack of good marketing makes the whole argument for backwards
> > compatibility increasingly weak.
> Many people are not using RFC 3066 correctly, therefore we should
> abandon backward compatibility and punish those who are using it
> correctly?
> > Making nationalistic issues the primary argument for what makes a
> > language ignores that many languages are spoken in many countries
> > which refutes the argument that it is for the single countries
> > involved to be the sole judge to decide on such languages.
> Is there a commentary on RFC 4646 or RFC 4646bis here?
> --
> Doug Ewell  *  Fullerton, California, USA  *  RFC 4645  *  UTN #14