Alemanic & Swiss German
prosfilaes at gmail.com
Wed Dec 6 11:32:07 CET 2006
On 12/6/06, Gerard Meijssen <gerardm at wiktionaryz.org> wrote:
> Please understand, I am speaking at this moment very much from outside
> the IETF and have been clear that I speak from *my* perspective. The
> Google presentation gives you a reality check; only 15% of the content
> is tagged and often incorrectly. This means to me that the codes are not
> understood / adhered to.
15% of what content? Web content? Web content is an unmitigated mess
created by a huge mass of people, most of whom are complete amateurs.
No standard other than IE _and_ Netscape both refusing to render the
document is generally accepted across the web. There are many other
uses by many other users, which are much more careful and responsible
about their tagging.
There's no standard that's going to search out all those people who tag
Japanese as jp or jpn or, more clearly clueless about the standard,
Japanese or Nippon, and get them to fix their ways. Breaking the
standard in a non-backward-compatible way is only going to change
the behavior of the people who were correct in the first place.
> Well, I can remember discussions where people insist zh being a language
> while it clearly is not from a linguistic point of view.
zh is Chinese. Chinese describes a language to most people. The
problem is in the latter, not the former.
> When you only work inside what the Standard supports there is no
> apparent problem. The problems starts when a Standard does not support a
RFC 4646 has great procedures to register new languages, that involve
posting a message to ietf-languages. Since you've never tried to
register a new language on ietf-languages, I'm a little skeptical of
your opinion that there's a problem. Furthermore, RFC 4646 offers a
wide array of private use options that I suspect will suffice in most
Project Gutenberg uses ISO-639-2 tags in our bibliographic catalog,
and manages to use them properly. One of the advantages of RFC 4646 is
that it would be trivial to start using RFC 4646 tags if we felt it
necessary. However, out of approximately two thousand volumes, there
are two that we don't have categorized down to a language: a book with
Mutsun texts, and a book in Quiché. For our purposes, a tenth of a
percent of our volumes roughly categorized is good enough. However,
were we to be concerned, RFC 4646 gives us the option of labeling the
Mutsun text as nai-x-mutsun, or applying to make nai-mutsun legal,
which I'm sure Everson would approve in a couple days.
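
To make the difference between those two forms concrete, here is a minimal Python sketch. It assumes only the RFC 4646 convention that the singleton "x" introduces private-use subtags; the function name split_private_use is hypothetical, and this is not a full RFC 4646 parser (it does no registry lookup and no length or character checks).

```python
def split_private_use(tag):
    """Split a language tag into (ordinary subtags, private-use subtags).

    Simplified illustration: everything after the first "x" singleton is
    treated as private use; nothing is validated against the registry.
    """
    subtags = tag.lower().split("-")
    if "x" in subtags:
        i = subtags.index("x")
        return subtags[:i], subtags[i + 1:]
    return subtags, []


# Private-use form: usable today without any registration.
print(split_private_use("nai-x-mutsun"))  # (['nai'], ['mutsun'])

# Registered-variant form: "mutsun" would have to be in the registry first.
print(split_private_use("nai-mutsun"))    # (['nai', 'mutsun'], [])
```

In the first tag only "nai" needs to mean anything to software that doesn't know about Mutsun; in the second, "mutsun" is an ordinary subtag and is only legal once registered.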