RFC 3066bis: Philosophical objection (harsh)

Harald Tveit Alvestrand harald at alvestrand.no
Tue Dec 9 07:20:52 CET 2003


I do NOT agree with using a liberal generative syntax for language tags. 
I believe we should stick with whole-tag registration, and stick to simple 
rules and guidelines for it, aimed at having, as far as possible, only 
useful tags.


I think the use of language tags where the sender is free to choose from 
multiple rules and generate subtags at will is harmful to interoperability 
and harmful to the end-user.

I believe that the job of the language tags is to register all variants for 
which there is a known need to express the distinction as a language tag, 
and for which there is a real reason why more powerful means of expressing 
the user's preferences or the properties of data are not appropriate.
Therefore, a system with fewer language tags is better than one with more 
language tags.

I think, in particular, that:

- productive use of script codes hurts the current use of language tags, 
creates potential for harmful confusion for the users, and is therefore a 
Bad Idea.
Requiring recipients to match en-Latn-US to en-US is wrong.

- the productive use of years is a dangerous source of confusion, and year 
markings without an IANA registration to point out what they are supposed 
to mean make things easy for the sender at the expense of the recipient, 
which is not a reasonable tradeoff.
Requiring recipients to know whether de-1900 and de-1905 can be considered 
equal or not, with no further publicly available information, is wrong.

- the use of unregistered, undefined name-value pairs in the extension 
subtag is a dangerously complex and noninteroperable solution to a still 
unidentified problem, and further harms interoperability with systems that 
depend on the non-occurrence of the = character inside language tags.
Requiring users who have written code to parse "lang=en" to also parse 
"lang=en-Latn-US-x-undefined=even%20more%20undefined" is wrong.

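The two recipient-side concerns above (en-Latn-US vs. en-US, and "=" inside 
a tag) can be illustrated with a minimal Python sketch. It assumes the 
prefix-matching rule of RFC 3066, where a range matches a tag if it equals 
the tag or is a prefix of it followed by "-". The names prefix_match and 
parse_lang_param are hypothetical, and the naive parser only stands in for 
the kind of existing code described above, not for any particular 
implementation.

```python
def prefix_match(language_range: str, tag: str) -> bool:
    """RFC 3066-style matching (assumed rule): the range matches if it
    equals the tag, or is a prefix of the tag followed by '-'."""
    r = language_range.lower()
    t = tag.lower()
    return t == r or t.startswith(r + "-")


def parse_lang_param(param: str) -> str:
    """Hypothetical naive parser that assumes exactly one '=' in the
    parameter, i.e. that '=' never occurs inside the language tag."""
    name, value = param.split("=")  # ValueError if there is more than one '='
    if name != "lang":
        raise ValueError("not a lang parameter")
    return value


if __name__ == "__main__":
    # A recipient asking for en-US does not get en-Latn-US by prefix matching.
    print(prefix_match("en-US", "en-US"))
    print(prefix_match("en-US", "en-Latn-US"))
    print(prefix_match("en", "en-Latn-US"))

    print(parse_lang_param("lang=en"))
    try:
        parse_lang_param("lang=en-Latn-US-x-undefined=even%20more%20undefined")
    except ValueError as exc:
        print("naive parser failed:", exc)
```

Under these assumptions, a plain prefix matcher treats en-US and en-Latn-US 
as unrelated, so matching them requires extra recipient-side logic, and a 
parser written for "lang=en" breaks on the second "=".
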
Having thus harshly denounced 90% of the ideas in this document, I'll end 
this note with a saving grace:

I think the idea of the -x- subtag for separating a registered tag from 
unregistered variants makes sense under the current rules for Accept-Range, 
and should be adopted.

