Script codes in RFC 3066

Martin Duerst duerst at
Thu Apr 10 17:03:23 CEST 2003

At 04:10 03/04/10 -0400, Tex Texin wrote:

>Martin Duerst wrote:
> >    One particular concern I have is that once there is a productive
> >    pattern, the assumption that all the slots have to be filled in
> >    seems to spread in an uncontrolled way. I have seen numerous examples
> >    of tags such as 'ja-jp', which in particular as far as language goes,
> >    doesn't give more information than simply 'ja'.
>While true, it doesn't hurt

That by itself may not hurt too much. But I have seen a lot of
other combinations that are utter nonsense, and questions from
people that seem to suggest that they somehow believe that
there is something serious behind this nonsense.

>and provides some protection in case Japanese
>becomes used heavily in another region.

Let's start to worry about that when that happens.

> >    Another point is that while something like az-latn/az-Cyrl is very
> >    good for language negotiation (e.g. HTTP Accept-Language/
> >    Content-Language headers), it is really enough to mark up the
> >    actual text (e.g. with xml:lang) with 'az' only, because the
> >    script is self-evident from the characters used.
>Although true, why require the script to be examined to make decisions about
>the thing being tagged?

Because, as I explained in another message, examining the actual
characters is orders of magnitude more reliable than trusting a tag.
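
To illustrate what "examining the actual characters" could look like, here
is a minimal sketch in Python. It is not taken from any specification: the
function name dominant_script is hypothetical, and using the first word of
each character's Unicode name is only a rough stand-in for the real Unicode
Script property.

```python
import unicodedata
from collections import Counter


def dominant_script(text):
    """Guess the dominant script of `text` from Unicode character names.

    Assumption: for letters, the first word of the Unicode character name
    (e.g. 'LATIN', 'CYRILLIC', 'CJK') approximates the script. This is a
    heuristic, not the Unicode Script property.
    """
    counts = Counter()
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, punctuation, and whitespace
        name = unicodedata.name(ch, "")
        if name:
            counts[name.split()[0]] += 1
    if not counts:
        return None  # no letters, so nothing can be inferred
    return counts.most_common(1)[0][0]
```

Under these assumptions, Azerbaijani text written with Latin letters and the
same text written with Cyrillic letters come out as different scripts
without any help from the tag, so xml:lang="az" would lose no information.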

>I could be filtering the text, routing the text, or performing a number of
>different operations based on the language-script...

I included 'Content-Language', which means that I agree that
external tagging can make sense, to avoid having to look inside
a document. But once you are inside the document (xml:lang), the
characters are right there, and tagging the script no longer makes sense.

>I think a policy of being as specific as possible when tagging makes sense.

I strongly disagree. If we ever get to update RFC 3066, and the update
includes script tags, then we clearly need to say that script tags should
be used only to indicate an unusual script, and only where the script is
otherwise not easily derivable.
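
The rule proposed above can be sketched as a small decision function.
Everything here is an assumption made for illustration: the name
tag_for_markup, the USUAL_SCRIPT table and its entries, and the choice to
treat a language with no table entry as having no usual script.

```python
# Illustrative table only; a real one would need careful curation.
USUAL_SCRIPT = {"ja": "Jpan", "ru": "Cyrl", "en": "Latn"}


def tag_for_markup(language, script=None, derivable_from_text=True):
    """Return a tag, adding the script only when the proposed rule allows it.

    The script is added only if it is unusual for the language AND it
    cannot easily be derived from the tagged text itself.
    """
    if script is None:
        return language
    unusual = USUAL_SCRIPT.get(language) != script
    if unusual and not derivable_from_text:
        return f"{language}-{script}"
    return language
```

With this sketch, tag_for_markup("az", "Cyrl") yields plain "az" for marked-up
text, while passing derivable_from_text=False (for example, an external
header on a resource that has not been inspected) yields "az-Cyrl".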

Regards,   Martin.

More information about the Ietf-languages mailing list