Doug Ewell doug at
Sun Apr 27 20:30:15 CEST 2008

CE Whitehead <cewcathar at hotmail dot com> wrote:

> Agreed, use only those subtags that are needed to indicate the nature 
> of the content; but someone might think that the script used to write 
> some programmatic language (such as JavaScript, C++, 
> etc)--mathematical symbols + Latin--should be indicated--that is, is part 
> of the nature of the content.

What good would it do to group JavaScript and C++ (and Lisp and COBOL) 
together by tagging them all "zxx-Latn", and differentiate them from APL 
by tagging the latter "zxx-Zsym"?
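
To make the grouping concrete, here is a minimal Python sketch of what 
tagging by script alone would tell a consumer.  The split_tag helper and 
the "proposed" table are hypothetical, written only for this 
illustration; the helper handles just the simple "language-Script" shape 
and is not a full BCP 47 parser.

```python
from collections import defaultdict


def split_tag(tag):
    """Split a simple 'language' or 'language-Script' tag.

    Hypothetical helper: it only recognizes a 4-letter script subtag in
    second position and ignores every other BCP 47 subtag type.
    """
    subtags = tag.split("-")
    language = subtags[0].lower()
    script = None
    if len(subtags) > 1 and len(subtags[1]) == 4 and subtags[1].isalpha():
        script = subtags[1].title()
    return language, script


# Assumed tagging, following the scheme suggested in the quoted message.
proposed = {
    "JavaScript": "zxx-Latn",
    "C++": "zxx-Latn",
    "Lisp": "zxx-Latn",
    "COBOL": "zxx-Latn",
    "APL": "zxx-Zsym",
}

# Group the programming languages by the (language, script) pair in their tag.
groups = defaultdict(list)
for name, tag in proposed.items():
    groups[split_tag(tag)].append(name)

for key, names in groups.items():
    print(key, names)
```

Under these assumptions the tag separates APL from everything else but 
cannot tell JavaScript from COBOL, which is the distinction a reader of 
source code would actually care about.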

The nature of the content of source code is the programming language in 
which it is written.  If you feel this needs to be captured in a BCP 47 
language tag, some possibilities are:


Note that the last choice involves the extension mechanism, which 
involves a lot of up-front work and which nobody seems to like.

> To the extent that there are programmatic languages written using 
> characters outside the Latin range, then it's even more worthwhile to 
> tag the encoding--although I grant that this can be done with a 
> declaration of encoding.

As a reminder, BCP 47 language tags have nothing to do with encoding.  I 
assume you mean "character repertoire," but script subtags make no 
promises about the character repertoire either; see RFC 4646, Section 6, 
paragraph 4.

Doug Ewell  *  Arvada, Colorado, USA  *  RFC 4645  *  UTN #14
