Doug Ewell dewell at
Sun May 2 23:32:37 CEST 2004

John Hudson <tiro at tiro dot com> wrote:

> In the code lists at the 4-letter
> script codes are shown capitalised, e.g. Arab not arab, Armn not armn,
> etc. Is this intentional? Should the codes always be capitalised? Does
> it matter if they are not?

The FDIS from February 2003 states that "The four-letter codes SHALL be
written with an initial capital Latin letter and final small Latin
letters" (emphasis mine).

Although this is a useful convention and aids readability, I would
suggest -- without any authority -- that it is only a convention and not
an absolute requirement for use of the codes, just as writing ISO 639
codes in all-lowercase and ISO 3166 codes in all-uppercase is only a
convention.

In particular, both RFC 3066 and its (in-progress) successor state that
language tags and subtags "are to be treated as case insensitive."  And,
with apologies for bringing up a sore point for many, Unicode recommends
converting all language tags, including subtags, to lowercase before
encoding them as Plane 14 language tag characters (TUS 4.0, section
15.10, page 405).
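
To make the two points above concrete, here is a minimal sketch in Python.
The function names (same_tag, to_plane14) are hypothetical and come from no
standard or library. The sketch assumes that tags are plain ASCII and that
each Plane 14 tag character sits at the ASCII code point plus 0xE0000,
preceded by U+E0001 LANGUAGE TAG.

```python
LANGUAGE_TAG = "\U000E0001"   # U+E0001 LANGUAGE TAG
TAG_OFFSET = 0xE0000          # tag character = ASCII code point + 0xE0000


def same_tag(a: str, b: str) -> bool:
    """Compare two language tags case-insensitively (hypothetical helper)."""
    return a.lower() == b.lower()


def to_plane14(tag: str) -> str:
    """Lowercase a tag, then map it to Plane 14 tag characters (hypothetical helper)."""
    lowered = tag.lower()
    # Only printable ASCII has a tag-character counterpart.
    if not all(0x20 <= ord(ch) <= 0x7E for ch in lowered):
        raise ValueError("tag must be printable ASCII")
    return LANGUAGE_TAG + "".join(chr(ord(ch) + TAG_OFFSET) for ch in lowered)
```

With this sketch, "sr-Latn" and "SR-LATN" compare equal, and both encode to
the same Plane 14 sequence because the tag is lowercased first.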

ISO 15924 alpha-4 codes are already distinguishable from ISO 639 and ISO
3166 codes, simply by virtue of being four letters long.
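
As a rough illustration of that length test, a sketch in Python follows.
The helper names are hypothetical, and the check looks only at the shape of
the code (four ASCII letters); it does not consult the actual ISO 15924
code list.

```python
def is_script_code(subtag: str) -> bool:
    """True if the subtag has the shape of an ISO 15924 alpha-4 code."""
    return len(subtag) == 4 and subtag.isascii() and subtag.isalpha()


def conventional_case(subtag: str) -> str:
    """Apply the initial-capital convention to four-letter codes only."""
    if is_script_code(subtag):
        return subtag.capitalize()   # "ARMN" or "armn" becomes "Armn"
    return subtag                    # other subtags are left untouched
```

Because the shape alone identifies the script code, a consumer can accept
"arab", "ARAB" or "Arab" on input and still emit the conventional "Arab".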

Michael, please let me know if I am off base on this.

-Doug Ewell
 Fullerton, California

