LANGUAGE SUBTAG MODIFICATION REQUEST - zxx
doug at ewellic.org
Sat Apr 26 04:03:13 CEST 2008
CE Whitehead <cewcathar at hotmail dot com> wrote:
> Writing or not, the ASCII characters in programmatic languages are a
> subset of Latin-1, and are used to encode programming keywords based
> on English terms (while, for, if, then, else) as well as a few
> mathematical symbols. Perhaps there needs to be a comment somewhere
> stating that [Latn] is not to be used to identify ASCII character
> encoding--perhaps a comments field added to the [Latn] subtag??
Addison's comment, about not tagging ASCII-encoded data as if it were
Latin-script linguistic content, reflects a general principle of
language tagging -- use only those subtags that are needed to indicate
the nature of the content. This principle does not need to be captured
in comments in the Registry.
The "problem" such a comment would "solve" is not limited to the Latin
script. Programming languages (such as APL) and transfer encodings
(such as Markus Scherer's "base16k") that predominantly use
characters outside the Basic Latin range do exist. But it would be
silly to tag APL source code as "zxx-Zsym" or base16k-encoded data as
"zxx-Hani". And more to the point, it would be silly to load up the
Registry with warnings not to do silly things:
Comments: Don't use this subtag for the reversed R in "Toys 'R' Us."
This is not about assisting users in the use of a particular script
subtag. It is about teaching users how to apply BCP 47, and for that,
they really need to read BCP 47. Yes, I know it's long -- very long --
and that is why I also suggest that articles be written on langtag.net
that explain and paraphrase BCP 47. But the Registry is not the place
for that.
> (Do not consider this request too seriously
Sorry, if it's a request to put things in the Registry that don't belong
there, I'm going to take it seriously.
Doug Ewell * Arvada, Colorado, USA * RFC 4645 * UTN #14