<br><font size=2 face="sans-serif">The "zxx" tag started with
my query into how I should classify the "audio content" of a
silent film in a system designed to serve non-silent films where a language
code is required. Peter suggested "zxx = no linguistic content"
and registered it. </font>
<br>
<br><font size=2 face="sans-serif">I felt that it might be better to use
the industry terminology "silent" and employ a free tag in the
"Q" space of ISO 639-2. While there was "no linguistic content"
on that audio channel, there was certainly a plot that could be determined
from watching the film even if the title cards were removed (a "title
card" is an interstitial used to display the text in a silent film).
To describe our wonderful heritage of silent films as having no linguistic
content just seemed a bit cruel. I was willing to go with "not applicable"
but could not recommend the use of "zxx = no linguistic content"
for this purpose.</font>
<br>
<br><font size=2 face="sans-serif">When it was later suggested that "zxx"
should be used to mark up code fragments appearing in a tutorial written
in English, I was even more opposed to the "non-linguistic" semantic.
I wasn't the only one who complained that code -- especially in the context
of a technical tutorial -- is primarily meant to be read by humans, not
machines. An assistive device such as a Braille screenreader would want
to represent that text as language, not skip over it because it's non-linguistic
in nature. Binary junk data is the only thing I can think of that is truly
non-linguistic.</font>
<br>
<br><font size=2 face="sans-serif">Any chance we could broaden the semantic
of the "zxx" tag? I still think we did the wrong thing here, and
the "not applicable" tag is more appropriate for all the use
cases mentioned.</font>
<br>
<br><font size=2 face="sans-serif"> http://lists.w3.org/Archives/Public/www-international/2007AprJun/0187.html
-- one previous post on the topic</font>
<br>
<br><font size=2 face="sans-serif">Side note: I find the IETF archives
very hard to search or I could have produced a better example. Am I missing
a search interface somewhere? (Reply offlist.)</font>
<br>
<br><font size=2 face="sans-serif">Regards,</font>
<br>
<br><font size=2 face="sans-serif">Karen Broome</font>
<br>
<br><tt><font size=2>Peter Constable &lt;petercon@microsoft.com&gt; wrote
on 03/14/2008 01:37:30 PM:<br>
<br>
> If “zxx” were “not applicable”, I would not have any reservation
<br>
> about semantic overloading for the application scenarios I have in
<br>
> mind now. Funny, I really have no recollection of you suggesting <br>
> that at that time. (Sorry.)</font></tt>
<br><tt><font size=2>> </font></tt>
<br><tt><font size=2>> </font></tt>
<br><tt><font size=2>> Peter</font></tt>
<br><tt><font size=2>> </font></tt>
<br><tt><font size=2>> From: Karen_Broome@spe.sony.com [mailto:Karen_Broome@spe.sony.com]
<br>
> Sent: Friday, March 14, 2008 12:51 PM<br>
> To: Peter Constable<br>
> Cc: ietf-languages@iana.org<br>
> Subject: RE: ID for language-invariant strings</font></tt>
<br><tt><font size=2>> </font></tt>
<br><tt><font size=2>> <br>
> I can keep restating the point I've made from the beginning. The <br>
> semantic for "zxx" should have been defined as "not
applicable" <br>
> which was the use case presented at the time it was created. Since
<br>
> it was not expressed in this way, now we need another tag, I think.
<br>
> <br>
> Regards, <br>
> <br>
> Karen Broome<br>
> Metadata Systems Designer<br>
> Sony Pictures Entertainment<br>
> 310.244.4384 <br>
> <br>
> ietf-languages-bounces@alvestrand.no wrote on 03/14/2008 08:49:31
AM:<br>
> <br>
> > > From: ietf-languages-bounces@alvestrand.no [mailto:ietf-languages-<br>
> > > bounces@alvestrand.no] On Behalf Of Doug Ewell<br>
> > > Sent: Thursday, March 13, 2008 11:16 PM<br>
> > > To: ietf-languages@iana.org<br>
> > > Subject: Re: ID for language-invariant strings<br>
> > <br>
> > > ["zxx" is] a "less bad" fit than the
other choices:<br>
> > ><br>
> > > zxx - content is not linguistic in nature<br>
> > > und - content is in an undetermined language<br>
> > > mis - content is in an otherwise uncoded language<br>
> > > i-default - content is in a default, fallback language intelligible
to<br>
> > > anglophones<br>
> > ><br>
> > > I agree that inventing a new code element/subtag for this
situation<br>
> > > would be undesirable.<br>
> > <br>
> > If it's less bad, I still think it's kind of bad.<br>
> > <br>
> > For instance, suppose I need to apply language tags to each of
the <br>
> > data elements in the main ISO 639-3 code table. For data in columns
<br>
> > like the 639-3 ID, clearly "zxx" applies: the alpha-3
identifiers <br>
> > have no linguistic content. But what about the reference names?
<br>
> > "zxx" would be a decidedly bad choice for that column,
IMO, since <br>
> > every single data element is definitely linguistic in nature.<br>
> > <br>
> > I don't know why people are so averse to new special-purpose
code <br>
> > elements when there is a reasonable need. It's not like there
are a <br>
> > lot of different special-case semantics that are needed in language-<br>
> > tagging application scenarios; I think the set is very small,
<br>
> > perhaps even that this is the only important gap. I am *far*
more <br>
> > concerned about overloading tags with distinct, orthogonal semantics<br>
> > for particular application scenarios ("und" means X
in this <br>
> > application but Y in that application): *that* can lead to serious
trouble.<br>
> > <br>
> > As I think about this, I'm inclined to propose a new special-purpose<br>
> > ID "zxn" in ISO 639:<br>
> > <br>
> > ID: zxn<br>
> > Reference name: language-neutral content<br>
> > Comment: This ID is provided primarily for application scenarios<br>
> > in which a language identifier
must be declared for<br>
> > content that may be linguistic
in nature but that is<br>
> > used as a language-neutral
identifier to reference or<br>
> > index other information objects.<br>
> > <br>
> > Uses of this code element do
not make any declaration<br>
> > regarding the actual language
of a given data element<br>
> > or of whether a given data
element is, in fact,<br>
> > linguistic in nature.<br>
> > <br>
> > Note: for application scenarios
in which an identifier<br>
> > string is unambiguously non-linguistic
in nature, "zxx"<br>
> > should be used rather than
"zxn".<br>
> > <br>
> > For example, in a database
of coding elements for<br>
> > cultural objects that includes
for each such object a<br>
> > code element such as an alpha-3
string (e.g., "abc")<br>
> > and a reference name (e.g.,
"PIANO", "GUQIN"), the<br>
> > language identifier applied
to the code element<br>
> > should be "zxx", but
"zxn" may be applied to the<br>
> > reference names.<br>
> > <br>
> > Applications may also use "zxn"
for content that is<br>
> > linguistic in nature but that
is represented in a<br>
> > language-neutral form. For
example, the concept 'ten'<br>
> > is linguistic in nature but
can be expressed in the<br>
> > language-neutral form "10".
Such use of "zxn" should<br>
> > be considered only for application
scenarios that<br>
> > have a particular need; this
usage is not recommended<br>
> > in general. For instance, if
a software application<br>
> > needs to segment the strings
in a document into items<br>
> > that get passed to various
language-specific processes<br>
> > and it must apply a language
identifier to language-<br>
> > neutral content such as numbers
represented as digits,<br>
> > then "zxn" may be
used within that application; but it<br>
> > is not expected that content
authors would apply "zxn"<br>
> > to numbers in their documents
in general.<br>
> > <br>
> > <br>
> > <br>
> > Peter<br>
> > _______________________________________________<br>
> > Ietf-languages mailing list<br>
> > Ietf-languages@alvestrand.no<br>
> > http://www.alvestrand.no/mailman/listinfo/ietf-languages</font></tt>