As I said, it entirely depends on what the application is. <br><ul><li>For a language-neutral (human-visible fallback) form, "und" is appropriate.</li><li>For a non-language form (part number, code), "zxx" is appropriate.</li>
</ul>I think people need to decide which of these they are looking for, and use the appropriate one.<br><br>Mark<br><br><div class="gmail_quote">On Mon, Mar 17, 2008 at 6:16 PM, Peter Constable <<a href="mailto:petercon@microsoft.com">petercon@microsoft.com</a>> wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div link="blue" vlink="purple" lang="EN-US">
<div>
<p><span style="font-size: 11pt; color: rgb(31, 73, 125);">If you wouldn't mind, I'll let you go ahead.</span></p>
<p><span style="font-size: 11pt; color: rgb(31, 73, 125);"> </span></p>
<p><span style="font-size: 11pt; color: rgb(31, 73, 125);">Now, suppose we make that change, where does it leave us wrt the
application scenario I've brought up? Mark, would you switch from "und" to "zxx"
with this adjusted semantic?</span></p>
<p><span style="font-size: 11pt; color: rgb(31, 73, 125);"> </span></p>
<p><span style="font-size: 11pt; color: rgb(31, 73, 125);"> </span></p>
<p><span style="font-size: 11pt; color: rgb(31, 73, 125);">Peter</span></p>
<p><span style="font-size: 11pt; color: rgb(31, 73, 125);"> </span></p>
<div style="border-style: none none none solid; border-color: -moz-use-text-color -moz-use-text-color -moz-use-text-color blue; border-width: medium medium medium 1.5pt; padding: 0in 0in 0in 4pt;">
<div>
<div style="border-style: solid none none; border-color: rgb(181, 196, 223) -moz-use-text-color -moz-use-text-color; border-width: 1pt medium medium; padding: 3pt 0in 0in;">
<p><b><span style="font-size: 10pt;">From:</span></b><span style="font-size: 10pt;">
<a href="mailto:Karen_Broome@spe.sony.com" target="_blank">Karen_Broome@spe.sony.com</a> [mailto:<a href="mailto:Karen_Broome@spe.sony.com" target="_blank">Karen_Broome@spe.sony.com</a>] <br>
<b>Sent:</b> Monday, March 17, 2008 5:56 PM<br>
<b>To:</b> Mark Davis<br>
<b>Cc:</b> <a href="mailto:ietf-languages@iana.org" target="_blank">ietf-languages@iana.org</a>; <a href="mailto:mark.edward.davis@gmail.com" target="_blank">mark.edward.davis@gmail.com</a>; Peter
Constable<div><div></div><div class="Wj3C7c"><br>
<b>Subject:</b> Re: ID for language-invariant strings</div></div></span></p>
</div>
</div><div><div></div><div class="Wj3C7c">
<p> </p>
<p style="margin-bottom: 12pt;"><br>
<span style="font-size: 10pt;">I'm happy to
submit the change request unless Peter wants to handle it (as the tag
originator). </span><br>
<span style="font-size: 10pt;"><br>
Regards,</span> <br>
<br>
<span style="font-size: 10pt;">Karen Broome</span>
<br>
<br>
<br>
</p>
<table style="width: 100%;" border="0" cellpadding="0" width="100%">
<tbody><tr>
<td style="padding: 0.75pt; width: 40%;" valign="top" width="40%">
<p><b><span style="font-size: 7.5pt;">"Mark
Davis" <<a href="mailto:mark.davis@icu-project.org" target="_blank">mark.davis@icu-project.org</a>></span></b><span style="font-size: 7.5pt;"> </span><br>
<span style="font-size: 7.5pt;">Sent by:
<a href="mailto:mark.edward.davis@gmail.com" target="_blank">mark.edward.davis@gmail.com</a></span> </p>
<p><span style="font-size: 7.5pt;">03/17/2008
05:37 PM</span> </p>
</td>
<td style="padding: 0.75pt; width: 59%;" valign="top" width="59%">
<table style="width: 100%;" border="0" cellpadding="0" width="100%">
<tbody><tr>
<td style="padding: 0.75pt;" valign="top">
<p style="text-align: right;" align="right"><span style="font-size: 7.5pt;">To</span></p>
</td>
<td style="padding: 0.75pt;" valign="top">
<p><span style="font-size: 7.5pt;">"Peter
Constable" <<a href="mailto:petercon@microsoft.com" target="_blank">petercon@microsoft.com</a>></span> </p>
</td>
</tr>
<tr>
<td style="padding: 0.75pt;" valign="top">
<p style="text-align: right;" align="right"><span style="font-size: 7.5pt;">cc</span></p>
</td>
<td style="padding: 0.75pt;" valign="top">
<p><span style="font-size: 7.5pt;">"<a href="mailto:ietf-languages@iana.org" target="_blank">ietf-languages@iana.org</a>"
<<a href="mailto:ietf-languages@iana.org" target="_blank">ietf-languages@iana.org</a>>, "<a href="mailto:Karen_Broome@spe.sony.com" target="_blank">Karen_Broome@spe.sony.com</a>"
<<a href="mailto:Karen_Broome@spe.sony.com" target="_blank">Karen_Broome@spe.sony.com</a>></span> </p>
</td>
</tr>
<tr>
<td style="padding: 0.75pt;" valign="top">
<p style="text-align: right;" align="right"><span style="font-size: 7.5pt;">Subject</span></p>
</td>
<td style="padding: 0.75pt;" valign="top">
<p><span style="font-size: 7.5pt;">Re:
ID for language-invariant strings</span></p>
</td>
</tr>
</tbody></table>
<p> </p>
</td>
</tr>
</tbody></table>
<p><br>
<br>
<br>
I think that would be a reasonable change.<br>
<br>
Mark<br>
<br>
On Mon, Mar 17, 2008 at 5:05 PM, Peter Constable <<a href="mailto:petercon@microsoft.com" target="_blank">petercon@microsoft.com</a>> wrote: <br>
<span style="font-size: 10pt; color: rgb(31, 73, 125);">It seems to me that changing from
"no linguistic content" to "not applicable" isn't a huge
degree of broadening, and broadening is not prohibited. So, if you wanted to
push for broadening, that might be possible. But I think there should be some
consensus here before taking it to the JAC.</span> </p>
<p><span style="font-size: 10pt; color: rgb(31, 73, 125);"> </span> </p>
<p><span style="font-size: 10pt; color: rgb(31, 73, 125);">Peter</span> </p>
<p><span style="font-size: 10pt; color: rgb(31, 73, 125);"> </span> </p>
<p><b><span style="font-size: 10pt;">From:</span></b><span style="font-size: 10pt;"> </span><a href="mailto:ietf-languages-bounces@alvestrand.no" target="_blank"><span style="font-size: 10pt;">ietf-languages-bounces@alvestrand.no</span></a><span style="font-size: 10pt;"> [mailto:</span><a href="mailto:ietf-languages-bounces@alvestrand.no" target="_blank"><span style="font-size: 10pt;">ietf-languages-bounces@alvestrand.no</span></a><span style="font-size: 10pt;">] <b>On Behalf Of </b>Peter Constable<b><br>
Sent:</b> Monday, March 17, 2008 3:26 PM<b><br>
To:</b> </span><a href="mailto:Karen_Broome@spe.sony.com" target="_blank"><span style="font-size: 10pt;">Karen_Broome@spe.sony.com</span></a> </p>
<p><b><span style="font-size: 10pt;"><br>
Cc:</span></b><span style="font-size: 10pt;"> </span><a href="mailto:ietf-languages@iana.org" target="_blank"><span style="font-size: 10pt;">ietf-languages@iana.org</span></a><b><span style="font-size: 10pt;"><br>
Subject:</span></b><span style="font-size: 10pt;"> RE: ID for
language-invariant strings</span> </p>
<p> </p>
<p><span style="font-size: 10pt; color: rgb(31, 73, 125);">Karen: I suggested "no
linguistic content" on the understanding that the audio and subtitle
streams were all tagged separately, and that it would be an audio stream
that was declared "no linguistic content", not the film as a whole.</span>
</p>
<p><span style="font-size: 10pt; color: rgb(31, 73, 125);"> </span> </p>
<p><span style="font-size: 10pt; color: rgb(31, 73, 125);"> </span> </p>
<p><span style="font-size: 10pt; color: rgb(31, 73, 125);">Peter</span> </p>
<p><span style="font-size: 10pt; color: rgb(31, 73, 125);"> </span> </p>
<p><b><span style="font-size: 10pt;">From:</span></b><span style="font-size: 10pt;"> </span><a href="mailto:Karen_Broome@spe.sony.com" target="_blank"><span style="font-size: 10pt;">Karen_Broome@spe.sony.com</span></a><span style="font-size: 10pt;"> [mailto:</span><a href="mailto:Karen_Broome@spe.sony.com" target="_blank"><span style="font-size: 10pt;">Karen_Broome@spe.sony.com</span></a><span style="font-size: 10pt;">] <b><br>
Sent:</b> Monday, March 17, 2008 2:25 PM<b><br>
To:</b> Peter Constable<b><br>
Cc:</b> </span><a href="mailto:ietf-languages@iana.org" target="_blank"><span style="font-size: 10pt;">ietf-languages@iana.org</span></a><b><span style="font-size: 10pt;"><br>
Subject:</span></b><span style="font-size: 10pt;"> RE: ID for
language-invariant strings</span> </p>
<p> </p>
<p><span style="font-size: 10pt;"><br>
The "zxx" tag started with my query into how I should classify the
"audio content" of a silent film in a system designed to serve
non-silent films where a language code is required. Peter suggested "zxx =
no linguistic content" and registered it. </span><br>
<span style="font-size: 10pt;"><br>
I felt that it might be better to use the industry terminology "silent"
and employ a free tag in the "Q" space of ISO 639-2. While there was
"no linguistic content" on that audio channel, there was certainly a
plot that could be determined from watching the film even if the title cards
were removed (a "title card" is an interstitial used to display the
text in a silent film). To describe our wonderful heritage of silent films as
having no linguistic content just seemed a bit cruel. I was willing to go with
"not applicable" but could not recommend the use of "zxx = no
linguistic content" for this purpose.</span> <br>
<span style="font-size: 10pt;"><br>
When it was later suggested that "zxx" should be used to mark up code
fragments appearing in a tutorial written in English, I was even more opposed
to the "non-linguistic" semantic. I wasn't the only one who
complained that code -- especially in the context of a technical tutorial -- is
primarily meant to be read by humans, not machines. An assistive device such as
a Braille screenreader would want to represent that text as language, not
skip over it because it's non-linguistic in nature. Binary junk data is the
only thing I can think of that is truly non-linguistic.</span> <br>
<span style="font-size: 10pt;"><br>
Any chance we could broaden the semantic of the "zxx" tag? I still
think we did the wrong thing here and the "non-applicable" tag is
more appropriate for all the use cases mentioned.</span> <br>
<span style="font-size: 10pt;"><br>
</span><a href="http://lists.w3.org/Archives/Public/www-international/2007AprJun/0187.html" target="_blank"><span style="font-size: 10pt;">http://lists.w3.org/Archives/Public/www-international/2007AprJun/0187.html</span></a><span style="font-size: 10pt;"> -- one previous post on the topic</span> <br>
<span style="font-size: 10pt;"><br>
Side note: I find the IETF archives very hard to search or I could have
produced a better example. Am I missing a search interface somewhere? (Reply
offlist.)</span> <br>
<span style="font-size: 10pt;"><br>
Regards,</span> <br>
<span style="font-size: 10pt;"><br>
Karen Broome</span> <br>
<span style="font-size: 10pt;"><br>
<tt>Peter Constable <</tt></span><a href="mailto:petercon@microsoft.com" target="_blank"><tt><span style="font-size: 10pt;">petercon@microsoft.com</span></tt></a><tt><span style="font-size: 10pt;">> wrote on 03/14/2008 01:37:30 PM:</span></tt><span style="font-size: 10pt;"><br>
</span><span style="font-size: 10pt;"><br>
<tt>> If "zxx" were "not applicable", I would not have
any reservation </tt><br>
<tt>> about semantic overloading for the application scenarios I have in </tt><br>
<tt>> mind now. Funny, I really have no recollection of you suggesting </tt><br>
<tt>> that at that time. (Sorry.)</tt></span> <span style="font-size: 10pt;"><br>
<tt>> </tt></span> <span style="font-size: 10pt;"><br>
<tt>> </tt></span> <span style="font-size: 10pt;"><br>
<tt>> Peter</tt></span> <span style="font-size: 10pt;"><br>
<tt>> </tt></span> <span style="font-size: 10pt;"><br>
<tt>> From: </tt></span><a href="mailto:Karen_Broome@spe.sony.com" target="_blank"><tt><span style="font-size: 10pt;">Karen_Broome@spe.sony.com</span></tt></a><tt><span style="font-size: 10pt;"> [mailto:</span></tt><a href="mailto:Karen_Broome@spe.sony.com" target="_blank"><tt><span style="font-size: 10pt;">Karen_Broome@spe.sony.com</span></tt></a><tt><span style="font-size: 10pt;">] </span></tt><span style="font-size: 10pt;"><br>
<tt>> Sent: Friday, March 14, 2008 12:51 PM</tt><br>
<tt>> To: Peter Constable</tt><br>
<tt>> Cc: </tt></span><a href="mailto:ietf-languages@iana.org" target="_blank"><tt><span style="font-size: 10pt;">ietf-languages@iana.org</span></tt></a><span style="font-size: 10pt;"><br>
<tt>> Subject: RE: ID for language-invariant strings</tt></span> <span style="font-size: 10pt;"><br>
<tt>> </tt></span> <span style="font-size: 10pt;"><br>
<tt>> </tt><br>
<tt>> I can keep restating the point I've made from the beginning. The </tt><br>
<tt>> semantic for "zxx" should have been defined as "not
applicable" </tt><br>
<tt>> which was the use case presented at the time it was created. Since </tt><br>
<tt>> it was not expressed in this way, now we need another tag, I think. </tt><br>
<tt>> </tt><br>
<tt>> Regards, </tt><br>
<tt>> </tt><br>
<tt>> Karen Broome</tt><br>
<tt>> Metadata Systems Designer</tt><br>
<tt>> Sony Pictures Entertainment</tt><br>
<tt>> 310.244.4384 </tt><br>
<tt>> </tt><br>
<tt>> </tt></span><a href="mailto:ietf-languages-bounces@alvestrand.no" target="_blank"><tt><span style="font-size: 10pt;">ietf-languages-bounces@alvestrand.no</span></tt></a><tt><span style="font-size: 10pt;"> wrote on 03/14/2008 08:49:31 AM:</span></tt><span style="font-size: 10pt;"><br>
<tt>> </tt><br>
<tt>> > > From: </tt></span><a href="mailto:ietf-languages-bounces@alvestrand.no" target="_blank"><tt><span style="font-size: 10pt;">ietf-languages-bounces@alvestrand.no</span></tt></a><tt><span style="font-size: 10pt;"> [mailto:</span></tt><a href="mailto:ietf-languages-" target="_blank"><tt><span style="font-size: 10pt;">ietf-languages-</span></tt></a><span style="font-size: 10pt;"><br>
<tt>> > > </tt></span><a href="mailto:bounces@alvestrand.no" target="_blank"><tt><span style="font-size: 10pt;">bounces@alvestrand.no</span></tt></a><tt><span style="font-size: 10pt;">] On Behalf Of Doug Ewell</span></tt><span style="font-size: 10pt;"><br>
<tt>> > > Sent: Thursday, March 13, 2008 11:16 PM</tt><br>
<tt>> > > To: </tt></span><a href="mailto:ietf-languages@iana.org" target="_blank"><tt><span style="font-size: 10pt;">ietf-languages@iana.org</span></tt></a><span style="font-size: 10pt;"><br>
<tt>> > > Subject: Re: ID for language-invariant strings</tt><br>
<tt>> > </tt><br>
<tt>> > > ["zxx" is] a "less bad" fit than the
other choices:</tt><br>
<tt>> > ></tt><br>
<tt>> > > zxx - content is not linguistic in nature</tt><br>
<tt>> > > und - content is in an undetermined language</tt><br>
<tt>> > > mis - content is in an otherwise uncoded language</tt><br>
<tt>> > > i-default - content is in a default, fallback language
intelligible to</tt><br>
<tt>> > > anglophones</tt><br>
<tt>> > ></tt><br>
<tt>> > > I agree that inventing a new code element/subtag for this
situation</tt><br>
<tt>> > > would be undesirable.</tt><br>
<tt>> > </tt><br>
<tt>> > If it's less bad, I still think it's kind of bad.</tt><br>
<tt>> > </tt><br>
<tt>> > For instance, suppose I need to apply language tags to each of
the </tt><br>
<tt>> > data elements in the main ISO 639-3 code table. For data in
columns </tt><br>
<tt>> > like the 639-3 ID, clearly "zxx" applies: the alpha-3
identifiers </tt><br>
<tt>> > have no linguistic content. But what about the reference names? </tt><br>
<tt>> > "zxx" would be a decidedly bad choice for that column,
IMO, since </tt><br>
<tt>> > every single data element is definitely linguistic in nature.</tt><br>
<tt>> > </tt><br>
<tt>> > I don't know why people are so averse to new special-purpose
code </tt><br>
<tt>> > elements when there is a reasonable need. It's not like there are
a </tt><br>
<tt>> > lot of different special-case semantics that are needed in
language-</tt><br>
<tt>> > tagging application scenarios; I think the set is very small, </tt><br>
<tt>> > perhaps even that this is the only important gap. I am *far* more
</tt><br>
<tt>> > concerned about overloading tags with distinct, orthogonal
semantics</tt><br>
<tt>> > for particular application scenarios ("und" means X in
this </tt><br>
<tt>> > application but Y in that application): *that* can lead to
serious trouble.</tt><br>
<tt>> > </tt><br>
<tt>> > As I think about this, I'm inclined to propose a new
special-purpose</tt><br>
<tt>> > ID "zxn" in ISO 639:</tt><br>
<tt>> > </tt><br>
<tt>> > ID: zxn</tt><br>
<tt>> > Reference name: language-neutral content</tt><br>
<tt>> > Comment: This ID is provided primarily for application scenarios</tt><br>
<tt>> > in which a language identifier
must be declared for</tt><br>
<tt>> > content that may be linguistic
in nature but that is</tt><br>
<tt>> > used as a language-neutral
identifier to reference or</tt><br>
<tt>> > index other information
objects.</tt><br>
<tt>> > </tt><br>
<tt>> > Uses of this code element do
not make any declaration</tt><br>
<tt>> > regarding the actual language
of a given data element</tt><br>
<tt>> > or of whether a given data
element is, in fact,</tt><br>
<tt>> > linguistic in nature.</tt><br>
<tt>> > </tt><br>
<tt>> >            Note: for application
scenarios in which an identifier</tt><br>
<tt>> > string is unambiguously
non-linguistic in nature, "zxx"</tt><br>
<tt>> > should be used rather than
"zxn".</tt><br>
<tt>> > </tt><br>
<tt>> > For example, in a database of
coding elements for</tt><br>
<tt>> > cultural objects that includes
for each such object a</tt><br>
<tt>> > code element such as an alpha-3
string (e.g., "abc")</tt><br>
<tt>> > and a reference name (e.g.,
"PIANO", "GUQIN"), the</tt><br>
<tt>> > language identifier applied to
the code element</tt><br>
<tt>> >                  should be "zxx", but
"zxn" may be applied to the</tt><br>
<tt>> > reference names.</tt><br>
<tt>> > </tt><br>
<tt>> > Applications may also use
"zxn" for content that is</tt><br>
<tt>> >                  linguistic in nature but that
is represented in a</tt><br>
<tt>> >                  language-neutral form. For
example, the concept 'ten'</tt><br>
<tt>> >                  is linguistic in nature but can
be expressed in the</tt><br>
<tt>> >                  language-neutral form
"10". Such use of "zxn" should</tt><br>
<tt>> > be considered only for
application scenarios that</tt><br>
<tt>> > have a particular need; this
usage is not recommended</tt><br>
<tt>> > in general. For instance, if a
software application</tt><br>
<tt>> > needs to segment the strings in
a document into items</tt><br>
<tt>> > that get passed to various
language-specific processes</tt><br>
<tt>> > and it must apply a language
identifier to language-</tt><br>
<tt>> > neutral content such as numbers
represented as digits,</tt><br>
<tt>> > then "zxn" may be
used within that application; but it</tt><br>
<tt>> > is not expected that content
authors would apply "zxn"</tt><br>
<tt>> > to numbers in their documents
in general.</tt><br>
<tt>> > </tt><br>
<tt>> > </tt><br>
<tt>> > </tt><br>
<tt>> > Peter</tt><br>
<tt>> > _______________________________________________</tt><br>
<tt>> > Ietf-languages mailing list</tt><br>
<tt>> > </tt></span><a href="mailto:Ietf-languages@alvestrand.no" target="_blank"><tt><span style="font-size: 10pt;">Ietf-languages@alvestrand.no</span></tt></a><span style="font-size: 10pt;"><br>
<tt>> > </tt></span><a href="http://www.alvestrand.no/mailman/listinfo/ietf-languages" target="_blank"><tt><span style="font-size: 10pt;">http://www.alvestrand.no/mailman/listinfo/ietf-languages</span></tt></a>
<br>
<br>
<br>
<br>
<br>
<br>
-- <br>
Mark </p>
</div></div></div>
</div>
</div>
</blockquote></div><br><br clear="all"><br>-- <br>Mark