[Ltru] Re: "mis" update review request

Mark Davis mark.davis at icu-project.org
Fri Apr 20 17:59:15 CEST 2007


I don't think the programming language fragment is really a boundary
condition. Most source code nowadays is not just random hex; there is
typically, not exceptionally, some real linguistic content. I would agree
with you that a hex dump of a compiled program, such as perhaps you used for
your example, is sensible to tag as zxx, but based on the wording of the
standards, I don't think we can expect zxx to apply to typical source code.
Yet, while there may be some embedded English, we don't want to call it
"en" either.

It looks to me like the best choice currently would be "und"; as I said, I
think it might be useful to have a special tag for this just because it is a
reasonably common case that is otherwise difficult to categorize. An
alternative would be to explicitly broaden the description of "zxx" to be
"no linguistic content, or programming source code". That would be a
compatible change to 4646bis, since it is a broadening.
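
A minimal sketch of the tagging choice described above, assuming the content
has already been classified into one of three kinds. ContentKind and
choose_language_tag are hypothetical names used only for illustration; they
are not part of any standard or library.

```python
from enum import Enum


class ContentKind(Enum):
    """Hypothetical content categories, for illustration only."""
    NATURAL_LANGUAGE = "natural_language"
    SOURCE_CODE = "source_code"
    BINARY = "binary"


def choose_language_tag(kind, detected_language=None):
    """Pick a language tag following the choice discussed in this message."""
    if kind is ContentKind.BINARY:
        # A hex dump of a compiled program: no linguistic content.
        return "zxx"
    if kind is ContentKind.SOURCE_CODE:
        # Some embedded linguistic content, but neither "zxx" nor "en" fits,
        # so fall back to "undetermined".
        return "und"
    # Ordinary text: use the detected language if known, else "und".
    return detected_language or "und"
```

Under the alternative of broadening "zxx" to cover programming source code,
the SOURCE_CODE branch would return "zxx" instead.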

Mark

On 4/20/07, Peter Constable <petercon at microsoft.com> wrote:
>
> From: Mark Davis [mailto:mark.davis at icu-project.org]
>
> > As in example #9 of http://docs.google.com/Doc?id=dfqr8rd5_11g425c9 ,
>
> > to think that the following contains "no linguistic content" is bizarre.
>
>
> > It obviously contains linguistic content.
>
> > if (linguisticContent == null) { throw new Exception(""); }
>
>
>
>  You could say the same of this:
>
> MZ
> ------------------------------
>
> ------------------------------
>    ÿÿ  ¸       @                                   à
> ­º
>  ´            Í!¸LÍ!This program cannot be run in DOS mode.
>
> $       Tbï›
> ------------------------------
> È
> ------------------------------
> È
> ------------------------------
> È7ÅïÈ
> ------------------------------
> È7ÅüÈ
> ------------------------------
> È7ÅúÈ
> ------------------------------
> È
> ------------------------------
> €ÈÉ
> ------------------------------
> È7ÅìÈ3
> ------------------------------
> È7ÅýÈ
> ------------------------------
> È7ÅùÈ
> ------------------------------
> ÈRich
> ------------------------------
> È
>
>
>
> We could probably come up with all kinds of boundary cases for which there
> is no "right" answer. I don't know what use it would be.
>
> Peter
>


-- 
Mark