Ietf-languages Digest, Vol 50, Issue 15

Mark Davis mark.davis at icu-project.org
Thu Feb 15 17:55:26 CET 2007


Your quotation below omits the true author, and may leave the impression
that I wrote a number of paragraphs that I do not agree with and did not
write. I only wrote "Assume that old Czech ..." -- someone else wrote the
"But is this a real problem...."

> Mark Davis wrote:
>
> > Assume that old Czech is as different from modern as fro is from fr.
>
> But is this a real problem?  How much total literature is written
...

That being said, there are two models that ISO could be using.

   1. Overlapping. 'eng' means any English, modern or historic. 'ang'
   means specifically Old English, a subset of 'eng'. 'ces' means any Czech.
   There is no tag specifically for Old Czech.
      1. so I could tag Beowulf with 'ang' or 'eng', but Shakespeare,
      Austen, and Robin Williams only with 'eng'.
      2. Smil Flaška z Pardubic and Václav Havel are both tagged with
      'ces'.
      3. Requests for BCP 47 variant tags for Shakespearean English
      (en-SHAKESPR) or old Czech (cs-OLDCZECH) would be legitimate.
      4. A request for a variant tag for only modern English
      (en-MODENGL), thus excluding Old English, would be legitimate.
   2. Disjoint. 'eng' means only modern English, 'ang' means Old
   English, 'ces' means only modern Czech. There is no tag at all (currently)
   for Old Czech.
      1. so I could tag Beowulf with 'ang' only.
      2. and there is no valid current code for tagging for Smil
      Flaška z Pardubic
      3. A request for BCP 47 variant tags for Shakespearean English
      (en-SHAKESPR) would be legitimate
      4. A request for a registered old Czech language tag (oldczech)
      would be legitimate. (However "primary languages are strongly
      RECOMMENDED for registration with ISO 639, and proposals rejected by ISO
      639/RA will be closely scrutinized before they are registered with IANA.")
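
To make the difference between the two models concrete, here is a rough
Python sketch. The variety labels, the two tables, and the valid_codes
helper are all made up for illustration; none of them come from ISO 639 or
BCP 47. I am also assuming that Shakespeare counts as (early) modern
English, which is what item 3 under each model implies.

```python
# Hypothetical coverage tables: which language varieties each ISO 639 code
# is taken to include under each model. Labels are illustrative only.
OVERLAPPING = {
    "eng": {"Old English", "Early Modern English", "Modern English"},
    "ang": {"Old English"},  # a subset of 'eng' in this model
    "ces": {"Old Czech", "Modern Czech"},
}

DISJOINT = {
    "eng": {"Early Modern English", "Modern English"},
    "ang": {"Old English"},
    "ces": {"Modern Czech"},  # Old Czech is covered by no code at all
}


def valid_codes(model, variety):
    """Return the sorted codes whose coverage includes the given variety."""
    return sorted(code for code, covered in model.items() if variety in covered)


if __name__ == "__main__":
    for name, model in (("overlapping", OVERLAPPING), ("disjoint", DISJOINT)):
        for variety in ("Old English", "Early Modern English", "Old Czech"):
            print(f"{name:12} {variety:22} {valid_codes(model, variety)}")
```

Under the overlapping table Beowulf can carry either 'ang' or 'eng' and Old
Czech falls under 'ces'; under the disjoint table Beowulf gets 'ang' only and
Old Czech gets nothing, which is the gap a registered tag would have to fill.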

I don't think they are using model number one, but we need to find out.

Mark

On 2/15/07, Anthony Aristar <aristar at linguistlist.org> wrote:
>
> With all due respect, this seems like a very odd discussion from my
> perspective  as a linguistics professor.  The discussion seems to
> presuppose that all that matters is whether Microsoft is going to one
> day produce a version of Word in Middle High German or Old English, or
> how many texts exist in a language.
>
> But the ISO 639 codes are used for much more than this.  In particular,
> they are used to ensure interoperability, allowing material of the same
> linguistic nature to be found in searches, and to be compared using the
> linguistic ontologies that are now being developed.  If I am a scholar
> searching for texts in Old English (or Old High German, for that
> matter) and everyone has been cavalier enough to code such material
> with eng and deu, what the search engines return will be utterly
> useless to me.  I am going to be flooded with such a quantity of
> material in Modern English and Modern German that searching through it
> will be essentially impossible.
>
> So if you really believe that it doesn't matter if you code English
> material as eng, whatever its period, what you're really saying is that
> you don't really care about interoperability, and that you don't really
> care about scholarship.
>
>                 **************************************
> Anthony Aristar, Director, Institute for Language Information & Technology
>                    Professor of Linguistics
> Moderator, LINGUIST               Principal Investigator, EMELD Project
> Linguistics Program
> Dept. of English                  aristar at linguistlist.org
> Eastern Michigan University            2000 Huron River Dr, Suite 104
> Ypsilanti, MI 48197
> U.S.A.
>
> URL: http://linguistlist.org/aristar/
>                 **************************************
>
> > Mark Davis wrote:
> >
> > > Assume that old Czech is as different from modern as fro is from fr.
> >
> > But is this a real problem?  How much total literature is written
> > and available in different variations of Czech?  My prejudice says
> > that as a nation with a language and literature of its own, Czech
> > is about as young as Finnish, Norwegian or Serbian, i.e. 19th
> > century.  Can you give any concrete examples when not having a
> > separate *code* for pre-renaissance Czech is a practical problem?
> >
> > Linguists of course have *names* for Swedish of all ages, but I
> > see no real use for having ISO or the IETF specify language
> > *codes*.  I could be wrong, but if so please enlighten and correct
> > me.  Nobody is going to translate OpenOffice or Mozilla to the
> > language spoken by vikings (Old Norse) or the Swedish used during
> > the Lutheran reformation (called New Swedish, ironically).
> >
> > Yes, there is now a branch of Wikipedia in Old English
> > (ang.wikipedia.org), but that is a rare exception.  I don't expect
> > this to happen in other languages.  Ang has now 744 articles,
> > compared to the 11,000 articles of the Latin Wikipedia.



-- 
Mark