Unilingua

Debbie Garside debbie at ictmarketing.co.uk
Sat Sep 17 11:43:10 CEST 2005


Tex wrote:

> If and when someone gives me a way to review a document and determine the
> proper language tag, and we all agree on the right tag, and it doesn't
> require three linguists to do the determination, I'll believe we have a
> system worth all these refinements.

Such software is currently under development: a tool that can determine the
language used within a document.

Best regards

Debbie Garside
CEO
Linguasphere ICT

> -----Original Message-----
> From: ietf-languages-bounces at alvestrand.no [mailto:ietf-languages-
> bounces at alvestrand.no] On Behalf Of Tex Texin
> Sent: 17 September 2005 03:04
> To: Doug Ewell
> Cc: ietf-languages at iana.org
> Subject: Re: Unilingua
> 
> Doug, yes please rethink those plans.
> 
> In another sphere we have a small number of character encodings, and we
> can't get software to properly identify the encoding in play. Why should
> we believe that with thousands of language codes available they will be
> used properly?
> 
> Even with the small number of codes we have today, I have difficulty
> determining which code properly describes a document. There are no
> guidelines or rules or ways to determine whether a document is one branch
> of a language versus another, except with the crudest of guesses. Various
> experts make pronouncements about Japanese being ja and not ja-jp, or latn
> not being required for en, since en is not generally represented in
> another script, but only an expert knows all of the possibilities and
> which circumstances never (or nearly never) occur, and which ones require
> additional descriptors or not. Given that is the case, I really don't need
> a more refined set of language choices.
> 
> If and when someone gives me a way to review a document and determine the
> proper language tag, and we all agree on the right tag, and it doesn't
> require three linguists to do the determination, I'll believe we have a
> system worth all these refinements. Oh, and I also need to believe the
> distinctions are something that my application may utilize.
> 
> I understand that for some very few purposes the ability to distinguish
> between thousands of languages is useful. I just don't see that most
> users, or most applications need it, and most content providers are
> incapable of correctly tagging their content. So I don't see why we should
> burden general applications with it.
> 
> So what good has it done that we have registered Boontling? For all the
> web pages and applications that do something with boontling, was the world
> really much better than if we had left them on their own with x-boontling?
> Is the world so much better that we registered boontling and denied or
> delayed es-americas?
> 
> The ISO 639 standards serve their purposes for linguists. The majority of
> software on the internet does not require this level of distinction and
> does not need to be burdened with it and I don't see that 3066bis will be
> deployed the way it has been envisioned.
> 
> tex
> 
> Doug Ewell wrote:
> >
> > We have registered tags for Boontling, Enochian, Mingo, and Scouse.
> >
> > In LTRU we are discussing, very seriously and purposefully, adding
> > support for ISO 639-3 in the future, which would add the next 6,700
> > languages and 350 "extended languages" that WEREN'T used widely enough
> > to justify an ISO 639-2 code element.
> >
> > We're also talking, at least peripherally, about supporting ISO 639-6 in
> > a still-later version.  That could add as many as 13,000 more codes for
> > almost every imaginable dialect and spoken or written variation.
> >
> > If registering one more constructed-language tag is going to cause
> > problems of scale, we'd better rethink some of those other plans.
> >
> > (BTW, it would have to be "x-uniling" or some such, due to length
> > constraints.)
> >
> > --
> > Doug Ewell
> > Fullerton, California
> > http://users.adelphia.net/~dewell/
> >
> > ----- Original Message -----
> > From: "Tex Texin" <tex at xencraft.com>
> > To: "Doug Ewell" <dewell at adelphia.net>
> > Cc: <ietf-languages at iana.org>
> > Sent: Friday, September 16, 2005 0:20
> > Subject: Re: Unilingua
> >
> > > Doug,
> > >
> > > (I am replying to your mail, but it is not directed at you
> > > personally.)
> > >
> > > Why do we want to register things that have no practical use or
> > > significance, for which there are almost no documents to give the tag
> > > to, and yet make our software tables larger and require more time to
> > > explain what it represents than the value of recognizing the code?
> > >
> > > Isn't it ok to have some number of documents for which we say, yes the
> > > contents are in a language which isn't covered by tags, so if you want
> > > a description it needs to be annotated in some other way?
> > >
> > > If somebody has a unilingua text they can label it with x-unilingua
> > > and note somewhere what it represents.
> > >
> > > We should reel the registry back into being something that internet
> > > engineers need for practical internet applications and have some form
> > > of 80/20 rule related to language categorization. I recognize the
> > > needs of linguists to distinguish languages with subtle but important
> > > differences, but I don't see that general software or internet
> > > applications should be burdened with the overhead. This has all got to
> > > fit in my watch someday. The registry should not be a museum for every
> > > possible variant that ever existed or was postulated. Maybe in
> > > addition to 50 documents to register a tag we should require there be
> > > 50 engineers that testify they care to recognize the distinction.
> > > (kidding, but only slightly...)
> > >
> > > tex
> 
> --
> -------------------------------------------------------------
> Tex Texin   cell: +1 781 789 1898   mailto:Tex at XenCraft.com
> Xen Master                          http://www.i18nGuy.com
> 
> XenCraft                            http://www.XenCraft.com
> Making e-Business Work Around the World
> -------------------------------------------------------------


