Criteria for languages?

Mark Davis ☕ mark at macchiato.com
Fri Dec 4 23:19:58 CET 2009


The objective here is to see whether there are any reasonably objective
criteria behind the *application* of macrolanguages (not the concept, but
the application), or whether it is a haphazard process, where the
application is not well grounded. The answer can make a difference as to
what we do in ietf-languages with that information.

My first take was that it was haphazard, because in and of
themselves Walliserdeutsch and Latgalian are parallel.

But if the relevant difference, according to what I think you are saying, is
that

   1. Walliserdeutsch doesn't have a "major industry body" that has tagged a
   substantial amount of data incorrectly as Swiss German.
   2. Latgalian has a "major industry body" that has tagged a substantial
   amount of data incorrectly as Latvian

Then that is at least a workable, reasonably objective criterion. And from
your statement, I assume that that is the criterion that ISO is using in
this case.

Mark


On Fri, Dec 4, 2009 at 13:44, Peter Constable <petercon at microsoft.com> wrote:

> Mark: what is the objective here?
>
>
>
> Peter
>
>
>
> *From:* mark.edward.davis at gmail.com [mailto:mark.edward.davis at gmail.com] *On
> Behalf Of *Mark Davis ☕
> *Sent:* Friday, December 04, 2009 9:25 AM
> *To:* Peter Constable
> *Cc:* Randy Presuhn; ietf-languages at iana.org
>
> *Subject:* Re: Criteria for languages?
>
>
>
> We are starting to get somewhere. It would help me if you would look over
> the strawman criteria that I put out, just to see where we are agreeing or
> not. Below, I substituted what you appear to have as a criterion (and also
> fixed the omission that Randy noted). With these changes, is this what you
> are thinking of?
>
>
>
> ====
>
>
>
> A. If
>
>    1. X is being encoded,
>    2. *NEW: A major industry body has been tagging X as Y (rightly or
>    wrongly)*
>       *OLD: A reasonable person, based on information in the registry,
>       could have tagged X-content as Y in the past*
>    3. There is good evidence that a substantial amount of data has been so
>    tagged,
>    4. and X and the standard/predominant version of Y are not mutually
>    comprehensible (at least to the degree that say Scots English and
>    Mississippi English are)
>
> Then Y should be made into a macrolanguage, and a new Z should be encoded
> to represent the standard form of Y.
>
>
>
> B. For matching, Y should match Y, X and Z. (X should match X, and Z
> should match Z).
>
>
>
> C. For lookup, Y should fetch content marked with Z. (X should fetch X, and
> Z should fetch Z).
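
To make rules B and C above concrete, here is a minimal sketch in Python. The
table names (ENCOMPASSED, STANDARD_FORM) and the functions matches() and
lookup() are hypothetical, invented for illustration; X, Y and Z are the
placeholders from the strawman, not real subtags. It assumes criterion A has
already been applied, so that Y is a macrolanguage and Z has been encoded for
its standard form.

```python
# Hypothetical tables for illustration only: Y is the macrolanguage,
# X the newly encoded language, Z the new tag for the standard form of Y.
ENCOMPASSED = {"Y": ("X", "Z")}
STANDARD_FORM = {"Y": "Z"}


def matches(requested, content_tag):
    """Rule B: Y matches Y, X and Z; X and Z match only themselves."""
    if requested == content_tag:
        return True
    return content_tag in ENCOMPASSED.get(requested, ())


def lookup(requested, available):
    """Rule C: Y fetches content marked Z; X and Z fetch only themselves.

    Returns the tag of the content to serve, or None if nothing fits.
    Falling back to legacy content still tagged Y is not covered by the
    strawman, so it is deliberately left out here.
    """
    target = STANDARD_FORM.get(requested, requested)
    return target if target in available else None
```

The asymmetry is the point of the sketch: a request for Y sweeps in X-tagged
content when matching, but a request for X never returns content tagged Y.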
>
>
>
> Mark
>
> On Fri, Dec 4, 2009 at 08:41, Peter Constable <petercon at microsoft.com>
> wrote:
>
> From: ietf-languages-bounces at alvestrand.no [mailto:
> ietf-languages-bounces at alvestrand.no] On Behalf Of Mark Davis ☕
>
>
> > A strict approach would be that if Latgalian is indeed a different
> > language from (mutually incomprehensible with) Latvian, then it
> > was incorrect to tag any Latgalian with "lav", and we just encode
> > a new language and move on. Same for Walliserdeutsch.
>
> That sounds entirely reasonable. It also sounded reasonable that Unicode
> should not encode any precomposed characters but rather use a
> dynamic-composition model. In both cases, legacy practice realistically
> keeps us from doing all the things that seem most reasonable. A major
> industry body has clearly been using "lav" for Latgalian (albeit this
> appears to have started only in the past 6 years); I'm not aware of
> indicators of any, let alone reasonably-widespread, use of either "de" or
> "gsw" for Walliserdeutsch, and so if Walliserdeutsch is deemed a separate
> language then I wouldn't saddle de or gsw with the hassles of a
> macrolanguage.
>
>
>
> Peter
>
>
>