Criteria for languages?

Mark Davis ☕ mark at
Fri Dec 4 18:24:58 CET 2009

We are starting to get somewhere. It would help me if you would look over
the strawman criteria that I put out, just to see where we are agreeing or
not. Below, I substituted what you appear to have as a criterion (and also
fixed the omission that Randy noted). With these changes, is this what you
are thinking of?


A. If

   1. X is being encoded,
   2. *NEW: A major industry body has been tagging X as Y (rightly or
   wrongly)*
      1. *OLD: A reasonable person, based on information in the registry,
      could have tagged X-content as Y in the past*
   3. There is good evidence that a substantial amount of data has been so
   tagged,
   4. and X and the standard/predominant version of Y are not mutually
   comprehensible (at least to the degree that say Scots English and
   Mississippi English are)

Then Y should be made into a macrolanguage, and a new Z should be encoded to
represent the standard form of Y.

B. For matching, Y should match Y, X and Z. (X should match X, and Z
should match Z).

C. For lookup, Y should fetch content marked with Z. (X should fetch X, and
Z should fetch Z).
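
To make B and C concrete, here is a minimal sketch in Python. It is only an
illustration under stated assumptions: X, Y and Z are the placeholders from
the criteria above, and the table and function names (MATCHES, LOOKUP,
matches, lookup) are hypothetical, not taken from any registry or library.
It also assumes that lookup returns at most one tag, and it leaves open what
should happen to legacy content still tagged Y.

```python
# Hypothetical sketch of rules B and C. X, Y and Z are the placeholder
# tags from the criteria: Y is the macrolanguage, Z the newly encoded
# standard form of Y, and X the newly encoded separate language.

# Rule B: which content tags a requested tag is allowed to match.
MATCHES = {
    "Y": {"Y", "X", "Z"},
    "X": {"X"},
    "Z": {"Z"},
}

# Rule C: which single content tag a requested tag should fetch.
LOOKUP = {
    "Y": "Z",
    "X": "X",
    "Z": "Z",
}


def matches(requested, content_tag):
    """Return True if content tagged content_tag matches the request."""
    # Tags outside the table are assumed to match only themselves.
    return content_tag in MATCHES.get(requested, {requested})


def lookup(requested, available):
    """Return the tag to fetch from the available tags, or None."""
    # Tags outside the table are assumed to fetch only themselves.
    target = LOOKUP.get(requested, requested)
    return target if target in available else None
```

With these tables, a request for Y matches content tagged X or Z, but a
request for X does not match content tagged Y; and a lookup for Y against
content tagged only X finds nothing rather than falling back to X.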


On Fri, Dec 4, 2009 at 08:41, Peter Constable <petercon at> wrote:

> From: ietf-languages-bounces at [mailto:
> ietf-languages-bounces at] On Behalf Of Mark Davis ☕
> > A strict approach would be that if Latgalian is indeed a different
> > language from (mutually incomprehensible with) Latvian, then it
> > was incorrect to tag any Latgalian with "lav", and we just encode
> > a new language and move on. Same for Walliserdeutsch.
>
> That sounds entirely reasonable. It also sounded reasonable that Unicode
> should not encode any precomposed characters but rather use a
> dynamic-composition model. In both cases, legacy practice realistically
> keeps us from doing all the things that seem most reasonable. A major
> industry body has clearly been using "lav" for Latgalian (albeit this
> appears to have started only in the past 6 years); I'm not aware of
> indicators of any, let alone reasonably-widespread, use of either "de" or
> "gsw" for Walliserdeutsch, and so if Walliserdeutsch is deemed a separate
> language then I wouldn't saddle de or gsw with the hassles of a
> macrolanguage.
> Peter