Criteria for languages?

Peter Constable petercon at
Thu Dec 3 04:32:34 CET 2009

While this sounds like a manageable operational principle, I fear it might lead us to create macrolanguage entries more often than would be desirable.

IMO, macrolanguages are generally not desirable. Case in point: the messiness they can create led us to debate for months/years over Chinese, and in the end we barely arrived at a compromise that left nobody thrilled (and it remains to be seen to what extent interop is less than all we might like to see).

With that in mind, I don’t have as much concern about a lack of consistently applied principles; the principle I’d hope for is to avoid them if possible, and consider on a case-by-case basis if there is legacy that in some way prevents us from doing so.

A side note: as John has observed, it is not for us to make policy on creation of macrolanguages; the most we can do is agree on a recommendation we’d like to relate to the RAs / JAC / TC37 regarding such a policy.


From: ietf-languages-bounces at [mailto:ietf-languages-bounces at] On Behalf Of Mark Davis
Sent: Wednesday, December 02, 2009 1:37 PM
To: Randy Presuhn
Cc: ietf-languages at
Subject: Re: Criteria for languages?

I have nothing against the macrolanguage concept per se.

What I do find very troublesome is that the application of it seems fairly ad hoc, with little clarity as to why it is used in one case (Latvian) and not in another (Swiss German).

If there were a consistent policy for it, then it could be usefully applied, and implementations could anticipate what they need to do for now and the future. I'll throw out a strawman:

A. If

  1.  X is being encoded,
  2.  A reasonable person, based on information in the registry, could have tagged X-content as Y in the past
  3.  There is good evidence that a substantial amount of data has been so tagged,
  4.  and X and the standard/predominant version of Y are not mutually comprehensible (at least to the degree that, say, Scots English and Mississippi English are)
Then Y should be made into a macrolanguage, and a new Z should be encoded to represent the standard form of Y.

B. For matching, Y should match both X and Z. (X should match X, and Z should match Z).

C. For lookup, Y should fetch content marked with Z. (X should fetch X, and Z should fetch Z).
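
Rules B and C above could be sketched roughly as follows. This is only an illustration of the strawman, not anything taken from the registry or from BCP 47: the tags "x", "y" and "z", the two tables, and the function names are all hypothetical, and treating Y-tagged content as matching a request for Y is an assumption the strawman does not spell out.

```python
from typing import Dict, Optional

# Hypothetical tags: "y" is the macrolanguage, "x" the newly encoded language,
# and "z" the new code for the standard form of "y".
ENCOMPASSED = {"y": {"x", "z"}}   # macrolanguage -> encompassed languages
STANDARD_FORM = {"y": "z"}        # macrolanguage -> its standard form


def matches(requested: str, content_tag: str) -> bool:
    """Rule B: Y matches both X and Z; X and Z match only themselves."""
    if requested == content_tag:
        # Assumption: identical tags always match (covers legacy Y content).
        return True
    return content_tag in ENCOMPASSED.get(requested, set())


def lookup(requested: str, available: Dict[str, str]) -> Optional[str]:
    """Rule C: a request for Y fetches content marked Z; X and Z fetch themselves."""
    target = STANDARD_FORM.get(requested, requested)
    # The strawman does not say what to do with legacy content tagged Y,
    # so this sketch simply returns None when the target tag is absent.
    return available.get(target)
```

Note the asymmetry: matching with Y is broad (it accepts X or Z content), while lookup with Y is narrow (it resolves only to Z).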


On Wed, Dec 2, 2009 at 12:27, Randy Presuhn <randy_presuhn at<mailto:randy_presuhn at>> wrote:
Hi -

> From: "John Cowan" <cowan at<mailto:cowan at>>
> To: "Peter Constable" <petercon at<mailto:petercon at>>
> Cc: <ietf-languages at<mailto:ietf-languages at>>; "John Cowan" <cowan at<mailto:cowan at>>
> Sent: Wednesday, December 02, 2009 11:43 AM
> Subject: Re: Criteria for languages?
> >             2.  'Extlang' records SHOULD NOT be created for languages if
> >                 other languages encompassed by the macrolanguage do not
> >                 also include 'extlang' records.
> I interpret the main clause in this sentence as applying if there are as yet
> no other co-encompassed languages; you and Addison interpret it as
> not applying in that case.  Consulting my local Talmudist produced a
> definite maybe.
Taken by itself the quoted text would avoid the situation where some,
but not all, of the languages encompassed by a given macrolanguage
would include 'extlang' records.  The sentence immediately following it
(not quoted) is instructive:

               For example, if a new
               Serbo-Croatian ('sh') language were registered, it would
               not get an extlang record because other languages
               encompassed, such as Serbian ('sr'), do not include one
               in the registry.
This leads to the question of whether such a situation could also be
resolved by adding extlang records for all those other languages,
thus satisfying the requirement.
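
The two readings of the quoted rule can be made concrete with a small sketch. It is hypothetical throughout: the function name, its parameters, and the choice to require that all co-encompassed languages have extlang records are assumptions made for illustration, not text from the draft.

```python
from typing import List


def extlang_allowed(sibling_has_extlang: List[bool],
                    allow_when_no_siblings: bool) -> bool:
    """Decide whether a newly registered encompassed language gets an extlang record.

    sibling_has_extlang holds one flag per language already encompassed by the
    same macrolanguage. allow_when_no_siblings selects the reading of the rule
    for the case where no such languages exist yet:
      False -> the SHOULD NOT still applies (the reading John describes as his)
      True  -> the rule does not apply (the reading he attributes to Peter and Addison)
    """
    if not sibling_has_extlang:
        return allow_when_no_siblings
    # Assumption: the rule aims at all-or-nothing consistency, so a single
    # sibling without an extlang record blocks the new one.
    return all(sibling_has_extlang)
```

Under this sketch the Serbo-Croatian example comes out as the draft says: with Serbian lacking an extlang record, `extlang_allowed([False], True)` is false under either reading. The readings differ only when the list is empty.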

What bothers me most about the Latvian case is that while it may
have the same gestalt as zh, there is a huge difference in degree.
As I understand it, only a need to distinguish Latvian and Latgalian
has been identified, and there doesn't seem to be much expectation
for much else to be encompassed by a Latvian macrolanguage.
It seems that designating Latvian as a macrolanguage is serious
overkill in this situation, and that all would be better served by treating
Latgalian as a variant.  I'd love to hear from someone with first-hand
knowledge of these languages.


