Criteria for languages?

Mark Davis ☕ mark at macchiato.com
Wed Dec 2 22:37:25 CET 2009


I have nothing against the macrolanguage concept per se.

What I do find very troublesome is that the application of it seems fairly
ad hoc, with little clarity as to why it is used in one case (Latvian) and
not in another (Swiss German).

If there were a consistent policy for it, then it could be usefully applied,
and implementations could anticipate what they need to do for now and the
future. I'll throw out a strawman:

A. If

   1. X is being encoded,
   2. A reasonable person, based on information in the registry, could have
   tagged X-content as Y in the past,
   3. There is good evidence that a substantial amount of data has been so
   tagged,
   4. and X and the standard/predominant version of Y are not mutually
   comprehensible (at least to the degree that, say, Scots English and
   Mississippi English are)

Then Y should be made into a macrolanguage, and a new Z should be encoded to
represent the standard form of Y.

B. For matching, Y should match both X and Z. (X should match X, and Z
should match Z).

C. For lookup, Y should fetch content marked with Z. (X should fetch X, and
Z should fetch Z).
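
To make B and C concrete, here is a minimal sketch, not a definitive
implementation. The tags "X", "Y" and "Z" are the placeholders from the
strawman, and the tables and function names are hypothetical. Two points the
strawman leaves open are treated as assumptions: a request for Y also matches
legacy content still tagged Y, and a lookup for Y returns nothing when no
Z-tagged content is available.

```python
# Hypothetical registry data: "Y" is the macrolanguage, "Z" its newly
# encoded standard form, "X" the language formerly tagged as Y.
ENCOMPASSED = {"Y": {"X", "Z"}}  # macrolanguage -> encompassed languages
STANDARD_FORM = {"Y": "Z"}       # macrolanguage -> standard form


def matches(requested, content_tag):
    """Rule B: a request for Y matches X and Z; X and Z match only themselves."""
    if requested == content_tag:
        # Assumption: identical tags always match, including Y against
        # legacy content that is still tagged Y.
        return True
    return content_tag in ENCOMPASSED.get(requested, set())


def lookup(requested, available):
    """Rule C: a request for Y fetches Z; X fetches X and Z fetches Z."""
    target = STANDARD_FORM.get(requested, requested)
    # Assumption: no further fallback when the target is unavailable.
    return target if target in available else None
```

With these tables, matches("Y", "X") and matches("Y", "Z") both hold while
matches("X", "Z") does not, and lookup("Y", {"X", "Z"}) selects "Z".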

Mark


On Wed, Dec 2, 2009 at 12:27, Randy Presuhn <randy_presuhn at mindspring.com> wrote:

> Hi -
>
> > From: "John Cowan" <cowan at ccil.org>
> > To: "Peter Constable" <petercon at microsoft.com>
> > Cc: <ietf-languages at iana.org>; "John Cowan" <cowan at ccil.org>
> > Sent: Wednesday, December 02, 2009 11:43 AM
> > Subject: Re: Criteria for languages?
> ...
> > >             2.  'Extlang' records SHOULD NOT be created for languages if
> > >                 other languages encompassed by the macrolanguage do not
> > >                 also include 'extlang' records.
> >
> > I interpret the main clause in this sentence as applying if there are as yet
> > no other co-encompassed languages; you and Addison interpret it as
> > not applying in that case.  Consulting my local Talmudist produced a
> > definite maybe.
>
> Taken by itself the quoted text would avoid the situation where some,
> but not all, of the languages encompassed by a given macrolanguage
> would include 'extlang' records.  The sentence immediately following it
> (not quoted) is instructive:
>
>                For example, if a new
>                Serbo-Croatian ('sh') language were registered, it would
>                not get an extlang record because other languages
>                encompassed, such as Serbian ('sr'), do not include one
>                in the registry.
>
> This leads to the question of whether such a situation could also be
> resolved by adding extlang records for all those other languages,
> thus satisfying the requirement.
>
> What bothers me most about the Latvian case is that while it may
> have the same gestalt as zh, there is a huge difference in degree.
> As I understand it, only a need to distinguish Latvian and Latgalian
> has been identified, and there doesn't seem to be much expectation
> for much else to be encompassed by a Latvian macrolanguage.
> It seems that designating Latvian as a macrolanguage is serious
> overkill in this situation, and that all would be better served by treating
> Latgalian as a variant.  I'd love to hear from someone with first-hand
> knowledge of these languages.
>
> Randy