[Ltru] Re: Macrolanguages, countries & orthographies

Peter Constable petercon at microsoft.com
Fri Feb 16 17:30:58 CET 2007


> From: CE Whitehead [mailto:cewcathar at hotmail.com]


> >Whether you can read it easily is completely irrelevant. Tell me one
> >application scenario in which it would make sense to consider English and
> >Tok Pisin to be the same language.
>
> I have none for Tok Pisin and English unless someone has got them both mixed
> together in a document in such a way that it makes sense encoding the
> primary text processing language as a macrolanguage or a collection of
> languages that includes both!

Using two languages in a document doesn't make them the same language. If someone wishes to author a document that correctly tags the primary language and ignores the fact that there are bits in other languages, that's their prerogative; it doesn't mean that the tag they use has to mean some combination of all the languages involved.


> In the case of Middle French and Early Modern French, there are points at
> which they seem to be about perfectly mixed in documents; the cut-off dates
> for one or the other are indeed a bit arbitrary as the cut-off also depends
> on the locale and the writer's background.

I'm not an expert on historic French varieties, so I won't engage in discussing which cut-off dates are useful. The research librarians of the world have found a particular date to be useful for themselves and for the users of their libraries. Languages always have fuzzy boundaries; this is just one more instance of that. Researchers focused on language varieties that lie at the boundaries are always going to face a question as to how to annotate their data, whether that's a diachronic or synchronic boundary. That doesn't mean it isn't still useful to code two discrete entities that are useful distinctions for a lot of other users.


> I do think in addition that it would be a good idea that, when there are
> clearly related languages, that they could also be encoded as a collection
> at least if not a macrolanguage

If people have a legitimate need that does not bend the system to suit a small group at the expense of the majority, then those needs can be considered for coding as appropriate.


> --once again I am left reviewing the
> definitions of these, if there is a document where the varieties are mixed;
> I do still somehow feel that, in addition to creating a registry of all the
> sil languages, this group should be a bit receptive to the needs of
> encoders, especially when the languages are ancient as ancient languages are
> not part of the sil repertoire.)

SIL sub-contracts (using that term informally) historic and ancient languages to the Linguist List staff.



Peter Constable

