millosh at gmail.com
Wed Jun 16 21:21:06 CEST 2010
On Wed, Jun 16, 2010 at 20:55, Peter Constable <petercon at microsoft.com> wrote:
> I gather you mean the latter. Assuming that to be the case, we need to distinguish between the question of whether there is a single "individual language" (i.e., range of language varieties deemed for practical purposes to be one language, different from other languages) and the question of whether (assuming a single language) there is one shared identity for that language. Note, for instance, that entries for individual languages in many cases list multiple names that may be used by different communities (e.g., Asturian / Bable / Leonese / Asturleonese).
Assume that you want to build a formal machine translation engine.
You could do the following:
* Take, for example, Serbian. It is prescriptively useful, as it
allows the most varieties.
* Make a formal description of the standard Serbian language ( 0:) ).
* Make a correlation between the Ekavian and Iyekavian standards. (Both are
* Take extra words from Croatian and Bosnian and mark them appropriately.
* Remove strictly Serbian words for Croatian.
* In the Croatian engine, restrict some syntactic forms acceptable in Serbian.
* You have covered all standards.
So, linguistically speaking, if you have one standard covered, you'll
need 5% extra effort to cover all the other standards.
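
The steps above can be sketched roughly as one base lexicon plus small
per-standard overlays. This is only an illustration under my own
assumptions: the standard labels, the table names and the handful of
words are hypothetical choices, not a real engine, and syntax is left
out entirely.

```python
# Steps 1-2: a (tiny, illustrative) description of standard Serbian, Ekavian.
BASE_SR_EKAVIAN = {
    "water": "voda",
    "milk": "mleko",
    "bread": "hleb",
    "train": "voz",
    "coffee": "kafa",
}

# Step 3: correlation between Ekavian and Iyekavian forms.
EKAVIAN_TO_IYEKAVIAN = {
    "mleko": "mlijeko",
    "hleb": "hljeb",
}

# Steps 4-5: extra words from Croatian and Bosnian, marked by standard.
# An overlay entry replaces the strictly Serbian word for that standard.
# The labels used as keys here are hypothetical, chosen for this sketch.
OVERLAYS = {
    "hr": {"bread": "kruh", "train": "vlak", "coffee": "kava"},
    "bs": {"coffee": "kahva"},
}


def build_lexicon(standard):
    """Derive the lexicon of one standard from the Serbian base."""
    lexicon = dict(BASE_SR_EKAVIAN)
    if standard != "sr-ekavian":
        # Every other standard in this sketch uses Iyekavian forms.
        lexicon = {
            concept: EKAVIAN_TO_IYEKAVIAN.get(form, form)
            for concept, form in lexicon.items()
        }
    # Apply the marked words for this standard, if there are any.
    lexicon.update(OVERLAYS.get(standard, {}))
    return lexicon


if __name__ == "__main__":
    for standard in ("sr-ekavian", "sr-iyekavian", "hr", "bs"):
        print(standard, build_lexicon(standard))
```

The point of the layout is that only the overlays and the correlation
table differ between standards; everything else is shared with the base.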
BUT, naming is *the* problem. It probably won't be a big deal to make
some linguistic grouping. The most radical separatist wing, Croatian
linguists, treat the language area linguistically as "Central
South Slavic", which makes linguistic sense. But it is a problem to
mark all standards as "Serbo-Croatian".
More information about the Ietf-languages