Criteria for languages?
cowan at ccil.org
Wed Dec 2 01:55:27 CET 2009
Caoimhin O Donnaile scripsit:
> The question of how precise to be with language labels is
> a serious practical one. In particular, I have run into
> the issue of whether to label Norwegian as
> Bokmål (nb) or Nynorsk (nn), or just label it as Norsk (no)
> which is now a macrolanguage. I have been developing a
> facility, http://www.smo.uhi.ac.uk/wordlink/, which links
> webpages automatically with online dictionaries, and so I need
> to label each dictionary in the database of dictionaries.
> I want to be as precise as possible, but on the other hand,
> if I understand things correctly without knowing much Norwegian,
> a Bokmål dictionary will at least work to some extent with a
> Nynorsk text so I don't want to exclude it completely by having
> a completely different label. In cases like this of very closely
> related "languages", I maybe need a concept of "language distance",
> so that a Bokmål dictionary, rather than being completely excluded
> for Nynorsk, would just get a lot of negative points. (I already
> have a "quality/usefulness" points system for dictionaries.)
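
The "language distance" idea in the quoted text can be sketched roughly as
below. This is only an illustration: the penalty values, the function names
and the (name, tag, quality) tuples are all hypothetical and are not taken
from wordlink or from any standard.

```python
# Hypothetical penalty table: points deducted when a dictionary's language
# tag differs from the tag of the text. The numbers are illustrative only.
DISTANCE_PENALTY = {
    frozenset(("nb", "nn")): 40,  # closely related, but not the same standard
    frozenset(("no", "nb")): 10,  # macrolanguage vs. one of its members
    frozenset(("no", "nn")): 10,
}


def penalty(text_lang, dict_lang):
    """Return the points to deduct, or None if the pair is unrelated."""
    if text_lang == dict_lang:
        return 0
    return DISTANCE_PENALTY.get(frozenset((text_lang, dict_lang)))


def rank_dictionaries(text_lang, dictionaries):
    """Rank (name, lang, quality) tuples for a text in text_lang.

    Dictionaries in an unrelated language are excluded; related ones are
    kept but lose points according to DISTANCE_PENALTY.
    """
    scored = []
    for name, lang, quality in dictionaries:
        p = penalty(text_lang, lang)
        if p is None:
            continue  # no known relationship: exclude completely
        scored.append((quality - p, name))
    # Highest score first; ties broken by name for a stable order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored
```

With such a table a Bokmål dictionary is still offered for a Nynorsk text,
only ranked below any Nynorsk dictionary of comparable quality.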
Are these bilingual dictionaries, or are they meant for native speakers?
Native speakers don't need dictionaries for nn vs. nb. Foreigners would
need the right kind of dictionary to get useful results, as too many lookups
John Cowan http://www.ccil.org/~cowan cowan at ccil.org
The native charset of SMS messages supports English, French, mainland
Scandinavian languages, German, Italian, Spanish with no accents, and
GREEK SHOUTING. Everything else has to be Unicode, which means you get
only 70 16-bit characters in a text instead of 160 7-bit characters.