Pending requests

Kent Karlsson kent.karlsson14 at
Fri Nov 27 07:06:04 CET 2015

On 2015-11-27 03:38, "Michael Everson" <everson at> wrote:

> On 26 Nov 2015, at 23:31, Kent Karlsson <kent.karlsson14 at> wrote:
>>> The specific simple variety of English for which a subtag has been sought is
>>> precisely the one used on the Wikipedia, as defined there. en-wpsimple is
>>> well-defined by the Wikipedia.
>> Yes, but as I said: it is (more or less) for Wikipedia only (and then only
>> for one or a few languages). It is basically useless for anyone else.
> The Wikipedia is a big and important application, is it not?

Yes, it is. So en-levelB2 (if that corresponds to what they are targeting)
would work just fine for Wikipedia and many others. That Wikipedia has
house rules specifying precisely what the level is, is fine, but that is
nothing "we" should encode. Likewise for the VoA "Learning English" levels
(which I do think can be found to correspond to CEFR levels). They have
house rules (I'd assume, though they don't appear to have published them),
but "we" should not attempt to embody the house rules in the LSR.
>> [Regarding the CEFR scheme]
>>> I think that ranking levels of simplicity is way outside the scope of
>>> our project. 
>> In that case, "Ogden's Basic English" and 'wpsimple' would be "way outside
>> the scope of our project" as well.
> No, they aren't. Our subtags describe linguistic entities, not hierarchies of
> language-learning and speaker competence.

Language-learning and speaker competence levels of a particular language are
linguistic entities as well. "wpsimple" is just one instance.

>>> ISO 639 is codes for the representation of names of languages.
>>> Our subtags are too, just at a different level of granularity.
>> "<language X> 'subset' at level <n>"; e.g. 'es-levelB1' would be the tag
>> for "Spanish intended for readers/listeners at CEFR proficiency level B1".
>> That would be just as fine as "en-ogden" tag for "English according to
>> Ogden's  Basic English", and the former would be *way* more useful,
>> and the scheme is useable for any language, amplifying the usefulness
>> of CEFR level variant subtags.
> Scouse is Scouse. Basic English is Basic English. Basic English differs from
> Wikipedia's Simple English and if there are other controlled vocabularies or
> forms then they differ from both. It is wrong-headed to try to figure out
> which "CEFR proficiency level" Ogden's Basic English matches, because it

I never said it did. Ogden's Basic English just appears strange.

> doesn't match any of them. It is defined by its own definitions, and CEFR
> learners of English are not constrained by Basic English's rules.

Of course.

> en-scouse points directly at Scouse. en-cornu points directly at
> Cornu-English/Anglo-Cornish/Cornish English. en-basiceng would point directly
> at Basic English. CEFR hierarchies have nothing to do with this. Our subtags
> point at things. I don't think it is within our scope to pick a set of CEFR
> definitions and attempt to apply them (on the basis of no research) to one or
> more varieties of controlled vocabulary and syntax. The CEFR is ALL about
> learner competence with regard to standard language, and Basic English and
> Wikipedia Simple English are examples of controlled language (engineered
> language, not constructed language), not examples of standard language.

I am disregarding Ogden's Basic English here (which must NOT get the subtag
'basiceng'); it is not "simplified English", but rather a (strangely)
"constrained English".

No, I don't say that *WE* should attempt to apply CEFR levels to the
simplified forms used by Wikipedia, VoA, or anyone else. That should be up
to Wikipedia, VoA, and anyone else (respectively) [and these I do think can
be reasonably mapped; by THEM, not us]. But I do not find it
appropriate to encode house rules for company/organisation so-and-so
in the LSR, regardless of how big they are. BUT we should cater for this
use case (simplified, or learner's, language) in the LSR in a general
manner, not by encoding house rules. The latter are of course needed, but
they are a matter for each "house" (company/organisation), not the LSR.
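
As a rough illustration of what "in a general manner" could look like to a
consumer of such tags, here is a minimal sketch. It assumes variant subtags
of the form 'levelA1' to 'levelC2'; these are hypothetical and not
registered in the LSR, and the function name cefr_level is likewise made up
for this example.

```python
import re

# Hypothetical variant subtags "levelA1" .. "levelC2" (NOT registered in the LSR).
# Language tags are case-insensitive, so the match ignores case.
CEFR_VARIANT = re.compile(r"^level([ABC][12])$", re.IGNORECASE)


def cefr_level(tag):
    """Return the CEFR level named by a hypothetical 'levelXX' subtag, or None."""
    subtags = tag.split("-")
    for subtag in subtags[1:]:  # skip the primary language subtag
        match = CEFR_VARIANT.match(subtag)
        if match:
            return match.group(1).upper()
    return None


if __name__ == "__main__":
    for example in ("es-levelB1", "en-levelB2", "en-scouse"):
        print(example, cefr_level(example))
```

The same check works for any primary language subtag, whereas a house-rule
subtag such as 'wpsimple' would need its own special case for each house.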

/Kent K

More information about the Ietf-languages mailing list