Picking and choosing

Doug Ewell dewell at roadrunner.com
Thu Jan 31 01:14:15 CET 2008


Frank Ellermann <nobody at xyzzy dot claranet dot de> wrote:

> Maybe you recall what Mr. Cowan and others had to say when there were 
> minor errors in the registry for a few days.

Here is my complete list of errors in Registry releases:

2006-03-08: Subtag 'fy' was listed twice, with two different 
descriptions.

2006-04-19: New multi-line comment for subtag 'GB' was added without 
leading spaces on second line; then "Type" field was deleted from 
following subtag 'GD'.

2006-10-17: Subtag 'AX' was added without proper escaping on Description 
field ("&#xC5;land Islands"); plain UTF-8 used instead.

2007-12-05: New Registry was released with 17 substantive changes, but 
File-Date record was not changed.

Each of these errors, except File-Date, was a syntactical error that 
could have caused a program reading the Registry to malfunction.  The 
File-Date error could have misled a program into thinking it had the 
latest version when it did not.  These are much more serious problems 
than if the Registry includes a valid subtag for a bogus or doubtful 
entity such as Europanto.
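
A minimal sketch of how a program reading the Registry might guard against these classes of error, assuming the record-jar layout of RFC 4646 (records separated by "%%", folded lines starting with whitespace, non-ASCII characters escaped as "&#x...;").  The names parse_records and check_registry are hypothetical and not part of any existing tool:

```python
import re

# A field line looks like "Name: value"; folded lines start with whitespace.
FIELD_RE = re.compile(r"^([A-Za-z][A-Za-z0-9-]*)\s*:\s*(.*)$")


def parse_records(text):
    """Split record-jar text into records; return (records, problems)."""
    records, problems, current = [], [], []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.isascii():
            problems.append(f"line {lineno}: unescaped non-ASCII character")
        if line.strip() == "%%":
            records.append(current)
            current = []
        elif line[:1] in (" ", "\t"):
            if current:
                name, value = current[-1]
                current[-1] = (name, value + " " + line.strip())
            else:
                problems.append(f"line {lineno}: folded line with no field")
        else:
            match = FIELD_RE.match(line)
            if match:
                current.append((match.group(1), match.group(2)))
            elif line.strip():
                problems.append(f"line {lineno}: not a field or folded line")
    records.append(current)
    return records, problems


def check_registry(text, previous=None):
    """Return a list of problems found in one Registry release.

    'previous' is the text of the prior release, if available; it is
    used only to detect a File-Date that was not updated.
    """
    records, problems = parse_records(text)
    header, entries = records[0], records[1:]
    file_date = dict(header).get("File-Date")
    if file_date is None:
        problems.append("first record has no File-Date field")
    seen = set()
    for fields in entries:
        if not fields:
            continue
        record = dict(fields)
        key = (record.get("Type"), record.get("Subtag") or record.get("Tag"))
        if key[0] is None:
            problems.append(f"record {key[1]!r} has no Type field")
        elif key in seen:
            problems.append(f"duplicate {key[0]} subtag {key[1]!r}")
        seen.add(key)
    if previous is not None:
        old_records, _ = parse_records(previous)
        old_date = dict(old_records[0]).get("File-Date")
        if old_records[1:] != entries and old_date == file_date:
            problems.append("entries changed but File-Date did not")
    return problems
```

Calling check_registry(new_text, previous=old_text) returns an empty list when none of these checks fire.  The sketch does not validate subtag syntax or field values; it only covers the four kinds of error listed above.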

I am starting to think 'eur' is like one of those non-existent streets 
that mapmakers add to their maps to catch plagiarists.  Such streets are 
bogus, but don't usually cause any real harm.

>> Much less wise is thinking that it would be feasible for us to sift 
>> through 7500 entries and determine which are "right" and which are 
>> "wrong".
>
> It is straight forward to design a policy where only ISO 639-2 
> languages are registered (as is), and otherwise languages can be 
> requested here (as is), taking either an existing 639-3 code (that 
> would be new), or the "registered language" loophole (as is).

It is straightforward to design such a policy, but it would be almost 
impossible to administer it.  The purpose of incorporating existing ISO 
standards in the first place was to avoid having to review and register 
each item individually.

Peter's point remains: more than 7,000 languages are encoded in 639-3 
but not in 639-2, and what is proposed here is to require prospective 
users to request each one individually, then require the Reviewer and 
ietf-languages to duplicate the work of ISO 639-3/RA in deciding whether 
each one is good enough for the Registry.

Requiring prospective users to request variant subtags for dialects and 
orthographies and such is much more appropriate, because those 
categories are open-ended and there is no other existing standard that 
attempts to catalog them.

> Maybe a hypothetical successor of RFC 4646 closes this loophole, as 
> former LTRU contributor I proposed to close it stating "not good 
> enough for ISO 639-3 is also not good enough for IANA" as reason.

Evidently Europanto was good enough for ISO 639-3.

Per Michael's request, I am finished with this thread on ietf-languages. 
Any changes to 4645bis will have to be the result of rough consensus on 
LTRU, as always.

--
Doug Ewell  *  Fullerton, California, USA  *  RFC 4645  *  UTN #14
http://www.ewellic.org
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages


