ISO 639-3 changes
doug at ewellic.org
Mon Jan 19 22:41:47 CET 2015
ISO 639-3/RA published their annual change set last week. I will be
posting records and registration forms for changes to the Registry in
the coming weeks, sooner if time permits.
As usual, there are new code elements, retirements, merges, and splits,
all of which affect the Registry in one way or another. The PDF report
posted by the RA says that 42 changes were approved, a bit fewer than in
previous years, although there actually seem to be a few more than 42. No
macrolanguages or extlangs are affected this time around.
There are a couple of new issues this year. One involves the use of
"click" letters ǀ, ǁ, ǂ, and ǃ for certain African languages instead
of ASCII fallbacks. The RA has added a code element for ǂUngkue, using
the real letter ǂ instead of a fallback like =/Ungkue, because the
database used by the RA now supports these letters. The RA is not changing the
spelling of any existing names for now. We have always used the ASCII
fallbacks in the Registry because they were what the RA used; now may be
the time to reopen the question of whether to use the real letters
uniformly, use ASCII fallbacks uniformly, or include both.
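To make the three options concrete, here is a minimal sketch in Python. Only the "=/" fallback for ǂ comes from the example above; the fallbacks shown for the other three letters, and the function names, are assumptions for illustration, not anything the RA or the Registry defines.

```python
# Click letters and ASCII fallbacks. Only the "=/" fallback for U+01C2
# appears in the message above; the other three are assumed here.
CLICK_FALLBACKS = {
    "\u01C0": "/",   # ǀ (assumed fallback)
    "\u01C1": "//",  # ǁ (assumed fallback)
    "\u01C2": "=/",  # ǂ (fallback given in the message)
    "\u01C3": "!",   # ǃ (assumed fallback)
}


def to_ascii_fallback(name):
    """Replace each click letter in a language name with its ASCII fallback."""
    return "".join(CLICK_FALLBACKS.get(ch, ch) for ch in name)


def name_variants(name):
    """Return the names to list under the 'include both' option.

    A name without click letters yields a single entry; a name with
    click letters yields the real-letter form followed by the fallback.
    """
    fallback = to_ascii_fallback(name)
    return [name] if fallback == name else [name, fallback]


if __name__ == "__main__":
    print(to_ascii_fallback("\u01C2Ungkue"))
    print(name_variants("\u01C2Ungkue"))
```

Using the real letters uniformly would mean keeping the name unchanged, using fallbacks uniformly would mean applying to_ascii_fallback everywhere, and including both corresponds to name_variants.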
A greater problem is that the 639-3 data files, for the first time,
include the alpha-2 code element 'zg' for Standard Moroccan Tamazight,
which already has the alpha-3 code element and BCP 47 subtag 'zgh'.
According to the 639-3 Registrar, 'zg' was approved at the same time as
'zgh' (November 2012) but was not listed in the 639-3 data files or
website due to a clerical error. However, it is also not listed on the
639-2 site or data files, which are the official online resources for
This is similar to the situation that existed between 1999 and 2003,
before RFC 4646 and the Registry, when 14 alpha-2 code elements were
assigned but not listed on the official websites, and suddenly popped up
after users had begun using the alpha-3 equivalents. One reason we have
a BCP 47 Registry today is to prevent this sort of instability from
affecting language tagging.
RFC 5646, Section 2.2.1 says clearly that if 639-1 adds an alpha-2 code
element to a language that already has an alpha-3, the 2-letter subtag
will not be added to the Registry. It doesn't specify what happens if
the alpha-2 has supposedly been assigned for years but was not published
on freely available resources (I haven't purchased updates of any of the
639 standards, and wasn't supposed to need them).
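The rule as I read it can be sketched in a few lines of Python. This is only an illustration of the stability principle, not text from the RFC; the function name and the set-of-subtags representation of the Registry are assumptions, and 'xx'/'xxx' are made-up placeholders.

```python
def should_register_alpha2(registered_subtags, alpha2, alpha3):
    """Decide whether a newly assigned alpha-2 code becomes a subtag.

    registered_subtags: set of primary language subtags already in the
    Registry (a simplified stand-in for the real Registry records).
    Returns False when the language already has its alpha-3 subtag
    registered, so that existing tags remain the only valid form.
    """
    if alpha3 in registered_subtags:
        return False
    return alpha2 not in registered_subtags


if __name__ == "__main__":
    registry = {"en", "zgh"}
    # 'zgh' is already registered, so 'zg' would not be added.
    print(should_register_alpha2(registry, "zg", "zgh"))
    # Hypothetical language with neither code registered yet.
    print(should_register_alpha2(registry, "xx", "xxx"))
```

Note that the sketch looks only at what is in the Registry, not at when ISO claims the alpha-2 was assigned, which is the gap described above.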
I am assuming, because 'zg' was not publicized until users had already
had two years to begin using 'zgh' (Microsoft and CLDR, for example, are
already doing so), that BCP 47 will have to ignore the assignment of
'zg' entirely and treat it as an anomaly in the 639-3 data files. This
will surprise some users who still think the ISO websites and data files
are the source for BCP 47 subtags at the user level, but for the rest of
us, it will serve to remind us why they aren't.
A third issue this year could be the continued unavailability of online
archives of this mailing list. Reviewers might expect to be able to
refer to the archives when reviewing records and registration forms for
42+ changes. I'm hopeful that this situation will not disrupt anyone's
ability to perform the necessary reviewing.
Doug Ewell | Thornton, CO, USA | http://ewellic.org