Points 3, 4 and 2 [RE: About: Tags for Identifying Languages
Addison Phillips [wM]
aphillips at webmethods.com
Mon Mar 8 20:02:49 CET 2004
Thanks for the note. I'm sorry if it looked like I was ignoring your points
(I didn't think I was, but then I also thought that the draft would answer
some of your questions as well).
See my interlinear comments below.
Addison P. Phillips
Director, Globalization Architecture
webMethods | Delivering Global Business Visibility
Chair, W3C Internationalization (I18N) Working Group
Chair, W3C-I18N-WG, Web Services Task Force
Internationalization is an architecture.
It is not a feature.
> -----Original Message-----
> From: John Clews [mailto:scripts20 at uk2.net]
> Sent: Monday, 8 March 2004 01:45
> To: aphillips at webmethods.com
> Cc: scripts20 at uk2.net; ietf-languages at alvestrand.no
> Subject: Points 3, 4 and 2 [RE: About: Tags for Identifying Languages
> Thanks Addison for your reply, and thanks to Mike Ksar for confirming the
> date which needs correcting in the draft.
> Could you also reply to my original points 3 and 4 (and 2) below, which
> you didn't cover in your reply.
> >> 3. In my view, it would also do well to allow inclusion of the widely
> >> used LOCODEs to specify locations.
> These are specified in UN/ECE RECOMMENDATION 16: UN/LOCODE (UNITED NATIONS
> CODE FOR TRADE AND TRANSPORT LOCATIONS)
> and available from the UN/CEFACT site.
> This would allow much easier specification of place in specifying language
> variants, e.g. for the Martha's Vineyard version of sign language, to
> incorporate the LOCODE string
> within the language tag.
We didn't consider LOCODEs in the design of draft-01. I haven't looked that
closely at them. The M49 materials cover the immediate needs that Mark and I
were dealing with.
I don't personally care for LOCODE, which is a bit too specific for my
tastes. It also incorporates all the problems that ISO3166 has (WRT
stability and ambiguity) since it uses ISO3166 as a basis.
> >> 4. In my view, it would also do well to refer to ISO 639-3 codes,
> >> once that gets passed.
> That is under development. Will that be covered when, as you put it, the
> draft RFC will "advance to the next stage of standardization (or be
> revised so that it can so progress?" You didn't mention that in your
Not unless ISO639-3 advances before the draft does. It isn't realistic to
incorporate ISO639-3 normatively before it exists. And probably it would be
a good idea to see that ISO639-3 meets the various requirements they've set
out for themselves (with regard to compatibility, stability, etc.), i.e. to
examine the results before incorporating that standard.
I'm not suggesting anything untoward about ISO639-3. I'm merely noting that
incorporating that standard into RFC3066 should be done on the basis of the
All we can do now is provide support for it (which is what the
whole -s-extlang stuff is about). When ISO639-3 is done or nearing
completion, RFC3066:bis can itself be revised to incorporate ISO639-3
normatively. I certainly hope we're not still working on various 3066:bis
drafts in a year when that takes place!
> In addition, you also mention:
> > The newest draft includes UN M49 codes.
> That's true, though this is a workaround for what I raised in my point 2:
I don't see why M49's utility is reduced by that.
> >> 2. It covers the CS/CS problem well in dealing with ISO 3166 codes
> >> (though naturally it would be better if the ISO 3166/MA didn't do such
> >> stupid things - has anybody heard of top-level actions regarding the
> >> allocation of the CS code in ISO 3166?)
> There needs to be some rationale for why the code YU would not do, or
> whether YU would do in certain instances. In the case of both YU and CS,
> there are smaller entities that exist instead of the larger entities which
> those originally represented (as is also the case for SU).
You can use YU if you want to. It might not be a good idea to do so (you may
be offending someone), but it is permitted by rfc3066:bis. 'YU' is a code
for a (defunct) country.
> However, there could be legitimate historical reasons for having cs-CS as
> well as sr-YU, so why is a string of numerical digits listed as the only
Numeric subtags from M49 are an option only in the case where ISO3166
assigns a country an alpha-2 code that was previously assigned to another
country. M49 is advertised as being stable and consistent, therefore Mark
and I (with support from others on this list) incorporated it as a way of
tagging content for countries that have the misfortune to get assigned a
> And how would people know when to use digit codes rather than
> 2-letter codes?
It's very clear in the draft: there will be an informative registration in
the IANA registry. In addition, the class of super- and sub-national codes
from M49 can be used at any time. Otherwise you MUST use the ISO3166 alpha2.
This is comparable to requiring the use of the ISO639-1 alpha2 codes instead
of the ISO639-2 alpha3 codes where they exist.
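
The rule described above can be sketched in a few lines of Python. This is only an illustration of one reading of this message, not text from the draft: the function name, the two lookup tables, and their contents are hypothetical stand-ins for what the IANA registry and UN M49 would actually supply.

```python
# Illustrative data only; NOT copied from the draft or from the IANA registry.
# Alpha-2 codes that ISO 3166 has assigned to more than one country over time.
REASSIGNED_ALPHA2 = {"CS"}

# A few UN M49 codes for areas larger than a country (no alpha-2 code exists).
M49_AREAS = {
    "001": "World",
    "150": "Europe",
    "419": "Latin America and the Caribbean",
}


def allowed_region_subtags(alpha2=None, m49=None):
    """Return the region subtags permitted under the rule sketched above.

    alpha2 -- ISO 3166 alpha-2 code for the region, or None if there is none
    m49    -- UN M49 numeric code for the region as a 3-digit string, or None
    """
    if alpha2 is None:
        # Super- and sub-national areas have only an M49 code.
        if m49 in M49_AREAS:
            return [m49]
        raise ValueError("no usable region code")

    alpha2 = alpha2.upper()
    if alpha2 in REASSIGNED_ALPHA2 and m49 is not None:
        # The alpha-2 code is ambiguous over time, so the numeric
        # code becomes an additional option.
        return [alpha2, m49]

    # Otherwise the alpha-2 code is the only permitted form.
    return [alpha2]
```

For an ordinary country the function returns only the alpha-2 form; the numeric form shows up as an option only for a code listed as reassigned, or for an area that has no alpha-2 code at all.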
> And is there any software etc which specifies using 2-letter codes, which
> would invalidate use of 3-digit codes?
There is plenty of software that assumes that region codes all take the
alpha2 form. This software will not be able to store the 3-digit code. But
then, these applications won't be able to deal with reassignment of codes
very well either (the reason for moving to M49).
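
To make the compatibility point concrete, here is a minimal sketch of the two kinds of check. The pattern and function names are hypothetical and not taken from any particular product; the second pattern only tests the shape of the subtag, not whether the code is actually assigned.

```python
import re

# What much existing software assumes: a region is exactly two letters.
ALPHA2_ONLY = re.compile(r"[A-Za-z]{2}")

# What handling M49 codes requires: two letters or three digits.
ALPHA2_OR_M49 = re.compile(r"[A-Za-z]{2}|[0-9]{3}")


def is_region_legacy(subtag):
    """True if the subtag has the two-letter shape older code expects."""
    return ALPHA2_ONLY.fullmatch(subtag) is not None


def is_region_extended(subtag):
    """True if the subtag has either the alpha-2 or the 3-digit shape."""
    return ALPHA2_OR_M49.fullmatch(subtag) is not None
```

A check like is_region_legacy accepts "YU" but rejects "419", which is the failure mode described above. Widening the pattern is the easy part; software that stores the region in a fixed two-character field needs a schema change as well.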
Really, RFC3066:bis makes all of this quite a bit easier to deal with. In
the past it was possible for there to be registrations (with lengths != 2)
with regional meanings. And you can have (as with the sign language codes)
two or more subtags with some kind of regional meaning. RFC3066:bis does
away with that. There are ISO3166 alpha2 codes and, in isolated cases, UN
M49 codes. Registrations can be made that have "regional" meanings, but
these will be limited to the "variant" slot in the tag.
> John Clews,
> Keytempo Limited (Information Management),
> 8 Avenue Rd, Harrogate,
> HG2 7PG
> Tel: +44 1423 888 432 (landline)
> Tel: +44 7766 711 395 (mobile)
> Email: scripts20 at uk2.net