Proposed Successor to RFC 3066 (language tags)

Addison Phillips [wM] aphillips at
Sun Nov 23 19:15:25 CET 2003

Hi John,

I've always taken 'zh' to mean "Chinese", which is, as we all recognize,
pretty non-specific. I think that's actually the ISO639 position, which
RFC3066 inherits.

I think it is also well recognized that this is not really adequate for
anyone's needs in tagging Chinese languages and dialects. At least with the
'new regime' we are able to be highly specific--script, historical
association, geographical location, and anything that can be a "property" of
a language--which should address most needs.
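
To make "highly specific" a bit more concrete, here is a minimal sketch in
Python that labels each subtag of a tag like the ones quoted below. Note that
classify_subtags and its shape rules (four letters = script, two letters =
region, leading digit = date, anything after 'x' = private use, everything
else = variant) are my own assumptions for illustration; they are a guess
based on how the example tags look, not the grammar from the draft.

```python
import re

def classify_subtags(tag):
    """Label each hyphen-separated subtag by its shape (heuristic only)."""
    parts = tag.split("-")
    # Assumption: the first subtag is always the base language.
    labeled = [("language", parts[0])]
    private = False
    for sub in parts[1:]:
        if private:
            # Everything after the 'x' singleton is treated as private use.
            labeled.append(("private", sub))
        elif sub.lower() == "x":
            private = True
        elif re.fullmatch(r"[A-Za-z]{4}", sub):
            labeled.append(("script", sub))
        elif re.fullmatch(r"[A-Za-z]{2}", sub):
            labeled.append(("region", sub))
        elif sub[:1].isdigit():
            labeled.append(("date", sub))
        else:
            labeled.append(("variant", sub))
    return labeled

if __name__ == "__main__":
    for example in ("zh-Hant-CN-2003-xiang", "zh-xiang-750BCE", "zh-x-dialect=xiang"):
        print(example, classify_subtags(example))
```

A shape-based split like this cannot tell a sub-language such as 'xiang' from
any other variant, which is really the question at issue here.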

Sub-language tag registration really bothers me, though. A glance at the
IANA registry shows a large-ish number of tags that were registered and then
deprecated because ISO639 recognized them as full-fledged languages later
(whatever *that* means). I suspect that sl-rozaj will go that way as well.
RFC3066 is really not in a position, IMO, to be making judgements about what
ISO639 will or won't do, and the whole nature of base language tags is such
that I don't think we can codify it well enough to have a certain rigor to

We did leave open the ability to register a 'base language' tag. Combine
that with powerful mechanisms for describing quite granular sub-languages
and we should be able (within the limits of the abstraction) to describe
most text pretty accurately. It seems to me that the nature of ISO639 is
such that a 'sub-language' that comes before script and geographic location
rises to the level of a language tag in ISO639. Why are Chinese dialects

RFC3066 should have a firmer policy, I believe, of not registering languages
until ISO639 has positively made a determination. If 'zh' is Mandarin, then
'xiang' should get its own ISO639 code. If 'zh' is generic Chinese (whatever that
is), then 'xiang' may still warrant an ISO639 tag, and failing that should
be considered for a base language registration. Do you think that 'xiang' or
some other dialect forms a case that goes outside of this? What would be the
criteria for registering a subtag like that?

Best Regards,


Addison P. Phillips
Director, Globalization Architecture
webMethods | Delivering Global Business Visibility
Chair, W3C Internationalization (I18N) Working Group
Chair, W3C-I18N-WG, Web Services Task Force

Internationalization is an architecture.
It is not a feature.

> -----Original Message-----
> From: John Cowan [mailto:cowan at]
> Sent: Saturday, November 22, 2003 4:41 PM
> To: Addison Phillips [wM]
> Cc: ietf-languages at
> Subject: Re: Proposed Successor to RFC 3066 (language tags)
>
> Addison Phillips [wM] scripsit:
>
> > zh-xiang
> > zh-CN-xiang
> > zh-xiang-2003
> > zh-xiang-750BCE  # this date is random.
> > zh-CN-xiang-2003
> > zh-Hant-CN-2003-xiang
> > zh-x-dialect=xiang
> > zh-Hant-CN-xiang-scouse-2003-boont
>
> If "zh" means Mandarin, then Xiang is not a variant of zh.  If zh means
> Chinese generically, which is the RFC 3066 assumption, then there is a
> case for registered sub-language tags, allowing zh-yue-CN vs. zh-yue-US,
> for example.
>
> --
> Overhead, without any fuss, the stars were going out.
>         --Arthur C. Clarke, "The Nine Billion Names of God"
>                 John Cowan <jcowan at>
