no linguistic content tag (was RE: Mandarin Chinese, Simplified Script)

John.Cowan jcowan at
Wed Jun 15 22:08:01 CEST 2005

Addison Phillips scripsit:

> I'm not sure I like the idea. The empty tag, while not quite the
> same level of declaration, implies this well-enough. We already have
> troublesome codes like MUL and UND. A "NOT" code would represent Yet
> Another Special Code. I like the empty tag much more for a situation
> like this.

The empty string is not actually a valid tag in RFC 3066.  It is a legal
value for xml:lang, but with different semantics: it means that no declaration
is being made about the language of an element.  So we have three related
things here:

xml:lang="": may or may not be linguistic content
xml:lang="und": definitely linguistic content, language unknown
xml:lang="xnl" (proposed): non-linguistic content
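
As a minimal sketch of the three-way distinction above, a consumer of xml:lang
values might branch as follows. The names LangStatus and classify_xml_lang are
hypothetical, "xnl" is only the tag proposed in this thread and not a registered
code, and the function does not check that other values are well-formed tags.

```python
from enum import Enum


class LangStatus(Enum):
    """Hypothetical labels for the cases discussed in this thread."""
    NO_DECLARATION = "no declaration; may or may not be linguistic content"
    UNKNOWN_LANGUAGE = "linguistic content, language unknown"
    NON_LINGUISTIC = "non-linguistic content"
    DECLARED = "linguistic content in the declared language"


# Assumption: "xnl" is the proposed tag from this thread, not a registered code.
PROPOSED_NON_LINGUISTIC_TAG = "xnl"


def classify_xml_lang(value: str) -> LangStatus:
    """Map an xml:lang attribute value to one of the cases above."""
    tag = value.lower()  # language tags are compared case-insensitively
    if tag == "":
        return LangStatus.NO_DECLARATION
    if tag == "und":
        return LangStatus.UNKNOWN_LANGUAGE
    if tag == PROPOSED_NON_LINGUISTIC_TAG:
        return LangStatus.NON_LINGUISTIC
    # Anything else is treated as an ordinary language tag, without validation.
    return LangStatus.DECLARED
```

The point of the sketch is that the empty value and "und" land in different
branches: the first says nothing about the content, the second asserts that it
is linguistic.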

> "Information items" that contain natural language generally
> should be separate from non-language bearing items.

See my previous posting.  I think the case of instrumental-music recordings
is particularly compelling.

Overhead, without any fuss, the stars were going out.
        --Arthur C. Clarke, "The Nine Billion Names of God"
                John Cowan <jcowan at>
