The "not-language" identifier (was: RE: Mandarin Chinese, Simplified Script)

Caoimhin O Donnaile caoimhin at
Thu Jun 16 19:49:08 CEST 2005

> <tag xml:lang="en">
> 	<tag xml:lang="fr">
> 		<tag xml:lang=""> (1)
> 		<tag xml:lang="xnl"> (2)
> (1) appears to say that there is a language, but we're not telling.
> (2) would suggest that there is no language to be had

(2) would mean what you say all right.

But according to section 2.12 of the XML spec at least, 
(1) doesn't say whether there is or there isn't a language.  It just 
unsets xml:lang:

   "Within [the element] it is considered that there is no language
    information available, just as if xml:lang had not been
    specified on [the element] or any of its ancestors."
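
A minimal sketch of that inheritance rule, assuming Python's standard
xml.etree.ElementTree module.  The helper name effective_langs and the
id attributes are made up for illustration; they are not from the spec
or from any particular XML tool.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the reserved "xml" prefix to this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def effective_langs(elem, inherited=None, out=None):
    """Collect (id, effective language) for elem and its descendants.

    None stands for "no language information available".
    This helper is hypothetical, written only to illustrate the rule.
    """
    if out is None:
        out = []
    raw = elem.get(XML_LANG)      # None when the attribute is absent
    if raw is None:
        lang = inherited          # inherit from the nearest ancestor
    elif raw == "":
        lang = None               # xml:lang="" unsets the inherited value
    else:
        lang = raw                # an explicit language tag overrides
    out.append((elem.get("id"), lang))
    for child in elem:
        effective_langs(child, lang, out)
    return out


doc = ET.fromstring(
    '<tag id="a" xml:lang="en">'
    '<tag id="b" xml:lang="fr">'
    '<tag id="c" xml:lang=""/>'
    '<tag id="d"/>'
    '</tag>'
    '</tag>'
)
result = effective_langs(doc)
```

In this sketch the parser itself still keeps the empty attribute (the
raw value is "" rather than None), so the "unset" happens when the
effective language is resolved, not when the element is stored.  Other
XML software may well behave differently.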

Does anyone know whether elements which are tagged as
     <tag xml:lang="">
are actually stored internally by XML processing software in an 
identical fashion to elements which are tagged simply as
     <tag>
in the absence of any inherited xml:lang value?
i.e. Is xml:lang="" actually processed as an "unset" command?

And talking about sets, is an xml:lang value which names a set of languages
allowed?  For example, to tag a film as having mixed Gaelic and English 
dialogue.  Or for a document containing mixed Gaelic and English, to say 
"Allow both Gaelic and English in spell-checking" without the chore of 
labelling every word for language.  (It looks as if it isn't allowed.)

Forgive my ignorance.  I am new to XML.

Caoimhín Ó Donnaíle
