The "not-language" identifier (was: RE: Mandarin Chinese,
Simplified Script)
Caoimhin O Donnaile
caoimhin at smo.uhi.ac.uk
Thu Jun 16 19:49:08 CEST 2005
> <tag xml:lang="en">
> <tag xml:lang="fr">
> <tag xml:lang=""> (1)
> <tag xml:lang="xnl"> (2)
>
> (1) appears to say that there is a language, but we're not telling.
> (2) would suggest that there is no language to be had
(2) would mean what you say all right.
But according to section 2.12 of http://www.w3.org/TR/REC-xml/ at least,
(1) doesn't say whether there is or isn't a language. It just
unsets xml:lang:
"Within [the element] it is considered that there is no language
information available, just as if xml:lang had not been
specified on [the element] or any of its ancestors."
Does anyone know whether elements which are tagged as
<tag xml:lang="">
are actually stored internally by XML processing software in an
identical fashion to elements which are tagged simply as
<tag>
in the absence of any inherited xml:lang value?
That is, is xml:lang="" actually processed as an "unset" command?
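One way to probe this is a small sketch with Python's standard xml.etree.ElementTree. This is only one processor, and others may well behave differently. The helper name effective_langs and the sample document are made up for illustration. In this library the empty attribute is kept literally as "", so the two forms are not stored identically; treating "" as "unset" is left to the code that resolves inheritance, roughly like this:

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the predefined xml: prefix under this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def effective_langs(root):
    """Return (tag, language) pairs in document order; None means unset.

    Hypothetical helper, not part of ElementTree.
    """
    pairs = []

    def walk(elem, inherited):
        # An xml:lang attribute on the element overrides the inherited value.
        value = elem.get(XML_LANG, inherited)
        # Treat the empty string as "no language information available".
        lang = value if value else None
        pairs.append((elem.tag, lang))
        for child in elem:
            walk(child, lang)

    walk(root, None)
    return pairs

doc = ET.fromstring(
    '<doc xml:lang="en"><a/><b xml:lang=""><c/></b><d xml:lang="gd"/></doc>'
)
for tag, lang in effective_langs(doc):
    print(tag, lang)
```

Here <b xml:lang=""> and its child <c> come out with no language, exactly as they would if <doc> carried no xml:lang at all, while the raw attribute on <b> is still present as an empty string.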
And talking about sets, is something like
xml:lang="en,gd"
allowed? For example, to tag a film as having mixed Gaelic and English
dialogue. Or, for a document containing mixed Gaelic and English, to say
"Allow both Gaelic and English in spell-checking" without the chore of
labelling every word for language. (It looks from
http://www.w3.org/TR/REC-xml/ as if it isn't allowed.)
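A rough way to see why the comma form fails is to check the value against the basic shape of an RFC 3066 language tag: a primary subtag of 1 to 8 letters, then optional hyphen-separated subtags of 1 to 8 letters or digits. The sketch below is purely syntactic and assumes that shape is the relevant rule; it does not consult any registry of valid codes, and the function name is made up for illustration.

```python
import re

# Shape of an RFC 3066 tag: 1-8 letters, then "-" plus 1-8 letters/digits, repeated.
RFC3066_SHAPE = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*")

def looks_like_language_tag(value):
    """Syntactic check only; says nothing about whether the code is registered."""
    return RFC3066_SHAPE.fullmatch(value) is not None

for candidate in ["en", "gd", "en-GB", "en,gd"]:
    print(candidate, looks_like_language_tag(candidate))
```

A value such as "en,gd" is rejected because a comma cannot appear anywhere in a tag, so a list of languages does not fit in a single xml:lang value.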
Forgive my ignorance. I am new to XML.
Caoimhín Ó Donnaíle