Structured documents and absence of language information

virach@nectec.or.th virach@nectec.or.th
Mon, 22 Apr 2002 13:56:31 +0700


The first choice is preferable, since "und" does not convey any
information. It also causes difficulties in other cases, as Martin
has mentioned.

Virach

At Mon, 22 Apr 2002 11:00:15 +0900,
Martin Duerst <duerst@w3.org> wrote:
> 
> We have so far mainly looked at two choices. The first one, and
> the one preferred in the W3C I18N WG/IG, is to use the empty string:
> 
>     xml:lang=""
> 
> The advantage of this is that it's implicitly evident, and it's
> the same as other, similar attributes. This use would have to
> be defined in the XML specification (either in a new version or
> by an erratum).
> 
> The second possibility we have considered is to use the language
> code "und" (Undetermined). One question here is whether it's
> okay to use that for things that are not in any natural language
> at all (e.g. pure numeric data, programs, mathematics,...).
> As far as I know, the ISO 639 standards don't apply to such things.
> The other problem is that we would need to change XML to say
> that e.g.
> 
> <?xml version='...' ?>
> <root xml:lang='und'>
> ...
> </root>
> 
> is the same as:
> 
> <?xml version='...' ?>
> <root>
> ...
> </root>
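
To make the first choice concrete, here is a minimal sketch of how a
consumer might resolve the language in effect for an element when
xml:lang="" is treated as "no language information". It assumes that
reading of the empty string, which is the proposal under discussion and
not something the XML specification says at the time of this thread.
The helper name effective_lang and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def effective_lang(root, target):
    """Return the language in effect for `target`, or None if there is none."""
    def walk(elem, inherited):
        # An xml:lang on this element overrides the inherited value.
        lang = elem.get(XML_LANG, inherited)
        # Assumption: an empty value means "no language information".
        if lang == "":
            lang = None
        if elem is target:
            return True, lang
        for child in elem:
            found, result = walk(child, lang)
            if found:
                return True, result
        return False, None
    return walk(root, None)[1]

# Hypothetical document: <data> switches the inherited "en" off again.
doc = ET.fromstring(
    '<root xml:lang="en"><p>Hello</p><data xml:lang="">42</data></root>'
)
p, data = doc.find("p"), doc.find("data")
print(effective_lang(doc, p))     # en
print(effective_lang(doc, data))  # None
```

Under this reading, a root element carrying xml:lang="" and a root
element with no xml:lang at all resolve to the same result, so no
special equivalence rule is needed of the kind that 'und' would require.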