Tagging videoclips in XML (RE: Swiss german, spoken)

Addison Phillips addison.phillips at quest.com
Wed Jun 15 17:10:14 CEST 2005

The purpose of xml:lang is as Harald describes it: indicating the language of an element's content. That is, xml:lang is best used to identify the language of the text contained by the element (including any sub-elements).
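
As a rough illustration of the "including any sub-elements" part, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The sample markup and the function name effective_langs are made up for this example; the only fixed piece is the namespace URI that the xml: prefix is bound to.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample: <q> overrides the language, <em> inherits it.
SAMPLE = '<p xml:lang="en">Say <q xml:lang="de">Guten Tag</q> to <em>everyone</em>.</p>'


def effective_langs(elem, inherited=None):
    """Return (tag, language) pairs, applying xml:lang inheritance."""
    # An element's own xml:lang wins; otherwise it takes its parent's value.
    lang = elem.get(XML_LANG, inherited)
    result = [(elem.tag, lang)]
    for child in elem:
        result.extend(effective_langs(child, lang))
    return result


if __name__ == "__main__":
    for tag, lang in effective_langs(ET.fromstring(SAMPLE)):
        print(tag, lang)
```

ElementTree does not track parent elements, which is why the inherited value is passed down the recursion rather than looked up.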

If you want to have an element or attribute whose value is a language, then you should use RFC 3066 (or its successor) to form the value, but you should define an element or attribute of your own with a different name (and not use xml:lang).

For example, in XHTML, there is an hreflang attribute in the <a> element and also an xml:lang (or lang attribute, in the case of HTML) for the content of the <a> element:

<a xml:lang="en" href="xyz" hreflang="de">Click for German</a>

Using Harald's example, your XML might look like:

  <title xml:lang="en">Casablanca</title>
  <runningTime value="137" /> <!-- not language affected -->
  <dialogue language="zh-min" />
  <subtitles track="1" language="zh-Hant" />
  <subtitles track="2" language="zh-Hans" />
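
To make the distinction concrete, here is a small Python sketch of how a consumer might read such a record. It assumes a wrapper element named movie (not part of the example above, added only so the fragment parses as one document) and a hypothetical helper called describe. The point is that xml:lang describes the title text itself, while the language attributes are ordinary application data.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# The <movie> wrapper is an assumption for this sketch.
DOC = """<movie>
  <title xml:lang="en">Casablanca</title>
  <runningTime value="137" />
  <dialogue language="zh-min" />
  <subtitles track="1" language="zh-Hant" />
  <subtitles track="2" language="zh-Hans" />
</movie>"""


def describe(xml_text):
    """Collect the language information carried by the record."""
    root = ET.fromstring(xml_text)
    title = root.find("title")
    return {
        # Language OF the element's text content.
        "title_text": title.text,
        "title_lang": title.get(XML_LANG),
        # Languages recorded AS data about something outside the XML.
        "dialogue_lang": root.find("dialogue").get("language"),
        "subtitle_langs": {
            s.get("track"): s.get("language") for s in root.findall("subtitles")
        },
    }


if __name__ == "__main__":
    print(describe(DOC))
```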

In addition, while it is possible to define your own formats for all the various values that you need, it sometimes helps interoperability to define formats using a shared vocabulary, such as XML Schema. XML Schema provides a type for language values (xs:language), which is defined in terms of RFC 3066.
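
If you are not using a schema validator, the same syntactic check can be approximated by hand. The Python sketch below follows the RFC 3066 grammar (a primary subtag of 1 to 8 letters, then any number of subtags of 1 to 8 letters or digits, separated by hyphens). The function name is_rfc3066_tag is made up for this example, and the check is purely syntactic: it does not consult the ISO code lists or the IANA registry.

```python
import re

# RFC 3066 syntax: 1*8ALPHA *( "-" 1*8(ALPHA / DIGIT) )
_TAG = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*")


def is_rfc3066_tag(value):
    """Return True if value is syntactically a well-formed language tag."""
    return _TAG.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("en", "zh-Hant", "zh-min-nan", "en_US", ""):
        print(repr(candidate), is_rfc3066_tag(candidate))
```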

Best Regards,


Addison P. Phillips
Globalization Architect, Quest Software
Chair, W3C Internationalization Core Working Group

Internationalization is not a feature.
It is an architecture. 

> -----Original Message-----
> From: ietf-languages-bounces at alvestrand.no
> [mailto:ietf-languages-bounces at alvestrand.no] On Behalf Of Harald Tveit Alvestrand
> Sent: 14 June 2005 19:34
> To: Karen_Broome at spe.sony.com
> Cc: 'IETF Languages Discussion'; 'Michael Everson'; Debbie Garside
> Subject: Tagging videoclips in XML (RE: Swiss german, spoken)
> Since we're getting far off topic, I'm changing the subject line, and
> probably should stop the thread pretty soon....
> --On 14. juni 2005 16:48 -0700 Karen_Broome at spe.sony.com wrote:
> > Well-formed XML uses the xml:lang attribute to specify the content
> > language, not a <language> tag, though that would work in an informal
> > use.  I was speaking of the formal use for which RFC 3066 is prescribed.
> well... RFC 1766 actually predates XML, and even RFC 3066 is older than
> XML's recent popularity.....
> I believe xml:lang is an attribute that specifies the language of the
> content of the thing it's an attribute of; in the recommendation's words:
> > In document processing, it is often useful to identify the natural or
> > formal language in which the content is written. A special attribute
> > named xml:lang MAY be inserted in documents to specify the language used
> > in the contents and attribute values of any element in an XML document.
> > In valid documents, this attribute, like any other, MUST be declared if
> > it is used. The values of the attribute are language identifiers as
> > defined by [IETF RFC 3066], Tags for the Identification of Languages, or
> > its successor; in addition, the empty string MAY be specified.
> (from <http://www.w3.org/TR/REC-xml/> section 2.12)
> I believe your application as described doesn't have the XML tagging
> surrounding the videoclip in question, so formally, using the xml:lang
> isn't "right" - using another attribute or element content with values
> from RFC 3066 is perfectly OK, though. Or it might be "right" to use the
> xml:lang attribute if the content (or another attribute) of the element is
> the URL of the videoclip; that one I'm even less sure about.
> But - I'm not an official interpreter of the XML canon, so don't attach
> too much weight to my words.... and of course, anything you really want to
> make work can be made to work, whether or not it's created in accordance
> with the XML canon....
> _______________________________________________
> Ietf-languages mailing list
> Ietf-languages at alvestrand.no
> http://www.alvestrand.no/mailman/listinfo/ietf-languages
