Language Registrations needed for i-unknown and i-mixed

François Yergeau FYergeau at alis.com
Wed Jan 21 05:10:55 CET 2004


Bob Wyman wrote:
>Our dilemma is that RSS appears to have been defined with the
>assumption that all items in a feed would share a common language.

The spec even says so. Bad.

>This is a usually good assumption when RSS is being used
>to syndicate the content of a blog being maintained by a
>single person,

Not even.  I know a blog in 4 languages.

>Unfortunately, RSS V2.0 -- like many other protocols --
>doesn't define item-level <language> tags...

The XML spec does.  Put an xml:lang attribute on the <item> element.
xml:lang is defined by the XML spec itself, so it is about as standard as
it gets.
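
To make that concrete, here is a rough Python sketch of how an aggregator
might read per-item xml:lang values.  The feed content is made up, and the
fallback to the channel-level <language> element is my assumption, not
something the RSS spec prescribes.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical feed: the channel declares a default, two items override it.
FEED = """<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <language>en</language>
    <item xml:lang="fr"><title>Bonjour</title></item>
    <item xml:lang="de"><title>Guten Tag</title></item>
    <item><title>Hello</title></item>
  </channel>
</rss>"""

def item_languages(feed_xml):
    """Return (title, language) pairs for every item in the feed.

    Assumption: an item without xml:lang falls back to the channel-level
    <language> element, or to None if the channel has none either.
    """
    channel = ET.fromstring(feed_xml).find("channel")
    default = channel.findtext("language")
    return [(item.findtext("title"), item.get(XML_LANG, default))
            for item in channel.findall("item")]

print(item_languages(FEED))
```

No new namespace is needed: the xml: prefix is predeclared, so any XML
parser already understands the attribute.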

>Now, clearly, we could define some new namespace
>and create an item-level <language> tag of our own like
>"<ps:language>". The difficulty with doing so is that 
>this private tag wouldn't achieve much more than wasting 
>bandwidth since no known news aggregator knows what to do
>with it.

Your problem will not really be solved without tagging the stuff, so you'd
better get started and get aggregators to pick it up.  Use the standard
xml:lang, though.

>Our interface allows people to create subscriptions that
>restrict the content that is scanned for them to only those
>that are marked as being in some specific language.

So you need tagging.  Go ahead.

>In order to address the issue of "any language" subscriptions,
>etc., I'm requesting that we be able to use "i-unknown" and/or
>"i-mixed" when appropriate.

"i-mixed" already exists under another name: "mul".  ISO 639-2 defines it
as "multiple languages", so RFC 3066 allows it, and it is therefore usable
in xml:lang.

For "i-unknown" you have two choices:

- remain silent (don't emit <language>).  There doesn't seem to be much
practical difference between not saying anything and saying you don't know.

- use "und", the ISO 639-2 code for "undetermined".
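
For the subscription side, a sketch of how "mul", "und" and silence could
be handled when filtering.  The function names and the policy itself
(language-specific subscriptions skip "mul" and "und"; "any language"
takes everything) are hypothetical; only the prefix-matching rule is
borrowed from RFC 3066 language ranges.

```python
def effective_tag(tag):
    """Treat a missing or empty tag the same as "und" (undetermined)."""
    return tag.lower() if tag else "und"

def matches(item_tag, wanted):
    """Decide whether an item belongs in a subscription.

    wanted=None stands for an "any language" subscription (assumption).
    """
    tag = effective_tag(item_tag)
    if wanted is None:
        return True
    if tag in ("mul", "und"):
        # Assumed policy: mixed or undetermined content is not delivered
        # to subscriptions that ask for one specific language.
        return False
    rng = wanted.lower()
    # RFC 3066 style range matching: equal to the tag, or a prefix of it
    # that is immediately followed by "-".
    return tag == rng or tag.startswith(rng + "-")

print(matches("fr-CA", "fr"))   # range "fr" covers the tag "fr-CA"
print(matches("mul", "en"))     # mixed content, specific subscription
print(matches(None, None))      # untagged item, "any language"
```

Whether "mul" items should reach a language-specific subscription is a
product decision; the point is only that the existing codes are enough to
express it.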

>Alternative solutions would be welcomed.

There you are :-)

-- 
François Yergeau

