RFC 3066bis: extensions

Addison Phillips [wM] aphillips at webmethods.com
Sat Mar 6 12:50:57 CET 2004

Dear Peter,

We have several cases (not just one) for extensions. But before I leap into
them, you should really read draft-01, since we made alterations referred to
by Mark in his message, in part to accommodate your comments.

There are a number of reasons that an extension mechanism is useful. Let's
start with the negative example. You say that there are many ways that
mechanisms for exchanging metadata about language material could be created
that do not involve modifying rfc3066.

I agree that this is true, but the problem is one of practicality. Having a
single, standardized field (like xml:lang) provides a mechanism that many
applications can use and rely on the presence of. Creating new fields to
communicate additional information would require significant work to achieve
the same result. The mere existence of xml:lang as an attribute means that
diverse standards such as RDF, XHTML, SVG, SSML, SOAP, XSL, CSS, etc.
can be used in a harmonious way (in fact, transformations and searches can
be applied across many such technologies). There are plenty of cases where
rfc3066 (even with the other 3066:bis extensions) cannot meet a particular
application's complete needs, but where compatibility across multiple
technologies is desirable.

The question is whether there are communities that need to identify and
label content (or request/negotiate content) whose needs are not met by the
combination of the generative mechanism and the registration mechanism. I
think that the answer is "yes". One such case is the need to label text more
precisely than the ISO639-x + country code division can render. Dialects,
regional variations, accents and the like often vary on a sub-national level
(think of regional variation in English in the USA or in the UK). Creating
an enormous database of registered 'variants' is one solution, but not one
that is particularly enticing.

Here's a potential use: speech synthesis. It might be useful to tag text
marked up in SSML with "hints" as to the regional variation so that the
synthesizer could choose an appropriately accented voice. 'en-GB-xgeordie'
is one way to handle that, sure. But it might be more useful to have
'en-GB-x-geordie', separating a sequence of private tags from the
standardized ones... because this IS a (quasi-)legal tag:
'en-GB-xgeordie-oxford' [assuming that the current en-gb-oed is regularized
that way]. The -x- subtag basically announces the end of public/registered
values, allowing parsers either to ignore whatever follows (because no
extensions are supported) or to improve efficiency (because they need not
check for registered values any longer).
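
To make the parsing point concrete, here is a minimal sketch in Python of how
a consumer might split a tag at the -x- subtag. The function name
split_private_use is hypothetical and not taken from any draft; the sketch
assumes only that subtags are separated by "-", that tags compare
case-insensitively, and that a standalone "x" subtag starts the private
sequence.

```python
def split_private_use(tag):
    """Split a language tag into (public, private) subtag lists.

    Assumption for illustration: the first standalone 'x' subtag marks
    the end of the public/registered subtags; everything after it is
    private. The 'x' marker itself is dropped from both lists.
    """
    subtags = tag.split("-")
    for i, subtag in enumerate(subtags):
        # Compare case-insensitively, so 'X' is treated like 'x'.
        if subtag.lower() == "x":
            return subtags[:i], subtags[i + 1:]
    # No standalone 'x': the whole tag is public/registered material.
    return subtags, []


public, private = split_private_use("en-GB-x-geordie")
print(public, private)

# 'xgeordie' is an ordinary subtag here, so nothing is private and
# every subtag would have to be checked against registered values.
public, private = split_private_use("en-GB-xgeordie-oxford")
print(public, private)
```

A parser that supports no extensions can simply discard the second list; one
that does can skip the registry lookup for everything in it.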

Best Regards,


Addison P. Phillips
Director, Globalization Architecture
webMethods | Delivering Global Business Visibility
Chair, W3C Internationalization (I18N) Working Group
Chair, W3C-I18N-WG, Web Services Task Force

Internationalization is an architecture.
It is not a feature.

> -----Original Message-----
> From: ietf-languages-bounces at alvestrand.no
> [mailto:ietf-languages-bounces at alvestrand.no]On Behalf Of Peter
> Constable
> Sent: vendredi 5 mars 2004 20:51
> To: ietf-languages at alvestrand.no
> Subject: FW: RFC 3066bis: extensions
> I'm resending this to (hopefully) eliminate any doubt as to which of my
> messages from Jan. 12 (all three of them) I was referring to.
> Peter
> -----Original Message-----
> From: ietf-languages-bounces at alvestrand.no
> [mailto:ietf-languages-bounces at alvestrand.no] On Behalf Of Peter
> Constable
> Sent: Monday, January 12, 2004 1:53 PM
> To: ietf-languages at alvestrand.no
> Subject: RFC 3066bis: extensions
> I'd like to raise a question regarding the value of the extensions
> mechanism in the proposed successor to RFC 3066.
> Consider a tag such as "az-Arab-x-SIL.AZE". The same semantics could be
> contained as two metadata attributes, "az-Arab" in one and "SIL.AZE" (or
> some equivalent) in the other.
> The tag "az-x-SIL.AZE" would be applied to some content or resource, or
> passed through some interface to request or reference certain content or
> resources, following some higher-level protocol.
> If the higher-level protocol is proprietary, then anything could be
> done, including using proprietary extensions to the language tag spec to
> arrive at tags that would not conform to the spec, such as tags that
> used otherwise invalid characters like "/", that allowed tags beginning
> with "u=", etc. In this situation, it is not a concern if there are
> portions of a language tag -- the extension -- that have private
> semantics. Note, though, that one would be free to define that protocol
> instead using additional attributes / parameters and use RFC3066bis
> langids that do not include extensions.
> In the case of an open-standard protocol, it seems to me that
> privately-defined extensions would serve no purpose. The only possible
> exception would be if one was referring to a public standard of a very
> general nature, such as XML, serving as a platform on which some
> application is built. But in that application, separate
> attributes/parameters could rather be used, again allowing an RFC3066bis
> that does not include extensions.
> It just seems to me that anything that could be done using extensions
> could very easily be done without such a radical innovation. At the same
> time, adding the extension mechanism involves some costs and back-compat
> issues.
> I'd like to hear from Addison and Mark on the case for adding the
> extension mechanism. I also wonder what opinions others have.
> Peter
> Peter Constable
> Globalization Infrastructure and Font Technologies
> Microsoft Windows Division
> _______________________________________________
> Ietf-languages mailing list
> Ietf-languages at alvestrand.no
> http://www.alvestrand.no/mailman/listinfo/ietf-languages
