ID for language-invariant strings
mark.davis at icu-project.org
Thu Mar 13 15:25:03 CET 2008
We use (and I'd recommend) 'und' for the sense of 'language neutral' and
also in the sense of 'ultimate fallback' -- that is, if nothing better is
available. An example of a language-neutral form is ISO 8601 date format or
an ISO code for a region. The fallback role is important, since sometimes
there really isn't a widely-recognized language-neutral form for something,
so occasionally we have to fall back to a specific language form like
That's similar to the use of 'und' as a base for specifying just the script
or region (e.g. "und-Arab" meaning something/anything in Arabic script).
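
To make the fallback role concrete, here is a minimal sketch of a name lookup that ends at 'und'. It is an illustration only, not ICU's actual code: the function names and the REGION_NAMES table are hypothetical, the truncation is a simplified version of the progressive-truncation lookup described in RFC 4647, and tags are assumed to be already normalized in case.

```python
def fallback_chain(tag):
    """Return lookup candidates for a tag, most specific first, ending in 'und'.

    Simplified: drops one subtag at a time from the right, then adds 'und'
    as the ultimate fallback. Not a full RFC 4647 implementation.
    """
    subtags = tag.split("-")
    chain = ["-".join(subtags[:i]) for i in range(len(subtags), 0, -1)]
    if chain[-1] != "und":
        chain.append("und")
    return chain


def lookup(names, tag):
    """Return the first entry found along the fallback chain, or None."""
    for candidate in fallback_chain(tag):
        if candidate in names:
            return names[candidate]
    return None


# Hypothetical table of names for a single entity (the region Germany).
REGION_NAMES = {
    "fr": "Allemagne",
    "de": "Deutschland",
    "und": "DE",  # ISO region code used as the language-neutral form
}

print(fallback_chain("fr-CA"))
print(lookup(REGION_NAMES, "fr-CA"))
print(lookup(REGION_NAMES, "ja"))
```

With this table, a request for "fr-CA" resolves through "fr", while a request for a language with no entry (such as "ja") lands on the 'und' entry, which here holds the language-neutral ISO code.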
I don't think it makes sense to have yet another special tag; that will be
even more confusing.
On Wed, Mar 12, 2008 at 10:04 PM, Peter Constable <petercon at microsoft.com> wrote:
> What are people's thoughts about language tagging for language-invariant
> strings?
> I'm working with a group on an application scenario in which we have a
> table of strings in various languages that name entities, but we also need
> to support entries that have a reference name that is considered a
> language-neutral form of a name for a given entity.
> Strictly speaking, these would be strings intended for programmatic
> operation and not human consumption, and so, some might argue, are not in
> scope for IETF language tags. However, these are exceptional: almost all of
> the strings in the same table are intended for human consumption, and RFC
> 4646 is the spec being applied for identifying the language of strings in
> the table. Moreover, the strings in question **are** (in general) in a
> human language; they are just the values that are adopted to be used for
> language-neutral referencing.
> The "i-default" tag is not appropriate for this: these are not default
> display strings. And "zxx" is not appropriate since, in fact, there is
> linguistic content. The "mis" tag doesn't seem like the right choice: we do
> not want an ambiguous ID that could be applied to other entries intended for
> a different purpose (strings intended for human consumption that happen to
> be in an uncoded language.) The "und" tag might be usable, though that
> doesn't seem quite right to me: we're not intending to say that the language
> is (as yet) undetermined (and might be determined later); rather, we want a
> value indicating 'this is special content used as a referential key – the
> language of the content is irrelevant'.
> That's leaving me a bit inclined to think that we should establish a
> special-purpose tag for the language-invariant semantic. I'd submit a
> request to the ISO 639 JAC (it would be another special-purpose-ID request
> as I did for "zxx".)
> What are others' thoughts?