ID for language-invariant strings

Mark Davis mark.davis at icu-project.org
Thu Mar 13 15:25:03 CET 2008


We use (and I'd recommend) 'und' for the sense of 'language neutral' and
also in the sense of 'ultimate fallback' -- that is, if nothing better is
available. Examples of a language-neutral form are the ISO 8601 date format
and an ISO code for a region. The fallback role is important, since sometimes
there really isn't a widely-recognized language-neutral form for something,
so occasionally we have to fall back to a specific language form like
English.

That's similar to the use of 'und' as a base for specifying just the script
or region (e.g., "und-Arab" meaning something/anything in Arabic script).
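To illustrate the fallback role, here is a minimal sketch of a lookup that
ends at 'und'. It is only an illustration under stated assumptions: the
table contents and the function names (NAMES, fallback_chain, lookup) are
hypothetical, and the truncation is a simplified version of the usual
right-to-left subtag stripping, not a full RFC 4647 implementation.

```python
# Hypothetical string table keyed by lowercase language tag.
# The "und" entry holds the language-neutral (or ultimate-fallback) form.
NAMES = {
    "en": "Germany",
    "de": "Deutschland",
    "und": "DE",  # an ISO region code used as the language-neutral form
}


def fallback_chain(tag):
    """Return the tag, its progressively truncated forms, then 'und'."""
    subtags = tag.lower().split("-")
    chain = ["-".join(subtags[:n]) for n in range(len(subtags), 0, -1)]
    if chain[-1] != "und":
        chain.append("und")  # ultimate fallback
    return chain


def lookup(table, tag):
    """Return the first entry found along the fallback chain, else None."""
    for candidate in fallback_chain(tag):
        if candidate in table:
            return table[candidate]
    return None


print(lookup(NAMES, "de-AT"))      # Deutschland
print(lookup(NAMES, "fr-CA"))      # DE (falls through to 'und')
print(fallback_chain("und-Arab"))  # ['und-arab', 'und']
```

With this arrangement a tag such as "und-Arab" needs no special handling:
it simply truncates to 'und' like any other tag.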

I don't think it makes sense to have yet another special tag; that will be
even more confusing.

Mark

On Wed, Mar 12, 2008 at 10:04 PM, Peter Constable <petercon at microsoft.com>
wrote:

> What are people's thoughts about language tagging for language-invariant
> strings?
>
> I'm working with a group on an application scenario in which we have a
> table of strings in various languages that name entities, but we also need
> to support entries that have a reference name that is considered a
> language-neutral form of a name for a given entity.
>
> Strictly speaking, these would be strings intended for programmatic
> operation and not human consumption, and so, some might argue, are not in
> scope for IETF language tags. However, these are exceptional: almost all of
> the strings in the same table are intended for human consumption, and RFC
> 4646 is the spec being applied for identifying the language of strings in
> the table. Moreover, the strings in question **are** (in general) in a
> human language; they are just the values that are adopted to be used for
> language-neutral referencing.
>
> The "i-default" tag is not appropriate for this: these are not default
> display strings. And "zxx" is not appropriate since, in fact, there is
> linguistic content. The "mis" tag doesn't seem like the right choice: we do
> not want an ambiguous ID that could be applied to other entries intended for
> a different purpose (strings intended for human consumption that happen to
> be in an uncoded language.)  The "und" tag might be usable, though that
> doesn't seem quite right to me: we're not intending to say that the language
> is (as yet) undetermined (and might be determined later); rather, we want a
> value indicating 'this is special content used as a referential key – the
> language of the content is irrelevant'.
>
> That's leaving me a bit inclined to think that we should establish a
> special-purpose tag for the language-invariant semantic. I'd submit a
> request to the ISO 639 JAC (it would be another special-purpose-ID request
> as I did for "zxx".)
>
> What are others' thoughts?
>
> Peter