draft-phillips-langtags-08, process, specifications, "stability", and extensions

Dave Singer singer at apple.com
Tue Jan 4 18:58:52 CET 2005


At 9:14 AM -0800 1/4/05, ned.freed at mrochek.com wrote:
>>This whole question of what 'matches' is subtle.  Consider the case
>>when I have a document that has variant content by language (e.g.
>>different sound tracks), and the user indicates a set of preferred
>>languages.  If the content has "de-CH" and "fr-CH" (Swiss German and
>>French), and a default "en" (English) and the user says he speaks
>>"de-DE" and "fr-FR", on the face of it nothing matches, and I fall
>>back to the catch-all default, which is almost certainly not the best
>>result.
>
>David, this isn't the half of it. The case you describe is actually one of the
>easy ones, in that it can be handled by doing a "preferred" match on the entire
>tag, with a "generic" match on the primary tag only having lesser precedence
>but higher precedence than a fallback to a default.
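
The three-tier precedence described above (whole-tag match, then
primary-tag match, then default) can be sketched roughly as follows.
This is only an illustration under simple assumptions: tags are compared
case-insensitively, the primary tag is whatever precedes the first
hyphen, and the names primary() and pick_variant() are made up for this
example rather than taken from the draft or from any deployed code.

```python
def primary(tag):
    """Return the lowercased primary subtag, e.g. 'de' for 'de-CH'."""
    return tag.lower().split("-")[0]


def pick_variant(available, preferred, default):
    """Pick a content variant using three tiers of decreasing precedence."""
    by_tag = {tag.lower(): tag for tag in available}

    # Tier 1: "preferred" match on the entire tag.
    for want in preferred:
        if want.lower() in by_tag:
            return by_tag[want.lower()]

    # Tier 2: "generic" match on the primary subtag only.
    for want in preferred:
        for tag in available:
            if primary(tag) == primary(want):
                return tag

    # Tier 3: nothing matched, so use the catch-all default.
    return default


# The example from the start of this thread: content in de-CH, fr-CH
# and a default en; the user asks for de-DE, then fr-FR.
print(pick_variant(["de-CH", "fr-CH", "en"], ["de-DE", "fr-FR"], "en"))
```

With the content and preferences from my example, this picks "de-CH"
instead of falling through to "en".  Whether every whole-tag match
should be tried before any primary-tag match, as the sketch does, is
itself a design choice.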

Yes, I picked an easy example for which the 'matching' section of 
the draft didn't seem adequate.  This really is a tar-pit, of course. 
Serbo-Croatian used to be a language;  now it's Serbian and Croatian. 
I assume that they are mutually intelligible.  Serbian is probably a 
better substitute for Croatian than some general default (or 
silence), though saying this in some parts of the world might start 
wars.

The whole question of what is a language, a variant or dialect of a 
language, or a suitable substitute for a language, would benefit from 
some thought in any tagging scheme, though I agree the problem is not 
generally soluble.

>
>I know of two other wrinkles in the RFC 1766 world:
>
>(1) Matching may want to take into account the distinguished nature
>    of country subtags in some way.
>
>(2) SGN- requires special handling, in that SGN-FR and SGN-EN are in fact
>    sufficiently different languages that a primary tag match should not be
>    taken to be a generic match. (Of course this only matters if sign
>    languages are relevant to your situation - in many cases they aren't.
>    In retrospect I think it was a mistake to register sign languages this
>    way.)
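
The SGN- wrinkle in (2) amounts to an exception list on the generic
match.  A minimal sketch, assuming a hypothetical set NO_GENERIC_MATCH
of primary tags whose subtags name distinct languages (the set and the
function name generic_match() are invented for illustration):

```python
# Hypothetical exception list: primary subtags for which a match on the
# primary subtag alone must NOT count as a generic match.
NO_GENERIC_MATCH = {"sgn"}


def generic_match(content_tag, user_tag):
    """Return True if content_tag is an acceptable generic match for user_tag."""
    content = content_tag.lower().split("-")
    user = user_tag.lower().split("-")

    # Different primary subtags never match generically.
    if content[0] != user[0]:
        return False

    # For excepted primary subtags, only an identical tag is acceptable.
    if content[0] in NO_GENERIC_MATCH:
        return content == user

    # Otherwise a shared primary subtag is enough.
    return True
```

Here generic_match("de-CH", "de-DE") is true, while
generic_match("SGN-FR", "SGN-EN") is false.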
>
>This proposed revision, however, opens Pandora's box in regards to matching.
>Consider:
>
>(a) Extension tags appear as the first subtags, and as such have to
>    be taken into account when looking for country subtags.
>
>(b) Script tags change the complexion of the matching problem significantly,
>    in that they can interact with external factors like charset information
>    in odd ways.
>
>(c) UN country numbers have been added (IMO for no good reason), requiring
>    handling similar to country codes.
>
>The bottom line is that while I know how to write reasonable code to do RFC
>1766 matching (and have in fact done so for widely deployed software), I
>haven't a clue how to handle this new draft competently in regards to matching.
>And the immediate consequence of this is that I, and I suspect many other
>implementors, are going to adopt a "wait and see" attitude in regards to
>implementing any of this.
>
>				Ned


-- 
David Singer
Apple Computer/QuickTime

