draft-phillips-langtags-08, process, specifications, "stability", and extensions

Thu Jan 6 19:48:41 CET 2005

I notice two main types of argument going on in this thread, and it seems to me that the frustration
and "talking past each other" stem from fundamentally different concerns and assumptions among
different constituencies.

One type of conflict seems to me to be between what I will term, for convenience (and please, I don't want to get
side-tracked on my choice of terms -- I just want convenient words), "implementors" vs. "linguists".

By "implementors", I mean those whose primary concern is how to interpret (act on) received language tags
-- consumers of language tags, for whom falling back to a "general" or "compatible" match may be desirable when an
exact match is not available.  From their point of view, the most important aspect of language tags is being able to
parse and match them -- exact linguistic purity and accuracy are secondary.  From their point of view, the
addition of new tags, regardless of whether the new tags improve language tagging "accuracy", may be actively
harmful unless accompanied by improved matching rules.  To the extent that adding tags moves beyond
simple registration and into new forms of tags and new rules for interpreting them, and so raises
fundamental matching-algorithm questions, that becomes the main concern for this group.
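
The kind of fallback described above can be sketched as follows.  This is an illustration only: the function
names fallback_chain and best_match and the sample tags are assumptions made for the example, not anything
taken from the draft, and dropping subtags from the right is just one possible fallback rule.

```python
def fallback_chain(tag):
    """Return the tag followed by progressively shorter prefixes.

    Assumption for illustration: fallback simply drops the last
    subtag each time, e.g. "de-CH-1996" -> "de-CH" -> "de".
    """
    subtags = tag.split("-")
    return ["-".join(subtags[:n]) for n in range(len(subtags), 0, -1)]


def best_match(requested, available):
    """Return the first available tag found along the fallback chain.

    Comparison is case-insensitive; None means nothing matched.
    """
    by_lower = {a.lower(): a for a in available}
    for candidate in fallback_chain(requested):
        hit = by_lower.get(candidate.lower())
        if hit is not None:
            return hit
    return None


if __name__ == "__main__":
    # No exact "en-US" content, so the consumer falls back to "en".
    print(best_match("en-US", ["en", "fr"]))
    # A new tag form helps only if the chain still reaches something usable.
    print(best_match("de-CH-1996", ["de-CH", "de"]))
```

A tag that this kind of consumer cannot reduce to something it already knows simply fails to match, which is
why new forms of tags without matching rules look harmful from this point of view.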

Then there are what I will refer to as "linguistic purists", whose primary concern is having precise, accurate tags
available for languages.  (These may be people whose orientation is toward generating content and labelling it
accurately.)  For this group, the most important aspect of language tags is that they be accurate and precise.
Matching issues (in particular, trying to fall back to a more "generic" match when an exact match
is not available) are secondary.

The opinion on whether a tag is "useful" then varies: "it's useful if I know how to match it" vs. "it's useful if it's accurate".

An example where the difference in orientation shows up is with the position of script vs. country in tags.  From the
linguistic point of view, there are arguments for having script come first.  But from the implementation point of view,
that is less backwards-compatible with RFC 3066, hence more problematic.
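
A minimal sketch of the compatibility concern, assuming an existing consumer that applies the RFC 3066
language-range rule (a range matches a tag if it equals the tag, or equals a prefix of it that is followed
by "-").  The function name range_matches and the sample tags are chosen for illustration; the "*" wildcard
is omitted.

```python
def range_matches(language_range, tag):
    """Prefix matching in the style of RFC 3066 language ranges.

    The range matches if it equals the tag, or if it is a prefix of
    the tag and the next character in the tag is "-".
    Comparison is case-insensitive.
    """
    r = language_range.lower()
    t = tag.lower()
    return t == r or t.startswith(r + "-")


if __name__ == "__main__":
    # An existing consumer asks for language + country.
    wanted = "zh-TW"
    # Script before country: the old range no longer matches.
    print(range_matches(wanted, "zh-Hant-TW"))   # False
    # Country before script: the old range still matches.
    print(range_matches(wanted, "zh-TW-Hant"))   # True
    # The bare language range matches either ordering.
    print(range_matches("zh", "zh-Hant-TW"))     # True
```

Under this rule, putting the script subtag ahead of the country hides the country from a consumer that only
knows the language-country form, while the language-only range keeps working with either ordering.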

The process question of whether this is appropriately a BCP, or whether it at least implicitly brings up
algorithmic implementation issues and hence ought instead to be perhaps a Proposed Standard or an Experimental
RFC, also has something to do with this difference in orientation.

A second type of argument (which, I should mention, I have largely tuned out, so this is a superficial and not very
informed take on it) seems to me to be more linguistic/political in nature: what is the "correct" (linguistically
correct? politically correct?) way to name the tags?  What sort of naming scheme corresponds to linguistic reality,
what sort of naming scheme is politically acceptable, and is there a conflict between the two?  This does get back to the
algorithmic matching issue in a sense, though: if one wants some sort of hierarchical structure in
the tags (to allow easier matching), or indeed wants to define any sort of matching rules (as an implementor does), one is
probably getting right into political questions about how matching "should work".  So for those who wanted
to stick to linguistic accuracy and avoid political issues, avoiding discussion of algorithmic matching
may have seemed appealing (but that provides no help to what I've termed the "implementors").

If we can keep in mind that there are different constituencies interested in language tags, with different main concerns,
then I would hope for less frustration and irritation with others "missing the main point", so that constructive 
discussions can occur, leading to some compromise useful to everyone. 

Regards,

Kristin