Proposal to remove Preferred-Value field for region YU in LTRU

Tex Texin textexin at xencraft.com
Fri Feb 27 21:55:22 CET 2009


1) I used YU/CS as a shorthand for identifying a subtag that could be either.
2) I understand that YU and CS do not correspond exactly. However, that was not offered as the reason for the change, at least in the mails I saw. Perhaps it was an implicit motive.

3) I understand that there isn't a requirement to change tags. I'll make the case another way:
At some point in time a user attempts to find documents tagged for Yugoslavia.
The search engine, using the then-current registry data and noting the preferred value relationship, matches documents tagged with either YU or CS.

Another user searches for documents for Serbia.
The search engine, using the then-current registry data and noting the preferred value relationship, matches documents tagged with either YU or CS.

The results are in some sense accurate and complete given the history of the subtag.

After the change in the preferred value relationship, the search engine does not search for both, since the registry does not indicate a relationship. Only one or the other subtag is used for each query. However, the query results are now incomplete since documents for YU may have been tagged with the one-time preferred tag of CS.
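
The scenario in point 3 can be sketched as follows. This is only an illustration, assuming a search engine that expands a region query through Preferred-Value links in both directions. The function name related_regions and the dictionary layout are made up for the example and are not part of the registry or of any real search engine.

```python
def related_regions(subtag, preferred):
    """Return subtag plus every subtag linked to it by a Preferred-Value entry.

    `preferred` is a hypothetical in-memory view of the registry that maps a
    deprecated subtag to its Preferred-Value. Links are followed both ways.
    """
    related = {subtag}
    changed = True
    while changed:
        changed = False
        for old, new in preferred.items():
            # Exactly one end of the link is known so far: pull in the other.
            if (old in related) != (new in related):
                related.update((old, new))
                changed = True
    return related


# Registry as it stands: YU carries a Preferred-Value of CS.
before = {"YU": "CS"}
# Registry after the proposed change: no relationship is recorded.
after = {}

print(sorted(related_regions("YU", before)))  # ['CS', 'YU']
print(sorted(related_regions("CS", before)))  # ['CS', 'YU']
print(sorted(related_regions("YU", after)))   # ['YU']
print(sorted(related_regions("CS", after)))   # ['CS']
```

With the relationship present, a query for either subtag reaches documents tagged with the other. Once it is removed, each query sees only its own subtag.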

4) Comments are a good thing for recording rationale and tangential history. However, implementers are not going to go through and read the comments on any or all tags in order to make a correct implementation. They are going to implement based on the schema and operate with the data values.

5) I think the registry should stay as it is with respect to YU and CS. 
As CS is now being used, deprecated or not, I don't see a compelling motivation to change the value back to YU. Doing so would just compound the confusion over the two subtags.

6) In any event, I don't expect users to walk the registry themselves, but to use a software package that recommends the optimal value. If that software executes a few extra machine cycles to get to CS, so be it. (And that is only if the results aren't put into a precompiled form.)
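
As a rough sketch of the "precompiled form" mentioned in point 6: a package could follow each chain of preferred values once and store the end result, so that lookups cost a single step. The names resolve and precompile are hypothetical, and apart from YU and CS the subtags in the example (AA, BB, CC) are placeholders that only illustrate a longer chain.

```python
def resolve(subtag, preferred):
    """Follow Preferred-Value links from subtag until the chain ends."""
    seen = {subtag}
    while subtag in preferred:
        subtag = preferred[subtag]
        if subtag in seen:
            # Defensive guard: stop if the data ever contained a cycle.
            break
        seen.add(subtag)
    return subtag


def precompile(preferred):
    """Build a one-step lookup table from every mapped subtag to its chain end."""
    return {subtag: resolve(subtag, preferred) for subtag in preferred}


# Hypothetical data: YU -> CS as discussed here, plus a placeholder chain.
preferred = {"YU": "CS", "AA": "BB", "BB": "CC"}
table = precompile(preferred)

print(table["YU"])  # CS
print(table["AA"])  # CC
```

The sketch only shows that the chain can be walked once, ahead of time. It does not address whether the subtag at the end of the chain is itself deprecated.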

7) I would not argue that preferred value relationships should never change. But the motivation to make a change should be compelling enough to outweigh the impact of making existing tagged data ambiguous.

8) Separate topic: The number of countries in the world seems to keep growing. This suggests to me that the subdivision of regions is not going to be a rare event. Perhaps there should be a mechanism to indicate subtags that have later been split, so that instead of one preferred value, there is a way to indicate that a tag has been deprecated in favor of two or more possible values.
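
One way the idea in point 8 might look in data, purely as a sketch: the split_into field below is hypothetical and does not exist in the registry. It is shown next to a single preferred value only to illustrate the difference. The example uses CS with RS and ME, the ISO 3166 codes for Serbia and for Montenegro, as the possible successors.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RegionRecord:
    subtag: str
    deprecated: bool = False
    # Single replacement, as with the existing Preferred-Value field.
    preferred_value: Optional[str] = None
    # Hypothetical field: several possible successors when a region has split.
    split_into: List[str] = field(default_factory=list)


def replacement_candidates(record):
    """Return the subtags a tool could offer in place of a deprecated one."""
    if record.preferred_value is not None:
        return [record.preferred_value]
    return list(record.split_into)


cs = RegionRecord("CS", deprecated=True, split_into=["RS", "ME"])
print(replacement_candidates(cs))  # ['RS', 'ME']
```

A tool could not choose among the candidates automatically, but it would at least know that the deprecated subtag has successors.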

tex


-----Original Message-----
From: ietf-languages-bounces at alvestrand.no [mailto:ietf-languages-bounces at alvestrand.no] On Behalf Of Doug Ewell
Sent: Friday, February 27, 2009 5:21 AM
To: ietf-languages at iana.org
Subject: Re: Proposal to remove Preferred-Value field for region YU in LTRU

Tex Texin <textexin at xencraft dot com> wrote:

> Historically, it was a concern that codes might change.
> If I use the registry to choose the preferred value for a region, and 
> that preferred value can change, then isn't it tantamount to the code 
> changing?

This would have been a good question for the LTRU group back when the 
decision was made to allow Preferred-Value to change.  I'm guessing this 
was about a year ago, but I would have to look it up.

> If I had data that would be represented by YU/CS and after the 
> preferred value is removed it should instead be YU, that seems like a 
> problem.

I guess I'm not sure what you mean by "YU/CS" in this context.  A 
language tag contains at most one region subtag, of course.

> Especially since the relationship between CS and YU becomes lost.

Speaking to this particular case and not to the general principle of 
allowing P-V to change...

It has been argued frequently on LTRU that the relationship between CS 
and YU is not what it appears, because the country identified as YU 
changed its nature dramatically between 1991 and 2003, in a way that was 
pertinent to language identification, by shrinking from the original 
"Yugoslavia" to just Serbia and Montenegro.  This viewpoint holds that 
data tagged as "something-YU" is already ambiguous as to "which YU" is 
intended.  This is really just a special case of the problem that 
country codes as language modifiers are less than perfectly precise.

> Also, it may not be clear which CS records should be restored to YU.

There is never any presumption that someone will go through and retag 
data.  Section 3.1 says, "In particular, the 'Preferred-Value' field 
does not imply retagging content that uses the affected subtag."  To me 
this implies that a change or deletion of P-V doesn't imply retagging 
either.

> I don't see that the fact that the target preferred value of YU is 
> also deprecated is a good reason to break the relationship at this 
> point. We still end up with deprecated codes with no preferred value 
> to go to, so why introduce an unnecessary change?

So that users will not have to follow a chain of arbitrary length to 
determine the best subtag -- or in this case, to reach a dead end.

--
Doug Ewell  *  Thornton, Colorado, USA  *  RFC 4645  *  UTN #14
http://www.ewellic.org
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages



