Proposal to remove Preferred-Value field for region YU in LTRU

Phillips, Addison addison at amazon.com
Fri Feb 27 22:56:30 CET 2009


> 8) Separate topic- The number of countries in the world seems to
> grow. This suggests to me that regions being subdivided is not
> going to be a rare event. Perhaps there should be a mechanism to
> indicate subtags that have later been split, so instead of one
> preferred value, there is a way to indicate that a tag has been
> deprecated in favor of two or more possible values.

The reason we don't provide this is due to the purpose of P-V in 4646 and 4646bis: the P-V is used in the canonicalization rules in Section 4.5, which require that P-V mappings be unambiguous... that is, that there is only ever one per record. In cases where two or more possible values replace the P-V, canonicalization of the tag cannot proceed without human intervention (unless we're going to invent rules that result in a preferred preferred-value). The majority of cases extant actually fit the P-V model nicely. Cases such as this one, in which a country has broken up, do not fit automatic tag normalization schemes because not enough information is available in the original tag/subtag to determine what to do. In such a case, comments and other information can help human users determine what they want to do. And we still have 'Deprecated' to make clear that a subtag should no longer be assigned.
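
To illustrate the canonicalization point, here is a minimal Python sketch. It is not taken from 4646bis. The REGISTRY dictionary is a hand-built, illustrative excerpt, and canonicalize_region is a hypothetical helper name. The YU record is shown with its Preferred-Value already removed, as proposed in this thread. The idea is only that a single-valued P-V can be applied mechanically, while a deprecated subtag with no P-V can at best be flagged for a human to resolve.

```python
# Illustrative excerpt only: hand-built records, not parsed from the real registry.
# Assumption: YU is shown with no Preferred-Value, as proposed in this thread.
REGISTRY = {
    "BU": {"deprecated": True, "preferred_value": "MM"},
    "MM": {"deprecated": False, "preferred_value": None},
    "YU": {"deprecated": True, "preferred_value": None},
    "CS": {"deprecated": True, "preferred_value": None},
    "RS": {"deprecated": False, "preferred_value": None},
}


def canonicalize_region(subtag, registry=REGISTRY):
    """Return (canonical_subtag, needs_human_review).

    Follows Preferred-Value mappings while they exist. Because each record
    holds at most one Preferred-Value, every step is unambiguous. A subtag
    that ends up deprecated with no Preferred-Value is returned unchanged
    and flagged, since nothing in the subtag says which successor applies.
    Subtags missing from the excerpt are returned unchanged and unflagged.
    """
    code = subtag.upper()
    seen = {code}
    while True:
        record = registry.get(code)
        if record is None:
            return code, False
        preferred = record["preferred_value"]
        if preferred is None:
            return code, record["deprecated"]
        if preferred in seen:
            raise ValueError("Preferred-Value cycle at " + preferred)
        seen.add(preferred)
        code = preferred


if __name__ == "__main__":
    for region in ("BU", "YU", "RS"):
        print(region, canonicalize_region(region))
```

With this excerpt, BU is rewritten to MM automatically, while YU comes back unchanged with the review flag set; that flag is where comments and other registry information would have to guide a person.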

It's also past the point at which we can meaningfully change 4646bis in LTRU, and I fervently hope we do not need a 4646ter any time soon.

Addison

Addison Phillips
Globalization Architect -- Lab126

Internationalization is not a feature.
It is an architecture.


> -----Original Message-----
> From: ietf-languages-bounces at alvestrand.no [mailto:ietf-languages-bounces at alvestrand.no] On Behalf Of Tex Texin
> Sent: Friday, February 27, 2009 12:55 PM
> To: 'Doug Ewell'; ietf-languages at iana.org
> Subject: RE: Proposal to remove Preferred-Value field for region YU
> in LTRU
> 
> 1) I used YU/CS as a shorthand for identifying a subtag that could
> be either.
> 2) I understand the inaccuracy between YU and CS. That was not
> offered as the reason for the change however, at least in the mails
> I saw. Perhaps it was an implicit motive.
> 
> 3) I understand that there isn't a requirement to change tags. I'll
> make the case another way-
> At some point in time a user attempts to find documents tagged for
> Yugoslavia.
> The search engine, using the then current registry data noting the
> preferred value relationship, matches either YU or CS.
> 
> Another user searches for documents for Serbia.
> The search engine, using the then current registry data noting the
> preferred value relationship, matches either YU or CS.
> 
> The results are in some sense accurate and complete given the
> history of the subtag.
> 
> After the change in the preferred value relationship, the search
> engine does not search for both, since the registry does not
> indicate a relationship. Only one or the other subtag is used for
> each query. However, the query results are now incomplete since
> documents for YU may have been tagged with the one-time preferred
> tag of CS.
> 
> 4) Comments are a good thing for recording rationale and tangential
> history. However, implementers are not going to go thru and read
> the comments on any or all tags in order to make a correct
> implementation. They are going to implement based on the schema and
> operate with the data values.
> 
> 5) I think the registry should stay as it is with respect to YU and
> CS.
> As CS is now being used, deprecated or not, I don't see a
> compelling motivation to change the value back to YU. Doing so
> would just compound the confusion over the two subtags.
> 
> 6) I don't expect users to be walking the registry in any event but
> to use a software package that recommends the optimal value. If
> that software executes a few extra machine cycles to get to CS, so
> be it. (And that is only if the results aren't put into a
> precompiled form.)
> 
> 7) I would not argue that preferred value relationships should
> never change. But the motivation to make a change should be
> compelling enough to outweigh the impact of making ambiguous the
> existing tagged data.
> 
> 8) Separate topic- The number of countries in the world seems to
> grow. This suggests to me that regions being subdivided is not
> going to be a rare event. Perhaps there should be a mechanism to
> indicate subtags that have later been split, so instead of one
> preferred value, there is a way to indicate that a tag has been
> deprecated in favor of two or more possible values.
> 
> tex
> 
> 
> -----Original Message-----
> From: ietf-languages-bounces at alvestrand.no [mailto:ietf-languages-bounces at alvestrand.no] On Behalf Of Doug Ewell
> Sent: Friday, February 27, 2009 5:21 AM
> To: ietf-languages at iana.org
> Subject: Re: Proposal to remove Preferred-Value field for region YU
> in LTRU
> 
> Tex Texin <textexin at xencraft dot com> wrote:
> 
> > Historically, it was a concern that codes might change.
> > If I use the registry to choose the preferred value for a region, and
> > that preferred value can change, then isn't it tantamount to the code
> > changing?
> 
> This would have been a good question for the LTRU group back when the
> decision was made to allow Preferred-Value to change.  I'm guessing this
> was about a year ago, but I would have to look it up.
> 
> > If I had data that would be represented by YU/CS and after the
> > preferred value is removed it should instead be YU, that seems like a
> > problem.
> 
> I guess I'm not sure what you mean by "YU/CS" in this context.  A
> language tag contains at most one region subtag, of course.
> 
> > Especially since the relationship between CS and YU becomes lost.
> 
> Speaking to this particular case and not to the general principle of
> allowing P-V to change...
> 
> It has been argued frequently on LTRU that the relationship between CS
> and YU is not what it appears, because the country identified as YU
> changed its nature dramatically between 1991 and 2003, in a way that was
> pertinent to language identification, by shrinking from the original
> "Yugoslavia" to just Serbia and Montenegro.  This viewpoint holds that
> data tagged as "something-YU" is already ambiguous as to "which YU" is
> intended.  This is really just a special case of the problem that
> country codes as language modifiers are less than perfectly precise.
> 
> > Also, it may not be clear which CS records should be restored to YU.
> 
> There is never any presumption that someone will go through and retag
> data.  Section 3.1 says, "In particular, the 'Preferred-Value' field
> does not imply retagging content that uses the affected subtag."  To me
> this implies that a change or deletion of P-V doesn't imply retagging
> either.
> 
> > I don't see that the fact that the target preferred value of YU is
> > also deprecated is a good reason to break the relationship at this
> > point. We still end up with deprecated codes with no preferred value
> > to go to, so why introduce an unnecessary change?
> 
> So that users will not have to follow a chain of arbitrary length to
> determine the best subtag -- or in this case, to reach a dead end.
> 
> --
> Doug Ewell  *  Thornton, Colorado, USA  *  RFC 4645  *  UTN #14
> http://www.ewellic.org
> http://www1.ietf.org/html.charters/ltru-charter.html
> http://www.alvestrand.no/mailman/listinfo/ietf-languages
> 
> _______________________________________________
> Ietf-languages mailing list
> Ietf-languages at alvestrand.no
> http://www.alvestrand.no/mailman/listinfo/ietf-languages
