Proposal to remove Preferred-Value field for region YU in LTRU
Phillips, Addison
addison at amazon.com
Fri Feb 27 22:56:30 CET 2009
> 8) Separate topic- The number of countries in the world seems to
> grow. This suggests to me that regions being subdivided is not
> going to be a rare event. Perhaps there should be a mechanism to
> indicate subtags that have later been split, so instead of one
> preferred value, there is a way to indicate that a tag has been
> deprecated in favor of two or more possible values.
We don't provide this because of the purpose of P-V in 4646 and 4646bis: the P-V is used in the canonicalization rules in Section 4.5, which require that P-V mappings be unambiguous... that is, that there is only ever one per record. In cases where two or more possible values replace the P-V, canonicalization of the tag cannot proceed without human intervention (unless we're going to invent rules that result in a preferred preferred-value). The majority of extant cases actually fit the P-V model nicely. Cases such as this one, in which a country has broken up, do not fit automatic tag normalization schemes because the original tag/subtag does not carry enough information to determine what to do. In such a case, comments and other information can help human users decide what they want to do. And we still have 'Deprecated' to make clear that a subtag should no longer be assigned.
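A rough sketch of the distinction, in Python. This is an illustration under assumptions, not the procedure from 4646bis: the tables are a small hand-written subset rather than the real registry file, `canonical_region` is a name made up for this example, and `SPLIT_INTO` is a hypothetical field that does not exist in the registry. It only stands in for the kind of one-to-many mapping suggested in point 8.

```python
# Hand-written subset for illustration; not parsed from the registry file.
# Each deprecated region subtag maps to exactly one Preferred-Value.
PREFERRED_VALUE = {
    "BU": "MM",  # Burma -> Myanmar
    "ZR": "CD",  # Zaire -> Democratic Republic of the Congo
}

# HYPOTHETICAL one-to-many field (no such field exists in the registry).
# The successor list is illustrative and deliberately incomplete.
SPLIT_INTO = {
    "YU": ["RS", "ME"],
}


def canonical_region(subtag):
    """Return the canonical region subtag, or raise if a human must choose."""
    subtag = subtag.upper()
    if subtag in PREFERRED_VALUE:
        # Exactly one replacement: the substitution is mechanical.
        return PREFERRED_VALUE[subtag]
    if subtag in SPLIT_INTO:
        # Several candidates: the subtag alone does not say which applies.
        candidates = ", ".join(SPLIT_INTO[subtag])
        raise LookupError(
            "%s has several possible replacements (%s); "
            "cannot canonicalize automatically" % (subtag, candidates)
        )
    # No mapping recorded: the subtag is already in canonical form.
    return subtag
```

With a single P-V the replacement needs no judgment. With a list of candidates the function can only stop and hand the decision to a person, which is the situation described above.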
It's also past the point at which we can meaningfully change 4646bis in LTRU, and I fervently hope we do not need a 4646ter any time soon.
Addison
Addison Phillips
Globalization Architect -- Lab126
Internationalization is not a feature.
It is an architecture.
> -----Original Message-----
> From: ietf-languages-bounces at alvestrand.no
> [mailto:ietf-languages-bounces at alvestrand.no] On Behalf Of Tex Texin
> Sent: Friday, February 27, 2009 12:55 PM
> To: 'Doug Ewell'; ietf-languages at iana.org
> Subject: RE: Proposal to remove Preferred-Value field for region YU
> in LTRU
>
> 1) I used YU/CS as a shorthand for identifying a subtag that could
> be either.
> 2) I understand the inaccuracy between YU and CS. That was not
> offered as the reason for the change however, at least in the mails
> I saw. Perhaps it was an implicit motive.
>
> 3) I understand that there isn't a requirement to change tags. I'll
> make the case another way-
> At some point in time a user attempts to find documents tagged for
> Yugoslavia.
> The search engine, using the then current registry data noting the
> preferred value relationship, matches either YU or CS.
>
> Another user searches for documents for Serbia.
> The search engine, using the then current registry data noting the
> preferred value relationship, matches either YU or CS.
>
> The results are in some sense accurate and complete given the
> history of the subtag.
>
> After the change in the preferred value relationship, the search
> engine does not search for both, since the registry does not
> indicate a relationship. Only one or the other subtag is used for
> each query. However, the query results are now incomplete since
> documents for YU may have been tagged with the one-time preferred
> tag of CS.
>
> 4) Comments are a good thing for recording rationale and tangential
> history. However, implementers are not going to go thru and read
> the comments on any or all tags in order to make a correct
> implementation. They are going to implement based on the schema and
> operate with the data values.
>
> 5) I think the registry should stay as it is with respect to YU and
> CS.
> As CS is now being used, deprecated or not, I don't see a
> compelling motivation to change the value back to YU. Doing so
> would just compound the confusion over the two subtags.
>
> 6) I don't expect users to be walking the registry in any event but
> to use a software package that recommends the optimal value. If
> that software executes a few extra machine cycles to get to CS, so
> be it. (And that is only if the results aren't put into a
> precompiled form.)
>
> 7) I would not argue that preferred value relationships should
> never change. But the motivation to make a change should be
> compelling enough to outweigh the impact of making ambiguous the
> existing tagged data.
>
> 8) Separate topic- The number of countries in the world seems to
> grow. This suggests to me that regions being subdivided is not
> going to be a rare event. Perhaps there should be a mechanism to
> indicate subtags that have later been split, so instead of one
> preferred value, there is a way to indicate that a tag has been
> deprecated in favor of two or more possible values.
>
> tex
>
>
> -----Original Message-----
> From: ietf-languages-bounces at alvestrand.no
> [mailto:ietf-languages-bounces at alvestrand.no] On Behalf Of Doug Ewell
> Sent: Friday, February 27, 2009 5:21 AM
> To: ietf-languages at iana.org
> Subject: Re: Proposal to remove Preferred-Value field for region YU
> in LTRU
>
> Tex Texin <textexin at xencraft dot com> wrote:
>
> > Historically, it was a concern that codes might change.
> > If I use the registry to choose the preferred value for a region,
> > and that preferred value can change, then isn't it tantamount to
> > the code changing?
>
> This would have been a good question for the LTRU group back when
> the decision was made to allow Preferred-Value to change. I'm
> guessing this was about a year ago, but I would have to look it up.
>
> > If I had data that would be represented by YU/CS and after the
> > preferred value is removed it should instead be YU, that seems
> > like a problem.
>
> I guess I'm not sure what you mean by "YU/CS" in this context. A
> language tag contains at most one region subtag, of course.
>
> > Especially since the relationship between CS and YU becomes lost.
>
> Speaking to this particular case and not to the general principle
> of allowing P-V to change...
>
> It has been argued frequently on LTRU that the relationship between
> CS and YU is not what it appears, because the country identified as
> YU changed its nature dramatically between 1991 and 2003, in a way
> that was pertinent to language identification, by shrinking from
> the original "Yugoslavia" to just Serbia and Montenegro. This
> viewpoint holds that data tagged as "something-YU" is already
> ambiguous as to "which YU" is intended. This is really just a
> special case of the problem that country codes as language
> modifiers are less than perfectly precise.
>
> > Also, it may not be clear which CS records should be restored to
> > YU.
>
> There is never any presumption that someone will go through and
> retag data. Section 3.1 says, "In particular, the
> 'Preferred-Value' field does not imply retagging content that uses
> the affected subtag." To me this implies that a change or deletion
> of P-V doesn't imply retagging either.
>
> > I don't see that the fact that the target preferred value of YU
> > is also deprecated is a good reason to break the relationship at
> > this point. We still end up with deprecated codes with no
> > preferred value to go to, so why introduce an unnecessary change?
>
> So that users will not have to follow a chain of arbitrary length
> to determine the best subtag -- or in this case, to reach a dead
> end.
>
> --
> Doug Ewell * Thornton, Colorado, USA * RFC 4645 * UTN #14
> http://www.ewellic.org
> http://www1.ietf.org/html.charters/ltru-charter.html
> http://www.alvestrand.no/mailman/listinfo/ietf-languages
>
> _______________________________________________
> Ietf-languages mailing list
> Ietf-languages at alvestrand.no
> http://www.alvestrand.no/mailman/listinfo/ietf-languages