Proposal to remove Preferred-Value field for region YU in LTRU

Mark Davis mark at macchiato.com
Fri Feb 27 22:11:38 CET 2009


Ah, I am now finally understanding what you are concerned about. The main
problem is that the Preferred value really should be a set, in the case of
regions. Then we would have

Before:
YU -> CS

After:
YU -> {RS ME}
CS -> {RS ME}

and the connection is maintained. But we -- unfortunately -- don't have that
ability, and I'm not suggesting adding it at this late date (although perhaps
for a future version - in CLDR we maintain that information because it is
important for implementations)!

So removing the Preferred-Value of CS from YU breaks the equivalence class
relation between YU and CS.
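
To make the set idea concrete, here is a minimal sketch in Python. The table
PREFERRED_SET and the helpers preferred and same_class are hypothetical names,
and the registry has no set-valued field today; the values are only the ones
from the example above.

```python
# Hypothetical set-valued preferred values; NOT part of the actual registry format.
PREFERRED_SET = {
    "YU": frozenset({"RS", "ME"}),
    "CS": frozenset({"RS", "ME"}),
}

def preferred(region):
    """Return the recorded preferred set, or the region itself if none is recorded."""
    return PREFERRED_SET.get(region, frozenset({region}))

def same_class(a, b):
    """Treat two regions as equivalent when their preferred sets are equal."""
    return preferred(a) == preferred(b)

print(same_class("YU", "CS"))  # True
print(same_class("YU", "RS"))  # False
```

With sets, YU and CS land in the same class because they share {RS ME}, even
though neither points at the other.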

I'm starting to change my mind about the wisdom of removing the Preferred
value. After all the purpose is for canonicalization, and xx-YU and xx-CS
should have the same canonical form. We lose that if we drop the value.
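
A rough sketch of that point, assuming a single-valued table and tags of the
simple form language-REGION only. The names BEFORE, AFTER, canonical_region
and canonicalize are made up for illustration and do not come from any
existing implementation.

```python
# Hypothetical single-valued Preferred-Value tables for regions.
BEFORE = {"YU": "CS"}  # YU -> CS, as in the "Before" example above
AFTER = {}             # Preferred-Value for YU removed

def canonical_region(region, table):
    """Follow Preferred-Value links until a region with no entry is reached."""
    seen = set()  # guards against an accidental cycle in the table
    while region in table and region not in seen:
        seen.add(region)
        region = table[region]
    return region

def canonicalize(tag, table):
    """Canonicalize only the region of a simple 'language-REGION' tag."""
    language, _, region = tag.partition("-")
    if not region:
        return language
    return language + "-" + canonical_region(region, table)

# With the value in place, both tags share one canonical form.
print(canonicalize("xx-YU", BEFORE) == canonicalize("xx-CS", BEFORE))  # True
# Once the value is dropped, they no longer do.
print(canonicalize("xx-YU", AFTER) == canonicalize("xx-CS", AFTER))    # False
```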

Mark


On Fri, Feb 27, 2009 at 12:55, Tex Texin <textexin at xencraft.com> wrote:

> 1) I used YU/CS as a shorthand for identifying a subtag that could be
> either.
> 2) I understand the inaccuracy between YU and CS. That was not offered as
> the reason for the change however, at least in the mails I saw. Perhaps it
> was an implicit motive.
>
> 3) I understand that there isn't a requirement to change tags. I'll make
> the case another way-
> At some point in time a user attempts to find documents tagged for
> Yugoslavia.
> The search engine, using the then current registry data noting the
> preferred value relationship, matches either YU or CS.
>
> Another user searches for documents for Serbia.
> The search engine, using the then current registry data noting the
> preferred value relationship, matches either YU or CS.
>
> The results are in some sense accurate and complete given the history of
> the subtag.
>
> After the change in the preferred value relationship, the search engine
> does not search for both, since the registry does not indicate a
> relationship. Only one or the other subtag is used for each query. However,
> the query results are now incomplete since documents for YU may have been
> tagged with the one-time preferred tag of CS.
>
> 4) Comments are a good thing for recording rationale and tangential
> history. However, implementers are not going to go thru and read the
> comments on any or all tags in order to make a correct implementation. They
> are going to implement based on the schema and operate with the data values.
>
> 5) I think the registry should stay as it is with respect to YU and CS.
> As CS is now being used, deprecated or not, I don't see a compelling
> motivation to change the value back to YU. Doing so would just compound the
> confusion over the two subtags.
>
> 6) I don't expect users to be walking the registry in any event but to use
> a software package that recommends the optimal value. If that software
> executes a few extra machine cycles to get to CS, so be it. (And that is
> only if the results aren't put into a precompiled form.)
>
> 7) I would not argue that preferred value relationships should never
> change. But the motivation to make a change should be compelling enough to
> outweigh the impact of making ambiguous the existing tagged data.
>
> 8) Separate topic- The number of countries in the world seems to grow. This
> suggests to me that regions being subdivided is not going to be a rare
> event. Perhaps there should be a mechanism to indicate subtags that have
> later been split, so instead of one preferred value, there is a way to
> indicate that a tag has been deprecated in favor of two or more possible
> values.
>
> tex
>
>
> -----Original Message-----
> From: ietf-languages-bounces at alvestrand.no [mailto:
> ietf-languages-bounces at alvestrand.no] On Behalf Of Doug Ewell
> Sent: Friday, February 27, 2009 5:21 AM
> To: ietf-languages at iana.org
> Subject: Re: Proposal to remove Preferred-Value field for region YU in LTRU
>
> Tex Texin <textexin at xencraft dot com> wrote:
>
> > Historically, it was a concern that codes might change.
> > If I use the registry to choose the preferred value for a region, and
> > that preferred value can change, then isn't it tantamount to the code
> > changing?
>
> This would have been a good question for the LTRU group back when the
> decision was made to allow Preferred-Value to change.  I'm guessing this
> was about a year ago, but I would have to look it up.
>
> > If I had data that would be represented by YU/CS and after the
> > preferred value is removed it should instead be YU, that seems like a
> > problem.
>
> I guess I'm not sure what you mean by "YU/CS" in this context.  A
> language tag contains at most one region subtag, of course.
>
> > Especially since the relationship between CS and YU becomes lost.
>
> Speaking to this particular case and not to the general principle of
> allowing P-V to change...
>
> It has been argued frequently on LTRU that the relationship between CS
> and YU is not what it appears, because the country identified as YU
> changed its nature dramatically between 1991 and 2003, in a way that was
> pertinent to language identification, by shrinking from the original
> "Yugoslavia" to just Serbia and Montenegro.  This viewpoint holds that
> data tagged as "something-YU" is already ambiguous as to "which YU" is
> intended.  This is really just a special case of the problem that
> country codes as language modifiers are less than perfectly precise.
>
> > Also, it may not be clear which CS records should be restored to YU.
>
> There is never any presumption that someone will go through and retag
> data.  Section 3.1 says, "In particular, the 'Preferred-Value' field
> does not imply retagging content that uses the affected subtag."  To me
> this implies that a change or deletion of P-V doesn't imply retagging
> either.
>
> > I don't see that the fact that the target preferred value of YU is
> > also deprecated is a good reason to break the relationship at this
> > point. We still end up with deprecated codes with no preferred value
> > to go to, so why introduce an unnecessary change?
>
> So that users will not have to follow a chain of arbitrary length to
> determine the best subtag -- or in this case, to reach a dead end.
>
> --
> Doug Ewell  *  Thornton, Colorado, USA  *  RFC 4645  *  UTN #14
> http://www.ewellic.org
> http://www1.ietf.org/html.charters/ltru-charter.html
> http://www.alvestrand.no/mailman/listinfo/ietf-languages
>
> _______________________________________________
> Ietf-languages mailing list
> Ietf-languages at alvestrand.no
> http://www.alvestrand.no/mailman/listinfo/ietf-languages
>