Proposal to remove Preferred-Value field for region YU in LTRU

Mark Davis mark at macchiato.com
Sat Feb 28 18:49:31 CET 2009


I tend to agree with your reasoning.

Mark


On Sat, Feb 28, 2009 at 08:49, Peter Constable <petercon at microsoft.com> wrote:

> There are two options involving the two records:
>
> A)  YU -> CS
>     CS
>
> Or
>
> B)  YU
>     CS
>
>
> Some observations:
>
>
>
> (i) We all agree that both options result in paths that are dead ends.
>
>
>
> (ii) We all agree that the regions have split into multiple regions, that
> PV cannot be used to indicate multiple regions, and that LTRU should not
> change the 4646bis draft to accommodate data indicating a multi-way split.
>
>
>
> (iii) We all agree that users may wonder how something tagged with YU or CS
> might be tagged under today’s recommendations, and that (optional) comments
> might be useful additions to the records for this purpose. And given (ii),
> comments are the only way to accomplish this.
>
>
>
> In light of those observations, option A is not any better than option B
> for users wondering how Balkan nations have changed and what the
> implications are for tagging: only comments or user research can answer
> that, and either can be applied to either option.
>
>
>
> However, the change from A to B does have an impact on canonicalization
> that can change the behaviour of implementations using it. There is no
> benefit to that behaviour change; it is likely detrimental.
>
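
The canonicalization impact described above can be illustrated with a minimal sketch. It assumes a much simplified registry (a plain mapping from region subtag to Preferred-Value) and simple "language-REGION" tags; the names OPTION_A, OPTION_B, canonical_region and canonicalize are hypothetical and not part of any real library.

```python
# Hypothetical, simplified registry records: region subtag -> Preferred-Value.
# None means the record has no Preferred-Value field.
OPTION_A = {"YU": "CS", "CS": None}   # YU keeps its Preferred-Value
OPTION_B = {"YU": None, "CS": None}   # Preferred-Value removed from YU


def canonical_region(region, registry):
    """Follow Preferred-Value links until a record has none."""
    seen = set()  # guards against an accidental cycle in the data
    while registry.get(region) is not None and region not in seen:
        seen.add(region)
        region = registry[region]
    return region


def canonicalize(tag, registry):
    """Canonicalize the region subtag of a simple 'language-REGION' tag."""
    language, _, region = tag.partition("-")
    if not region:
        return language
    return f"{language}-{canonical_region(region.upper(), registry)}"


for name, registry in (("A", OPTION_A), ("B", OPTION_B)):
    print(name, canonicalize("sr-YU", registry), canonicalize("sr-CS", registry))
```

Under option A both tags reduce to the same canonical form; under option B they stay distinct, which is the behaviour change at issue.
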
>
>
> Hence, it seems the sensible choice is not to remove the PV field for YU,
> but to add comments (not in LTRU process) to the CS and YU records.
>
>
>
>
>
> Peter
>
>
>
> *From:* ietf-languages-bounces at alvestrand.no [mailto:
> ietf-languages-bounces at alvestrand.no] *On Behalf Of* Mark Davis
> *Sent:* Friday, February 27, 2009 1:12 PM
> *To:* Tex Texin
> *Cc:* ietf-languages at iana.org; Doug Ewell
>
> *Subject:* Re: Proposal to remove Preferred-Value field for region YU in
> LTRU
>
>
>
> Ah, I am now finally understanding what you are concerned about. The main
> problem is that the Preferred value really should be a set, in the case of
> regions. Then we would have
>
> Before:
> YU -> CS
>
> After:
> YU -> {RS ME}
> CS -> {RS ME}
>
> and the connection is maintained. But we -- unfortunately -- don't have
> that ability, and I'm not suggesting adding it at this late date (although
> perhaps for a future version - in CLDR we maintain that information because
> it is important for implementations)!
>
> So removing the Preferred-Value CS from YU breaks the equivalence class
> relation between YU and CS.
>
> I'm starting to change my mind about the wisdom of removing the Preferred
> value. After all the purpose is for canonicalization, and xx-YU and xx-CS
> should have the same canonical form. We lose that if we drop the value.
>
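
A rough sketch of the set-valued idea, purely for illustration: the registry has no such field, so the PREFERRED_SET mapping and the helper names below are hypothetical. Two region subtags are treated as equivalent when they map to the same set of successor regions.

```python
# Hypothetical set-valued Preferred-Value data (not a real registry field).
PREFERRED_SET = {
    "YU": frozenset({"RS", "ME"}),
    "CS": frozenset({"RS", "ME"}),
}


def successors(region):
    """Return the successor regions, or the region itself if it has none."""
    return PREFERRED_SET.get(region, frozenset({region}))


def same_class(a, b):
    """Two regions are equivalent if they share the same successor set."""
    return successors(a) == successors(b)


print(same_class("YU", "CS"))
print(same_class("YU", "RS"))
```

With set values, YU and CS stay in one class even though neither points at the other. With single values, that link exists only while YU carries CS as its Preferred-Value.
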
> Mark
>
>  On Fri, Feb 27, 2009 at 12:55, Tex Texin <textexin at xencraft.com> wrote:
>
> 1) I used YU/CS as a shorthand for identifying a subtag that could be
> either.
> 2) I understand the inaccuracy between YU and CS. That was not offered as
> the reason for the change however, at least in the mails I saw. Perhaps it
> was an implicit motive.
>
> 3) I understand that there isn't a requirement to change tags. I'll make
> the case another way-
> At some point in time a user attempts to find documents tagged for
> Yugoslavia.
> The search engine, using the then-current registry data noting the
> preferred value relationship, matches either YU or CS.
>
> Another user searches for documents for Serbia.
> The search engine, using the then-current registry data noting the
> preferred value relationship, matches either YU or CS.
>
> The results are in some sense accurate and complete given the history of
> the subtag.
>
> After the change in the preferred value relationship, the search engine
> does not search for both, since the registry does not indicate a
> relationship. Only one or the other subtag is used for each query. However,
> the query results are now incomplete since documents for YU may have been
> tagged with the one-time preferred tag of CS.
>
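
The search scenario above might look like the following sketch. It assumes a search engine that expands a region query by one Preferred-Value hop in either direction; the registry snapshots BEFORE and AFTER, the DOCUMENTS list and both functions are hypothetical.

```python
# Hypothetical registry snapshots: region subtag -> Preferred-Value (or None).
BEFORE = {"YU": "CS", "CS": None}   # relationship still recorded
AFTER = {"YU": None, "CS": None}    # Preferred-Value removed from YU

# Hypothetical corpus: (document name, region subtag it was tagged with).
DOCUMENTS = [("a.html", "YU"), ("b.html", "CS")]


def expand_query(region, registry):
    """Collect regions linked to `region` by one Preferred-Value hop."""
    related = {region}
    for subtag, preferred in registry.items():
        if preferred is None:
            continue
        if subtag == region:
            related.add(preferred)   # region -> its preferred value
        elif preferred == region:
            related.add(subtag)      # older subtag that points at region
    return related


def search(region, documents, registry):
    """Return names of documents whose region tag matches the expanded query."""
    wanted = expand_query(region, registry)
    return [name for name, tagged in documents if tagged in wanted]


print(search("YU", DOCUMENTS, BEFORE))
print(search("YU", DOCUMENTS, AFTER))
```

With the BEFORE data a query for either subtag finds both documents; with the AFTER data each query finds only the document carrying that exact subtag.
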
> 4) Comments are a good thing for recording rationale and tangential
> history. However, implementers are not going to go thru and read the
> comments on any or all tags in order to make a correct implementation. They
> are going to implement based on the schema and operate with the data values.
>
> 5) I think the registry should stay as it is with respect to YU and CS.
> As CS is now being used, deprecated or not, I don't see a compelling
> motivation to change the value back to YU. Doing so would just compound the
> confusion over the two subtags.
>
> 6) I don't expect users to be walking the registry in any event but to use
> a software package that recommends the optimal value. If that software
> executes a few extra machine cycles to get to CS, so be it. (And that is
> only if the results aren't put into a precompiled form.)
>
> 7) I would not argue that preferred value relationships should never
> change. But the motivation to make a change should be compelling enough to
> outweigh the impact of making ambiguous the existing tagged data.
>
> 8) Separate topic- The number of countries in the world seems to grow. This
> suggests to me that regions being subdivided is not going to be a rare
> event. Perhaps there should be a mechanism to indicate subtags that have
> later been split, so instead of one preferred value, there is a way to
> indicate that a tag has been deprecated in favor of two or more possible
> values.
>
> tex
>
>
>
> -----Original Message-----
> From: ietf-languages-bounces at alvestrand.no [mailto:
> ietf-languages-bounces at alvestrand.no] On Behalf Of Doug Ewell
> Sent: Friday, February 27, 2009 5:21 AM
> To: ietf-languages at iana.org
> Subject: Re: Proposal to remove Preferred-Value field for region YU in LTRU
>
> Tex Texin <textexin at xencraft dot com> wrote:
>
> > Historically, it was a concern that codes might change.
> > If I use the registry to choose the preferred value for a region, and
> > that preferred value can change, then isn't it tantamount to the code
> > changing?
>
> This would have been a good question for the LTRU group back when the
> decision was made to allow Preferred-Value to change.  I'm guessing this
> was about a year ago, but I would have to look it up.
>
> > If I had data that would be represented by YU/CS and after the
> > preferred value is removed it should instead be YU, that seems like a
> > problem.
>
> I guess I'm not sure what you mean by "YU/CS" in this context.  A
> language tag contains at most one region subtag, of course.
>
> > Especially since the relationship between CS and YU becomes lost.
>
> Speaking to this particular case and not to the general principle of
> allowing P-V to change...
>
> It has been argued frequently on LTRU that the relationship between CS
> and YU is not what it appears, because the country identified as YU
> changed its nature dramatically between 1991 and 2003, in a way that was
> pertinent to language identification, by shrinking from the original
> "Yugoslavia" to just Serbia and Montenegro.  This viewpoint holds that
> data tagged as "something-YU" is already ambiguous as to "which YU" is
> intended.  This is really just a special case of the problem that
> country codes as language modifiers are less than perfectly precise.
>
> > Also, it may not be clear which CS records should be restored to YU.
>
> There is never any presumption that someone will go through and retag
> data.  Section 3.1 says, "In particular, the 'Preferred-Value' field
> does not imply retagging content that uses the affected subtag."  To me
> this implies that a change or deletion of P-V doesn't imply retagging
> either.
>
> > I don't see that the fact that the target preferred value of YU is
> > also deprecated is a good reason to break the relationship at this
> > point. We still end up with deprecated codes with no preferred value
> > to go to, so why introduce an unnecessary change?
>
> So that users will not have to follow a chain of arbitrary length to
> determine the best subtag -- or in this case, to reach a dead end.
>
> --
> Doug Ewell  *  Thornton, Colorado, USA  *  RFC 4645  *  UTN #14
> http://www.ewellic.org
> http://www1.ietf.org/html.charters/ltru-charter.html
> http://www.alvestrand.no/mailman/listinfo/ietf-languages
>
> _______________________________________________
> Ietf-languages mailing list
> Ietf-languages at alvestrand.no
> http://www.alvestrand.no/mailman/listinfo/ietf-languages
>