RFC 3066bis: Philosophical objection (harsh)

Sun Dec 28 21:55:53 CET 2003

I agree with all of Addison's comments; he and I spent a good deal of time
discussing the different features that went into each draft, and the 02 draft
should address a number of issues that people have raised. We also indicated the
reasons for some of the changes in Section 6. I don't know when 02 will appear
(it has been 10 days now since our submission was acknowledged, but no sign of
it yet!), so I'll copy that section here:

==============
6. Changes from RFC3066

   The main goals were to maintain backward compatibility (so that all
   previous codes would remain valid); reduce the need for large numbers
   of registrations; to provide a more formal structure to allow parsing
   into subtags even where software does not have the latest
   registrations; to provide stability in the face of potential
   instability in ISO 639, 3166, and 15924 codes (*demonstrated
   instability* in the case of ISO 3166); and to allow for external
   extension mechanisms.

   o  Allows ISO15924 script code subtags and allows them to be used
      generatively.

   o  Adds the concept of a variant subtag and allows variants to be
      used generatively.

   o  Adds an extension mechanism which does not require registration to
      use.

   o  Defines the private use tags in ISO639, ISO15924, and ISO3166 as
      the mechanism for creating private use language, script, and
      region subtags respectively

   o  Defines a syntax for private use variant subtags which can be used
      without registration.

   o  Defines a process for handling reuse of values by ISO639,
      ISO15924, and ISO3166 in the event that they register a previously
      used value for a new purpose.

   o  Changes the IANA language tag registry to a language subtag
      registry

   Substantive changes between draft-01 and this version are:

      Removed the year subtag

      Changed from EQUALS SIGN to FULL STOP in the extension mechanism

      Added an IANA Considerations section

      Fixed the ABNF

      Changed the name of the document.

      Updated the introduction, by combining text suggested by Peter
      Constable with the existing text and a few of our own revisions.

      Added a TOC.

      Added the subsection Section 2.4.2 on matching language tags.

      Revised the rules on choosing language tags slightly in Section
      2.3.

      Added this section.
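
The goal stated above of parsing a tag into subtags "even where software
does not have the latest registrations" can be illustrated with a small
sketch. The subtag shapes used here (2-3 letters for language, 4 letters for
script, 2 letters or 3 digits for region, everything else treated as a
variant) and the name split_tag are assumptions made for illustration; they
are not quoted from the draft, and grandfathered or private-use tags such as
those starting with "i-" or "x-" are deliberately not handled.

```python
import re


def split_tag(tag):
    """Split a language tag into subtags by shape alone, without a registry.

    Assumed shapes (hypothetical, for illustration only):
      language: 2-3 letters; script: 4 letters;
      region: 2 letters or 3 digits; remaining subtags: variants.
    """
    subtags = tag.split("-")
    if not re.fullmatch(r"[A-Za-z]{2,3}", subtags[0]):
        raise ValueError(f"unrecognized primary subtag in {tag!r}")

    result = {
        "language": subtags[0].lower(),
        "script": None,
        "region": None,
        "variants": [],
    }
    rest = subtags[1:]

    # An optional script subtag comes first, directly after the language.
    if rest and re.fullmatch(r"[A-Za-z]{4}", rest[0]):
        result["script"] = rest.pop(0).title()

    # An optional region subtag follows the script (or the language).
    if rest and re.fullmatch(r"[A-Za-z]{2}|[0-9]{3}", rest[0]):
        result["region"] = rest.pop(0).upper()

    # Whatever is left is kept as variants, recognized or not.
    result["variants"] = [s.lower() for s in rest]
    return result
```

Under these assumptions, split_tag("de-AT-1996") yields language "de",
region "AT", and variants ["1996"] with no lookup table involved, so a
recipient that has never seen the full tag can still fall back on the parts
it does recognize.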

Mark
__________________________________
http://www.macchiato.com
► शिष्यादिच्छेत्पराजयम् ◄

----- Original Message ----- 
From: "Addison Phillips [wM]" <aphillips at webmethods.com>
To: "Harald Tveit Alvestrand" <harald at alvestrand.no>;
<ietf-languages at alvestrand.no>
Sent: Sun, 2003 Dec 28 10:55
Subject: RE: RFC 3066bis: Philosophical objection (harsh)

> Hi Harald,
>
> Thanks for your comments back. I've been waiting for the -02 draft to be
> posted. I don't know why it isn't posted yet, since Internet-Drafts
> submitted after ours have been posted. Some of your concerns in your
> original email and below have been, I think, addressed in that draft.
>
> I don't think that we're the only two talking. Others have weighed in here
> and there--six or eight folks. And there was a quite lengthy discussion this
> past summer (you'll no doubt recall) in which the inclusion of ISO15924 was
> thoroughly debated. There have even been quite a number of comments since
> then along the lines of "Since we're all waiting for a new RFC..."
>
> Well, Internet-Drafts don't write themselves, so Mark and I wrote this one
> to collect all of that discussion together. I think there may be relative
> agreement to at least the core of our proposal as a result and few people
> will write in unsolicited to say "okay, I agree". There is also the holiday
> season to contend with...
>
> I have compressed your comments and removed most of mine in my response
> below. This is a personal response and not necessarily indicative of Mark's
> opinion.
>
> Best Regards,
>
> Addison
>
> Addison P. Phillips
> Director, Globalization Architecture
> webMethods | Delivering Global Business Visibility
> http://www.webMethods.com
> Chair, W3C Internationalization (I18N) Working Group
> Chair, W3C-I18N-WG, Web Services Task Force
> http://www.w3.org/International
>
> Internationalization is an architecture.
> It is not a feature.
>
> > -----Original Message-----
> > From: Harald Tveit Alvestrand [mailto:harald at alvestrand.no]
> > Sent: Saturday, 27 December 2003 10:40
> > To: aphillips at webmethods.com; ietf-languages at alvestrand.no
> > Subject: RE: RFC 3066bis: Philosophical objection (harsh)
> >
> > >
> > > Let's start with whole tag vs. subtag registration. Whole tag
> > > registration works well when there are only a very few exceptional
> > > tags expected or when atomic tags completely cover the needs of the
> > > users.
> >
> > I would put this slightly differently: The whole tag registration works
> > well when the one who wants to start using a tag (usually the tag
> > generator) is willing to put work into the tag (registering) before
> > starting to use it.
> > Subtag registration works well when the recipient is willing to
> > handle tag combinations where the recipient has no idea what they
> > "mean", and is satisfied with making educated guesses based on the
> > identity of the subtag components.
>
> Whole tag registration doesn't work well for recipients, as evidenced by the
> fact that few implementations support them. My experience is that they are
> more difficult to program for, since a parser by itself is easier to write
> than a parser combined with a lookup table.
>
> In any case, I disagree with your characterization of the difference between
> tags that the recipient knows the meaning of and tags where the recipient
> has no idea. The set of tags that an implementation recognizes and can do
> something meaningful with is already smaller than the set of all possible
> tags. Our proposal makes it easier for implementations to assign value or
> meaning to the unrecognized set of tags. It also makes the structure of the
> tags far more rigorous, making registered values of any sort (whole-tag or
> subtag) more regular and thus easier to process.
> >
> > All implementations must be able to handle tags that they have not seen
> > before. The maintenance of the table is a problem - but if that is the
> > core problem, we should look at solving this problem (such as by
> > defining a fixed format for the table that can be downloaded from IANA,
> > for instance).
> >
> > But if generative tags are used, the two German orthographic variations
> > wouldn't have caused eight "registrations" - they would have generated a
> > near-infinite number of variations, most of which would be meaningless
> > (no-CN-1905?)
>
> Meaningless tags are with us already. They don't seem to cause a lot of
> problems because, in practice, no one uses these legal but ridiculous codes.
> In our draft, the onus is still on the tag generator (as termed above) to do
> the work of registration. What is removed is the need to register many
> variations and the need to convince the community to register multiple
> levels of tags, if they are needed. This doesn't remove the need to convince
> the community, provide supporting documentation, and so on. It does increase
> the likelihood that a registrant will be able to register all of the tag
> variations that they feel they need without "registration fatigue" setting
> in.
>
> It also will lead, I think, to the registered values actually being
> implemented (at least on a rudimentary level, as supported "unrecognized
> stuff") in browsers, XML parsers, Web servers, mail readers, and so forth. I
> think that's an attraction: registered values actually can be used.
> >
> > > 4. "Silly subtag generation" should not be an issue. It has always been
> > > possible to create 'silly' tags or at least tags with dubious
> > > meaning with the generative mechanism. 'es-AQ', 'sv-CO', et cetera.
> >
> > Yes, and at times I think that the inclusion of the ISO 639 generative
> > mechanism in RFC 1766 was a mistake, exactly for this reason.
>
> It's here and it works. Let's not worry so much about Klingon for the
> Neutral Zone or Norwegian for Chile. I think we should concentrate on having
> a model that provides the right level of granularity and structure for the
> job. Going with a table driven mechanism instead of a generative mechanism
> would, I think, be a step backwards.
> >
> > > The description of the registry in the draft is designed to capture
> > > the meaningful uses that a subtag can be put to, without limiting the
> > > subtag's use in the generative mechanism. Implementations might limit
> > > registered subtags to their informative uses.
> >
> > But if there is no whole-tag registration, what is the hard rule
> > that draws the distinction between "informative" and "non-informative"
> > uses?
> > If there is a rule, we're really back with whole-tag registration.
>
> There is no rule. I wrote "might" to indicate what an implementor might
> decide to do. I actually think that very few implementations, if any, will
> provide the ability to construct the user's own tag from subtags at random.
> Instead they will allow users to choose from a list of predetermined tags or
> to type in their favorite tag. Implementations that build more complex
> mechanisms that do allow for generation of any combination of subtags might
> therefore also be willing to commit to the extra effort of providing for or
> restricting to informative use each registered subtag.
>
> Again the question is where the burden should lie. With whole tag
> registration, the burden is on the recipient to have an up to date table of
> values and to deal with cases in which unexpected values are received (what
> to do?). With subtags, the burden is on the sender to choose, but choose
> wisely, the tag that best describes the content. The recipient can then
> assign as much meaning as possible to the value, which may not be enough
> (from the sender's perspective), but may be better than nothing at all (the
> current result).
> >
> > > which basically say: "Use the most exact tag that you can, but no more
> > > exact than is strictly necessary", which effectively says "use en-US,
> > > not en-Latn-US". More guidance here might be provided...
> >
> > Saying something "effectively" often proves in practice to be not saying
> > anything at all - people who do not understand the field will make the
> > wrong choice unless the guidelines are 100% clear.
>
> This was previously discussed at length (during the big blast of messages
> earlier in the year). Most people felt that script codes belong before
> country codes in tags. Strict tag matching compatibility between the draft
> and RFC3066 would require moving the script code after the country code, at
> the expense of moving them from the best semantic position.
>
> > Whole-tag registration limits the number of people who can make
> > mistakes to those who try to register tags. Subtag registration pushes
> > the ability to make mistakes out to the implementors and users.
>
> Already we have registrations that make tag choices unclear: 'zh' or
> 'zh-hans'? 'de' or 'de-1996'?
>
> Another way of looking at subtags is that it changes the structure of the
> registry, not of implementations. Yes, it allows silly tags to 'escape' into
> the wild, but there has always been a burden on the user to choose the tag
> wisely. With subtags at least the impact of a 'mistake' may be muted by the
> fact that the recipient may look inside the tag (in a strictly mandated way)
> to find information about the language.
> >
> > >
> > > Extensions in general. We have contemplated adding rules to make
> > > extensions default ignorable, but that seems overly limiting, at least
> > > for a first pass. The extension mechanism we propose provides a way to
> > > pass language-related metadata in a more structured manner, and even
> > > in a combinatorial manner (using two extension regimes together).
> > > Yes, this is more complex than the current system and we could just
> > > stick with "value" subtags for extensions. But we felt that key/value
> > > provided a powerful mechanism that could address some of the
> > > additional needs of specialized communities without disturbing the
> > > base tags at all.
> >
> > It is a very powerful mechanism. It is also completely useless for open
> > interchange without a registration mechanism or similar way to
> > discuss the "meaning" of the extension.
> >
> > So I still don't understand this one:
> > - If you want it for private exchange, why is it appropriate to use
> > language tags, which are designed for open exchange?
>
> Private extensions allow for very fine grained tagging, possibly of interest
> to a small circle of users, while preserving the possibility of general
> exchange. For example, a Web site that discussed details of dialect
> variation might use extensions to label side-by-side examples, which might
> then be styled differently using CSS, whereas the browser itself would only
> need/see the base language tag to know what rendering rules to apply.
>
> > - If you want it for standardized exchange, why don't you describe the
> > registration work?
>
> Because I think there may be several such extensions. I have one in mind,
> but others will have different needs. Perhaps what is needed is a registry
> for a 'namespace' for such extensions. That's a good idea for a future
> draft...
> >
> > > Undefined Extensions. I envision that external groups with interest in
> > > using the extension mechanism will define the keys and values. It just
> > > didn't seem to make sense to me to saddle IANA with registering those
> > > values. A separate registry for extensions or extension namespaces could
> > > be created. I suppose we could add one...
> >
> > If external groups use it, they will either have to set up a registry or
> > live with the risk of clashing definitions. Registries are cheap.
>
> Agreed.
> >
> > >
> > > I look forward to submitting draft-02 and to your comments on that
> > > version.
> >
> > I will definitely comment. And I do hope that other people on the
> > list will make their opinions known.
>
> Me too.