draft-phillips-langtags-08, process, specifications, "stability", and extensions (Was Language Identifier List Comments, updated)

JFC (Jefsey) Morfin jefsey at jefsey.com
Wed Dec 29 17:02:48 CET 2004

Dear Bruce,
I agree with your comments. I understand that most of them stem from the
uncertain nature of this mailing list, ambiguously positioned between the
IETF and the W3C. I made that point to Misha. This might be different if an
IETF WG were created: this mailing list could then comment on the proposed
deliverables to liaise with the W3C, and there could be equivalent fora to
liaise with ISO, MPEG, ccTLDs, ITU, etc.

Would it be possible to approach this from another perspective? We will face
the language tagging problem in the OPES WG when dealing with SMTP
filtering/servicing and possibly (if, as I wish, we extend to URL
presentation as part of a translation service) with IDNs, etc. Would it
make sense to initiate there an Internet structural RFC which could cite
and extend RFC 3066bis?

The matching mechanism is typically an OPES service (to receive the calling
langtag, to study and negotiate it against a user profile, and to pass on the
resulting acceptable langtag, or to provide a default translation service).
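
To make the idea concrete, here is a minimal sketch of such a negotiation
step. It is only an illustration under stated assumptions: the names
fallback_chain and negotiate are hypothetical (they are not part of OPES or
of the draft), the fallback is the simple "drop the last subtag" truncation,
and private-use or extension subtags get no special treatment.

```python
def fallback_chain(tag):
    """Yield the tag, then each shorter form obtained by dropping the last subtag."""
    subtags = tag.lower().split("-")
    while subtags:
        yield "-".join(subtags)
        subtags.pop()


def negotiate(requested, available, default=None):
    """Return the first available tag reached by falling back from the requested tags.

    requested: tags the caller asks for, in order of preference.
    available: tags the service can actually deliver.
    default:   what to return when nothing matches (e.g. a translation service).
    """
    # Compare case-insensitively, but hand back the tag as the service spelled it.
    by_key = {t.lower(): t for t in available}
    for tag in requested:
        for candidate in fallback_chain(tag):
            if candidate in by_key:
                return by_key[candidate]
    return default
```

With these definitions, negotiate(["fr-CA", "en"], ["en", "fr"]) returns "fr",
and negotiate(["de-AT"], ["en"], default="en") falls through to the default.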


At 16:31 29/12/2004, Bruce Lilly wrote:

> > RE: Language Identifier List Comments, updated
> >  Date: 2004-12-28 18:22
> >  From: "Addison Phillips [wM]" <aphillips at webmethods.com>
> >  To: "JFC (Jefsey) Morfin" <jefsey at jefsey.com>, "John Cowan" <jcowan at reutershealth.com>
> >  CC: ietf-languages at alvestrand.no
> > The draft isn't a process draft. Take your process problems to the IETF
> > or IESG (or W3C or appropriate standards body).
>The draft defines a registration procedure; if it did not do so,
>it would probably not be a candidate for BCP (vs. some other type
>of RFC).  Aside from the process/procedure that the draft seeks to
>establish, there are process/procedure issues having to do with
>the origin of the draft, statements about "extensions", and IETF
>procedures and mission as specified in RFCs 2026, 2418, and 3935.
>And, in accordance with the New Last Call and the procedures
>detailed in RFC 2026, the issues are being taken to the IETF/IESG,
>however much some participants in the discussion may dislike those
> > This draft defines language tags.
>Yes. And a registry format technical specification.  And a matching
>algorithm technical specification. In addition to the registration
> > Other drafts, RFCs, specs, etc. define processes and applications that
> > use them. The appropriate use of language tags is the concern of those
> > specifications.
>Per RFC 2026, an application having specific requirements for use
>of Technical Specifications (TS) should provide an Applicability
>Statement (AS) specifying specific requirement levels for each
>TS involved...
> > If there is some text that this draft should carry to help guide
> > implementations, please suggest it so that we can all consider it.
>It would help immensely if the 3 technical specifications (tag
>format, registry format, matching algorithm) were separated as
>separate documents to facilitate reference as independent TSs,
>and to facilitate any individual extensions/revisions, etc.
>that may be necessary in the future, and to keep those separate
>from the registration procedure which itself may need to be
>separately referenced and/or revised.
> > No, the revision clearly expands the scope of language distinctions
> > that can be represented with a language tag--quite significantly in
> > some cases.
>Indeed, and without registration of the tags and the review process
>associated with that (existing RFC 3066) registration procedure. As
>Harald Alvestrand pointed out some time ago, that (inappropriately)
>shifts implementation effort from the tag generator (no registration
>required) to the recipient (what the heck does this mysterious tag
>actually *mean*).
> > But its grammar is much more restrictive, in part to ensure full
> > backwards compatibility with tiny little applications like, oh, say XML.
>It may have been intended to have been more restrictive, but it
>needs work to achieve that goal (as previously discussed in
>XM who?  What about core Internet protocols such as MIME and the
>Internet Message format (STD 11)?  I believe XML is a w3 consortium
>product, not an IETF product.
> > It also restricts future development of compatible language tags in an
> > effort to ensure that implementations of draft-langtags are stable over
> > time and extended in a controlled manner.
>I still believe there is a problem with the proposed method of
>handling "CS", which is destabilizing (given previously documented
>use of "sr-CS" vs. the demise of Czechoslovakia prior to use of
>country codes in language tags (RFC 1766)).  I have yet to see a
>detailed concrete proposal for a general procedure that would
>ensure stability of the current meaning of "CS" embodied in a
>general principle as part of the registration procedure. [N.B.
>making a special-case exception for "CS" doesn't address the issue.]
> > We greatly expanded what can be represented in four major ways:
> >
> > 1. Added script subtags for writing system variations.
> > 2. Mixed generative and private use subtags for private minor
> >    distinctions in tags.
> > 3. Extensions for really specialized distinctions.
> > 4. UN M49 region codes, including supra-national regions to represent
> >    geographical distinctions not covered by ISO 3166 or by instability
> >    in same.
>It's not entirely clear if some of those items (e.g. script) should
>be expressed by an orthogonal mechanism rather than embedded in a
>*language* tag (for that matter, in retrospect, including country
>codes was probably a bad idea).
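
For readers following the script/region discussion, the sketch below shows
roughly how a recipient could pick those embedded subtags apart. It assumes
the length-based conventions as I understand them from the draft (a 4-letter
script directly after the language, then a 2-letter ISO 3166 code or a
3-digit UN M49 code as the region); the function name classify is
hypothetical, and everything else (variants, extensions, private use) is
simply passed through unexamined.

```python
def classify(tag):
    """Split a language tag into language, optional script, optional region, and the rest."""
    subtags = tag.split("-")
    result = {"language": subtags[0].lower(), "script": None, "region": None, "rest": []}
    rest = subtags[1:]

    def is_alpha(s):
        # Restrict to ASCII letters; str.isalpha() alone would accept other scripts.
        return s.isascii() and s.isalpha()

    # Assumed convention: a 4-letter subtag right after the language is a script.
    if rest and len(rest[0]) == 4 and is_alpha(rest[0]):
        result["script"] = rest.pop(0).title()

    # Assumed convention: 2 letters (ISO 3166) or 3 digits (UN M49) is a region.
    if rest and ((len(rest[0]) == 2 and is_alpha(rest[0]))
                 or (len(rest[0]) == 3 and rest[0].isdigit())):
        result["region"] = rest.pop(0).upper()

    # Whatever is left (variants, extensions, private use) is not interpreted here.
    result["rest"] = rest
    return result
```

For example, classify("sr-Latn-CS") yields language "sr", script "Latn" and
region "CS", while classify("es-419") yields region "419" and no script.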
>The whole "stability" brouhaha seems to be a tempest in a teapot.
>Surely the issue could be addressed in a professional manner by
>reaching an agreement with ISO/UN regarding the issue, as has been
>done for the case of 2-letter vs. 3-letter codes and stability of
>existing 3-letter codes.
> > This is dealt with in Section 2.4.2 "Matching". This section clearly
> > details the fallback mechanism (which is compatible with the one in RFC
> > 3066), as well as some considerations for additional matching that can be
> > done by specialized processors that implement a different mechanism. The
> > matching algorithm is the standard one, but is not mandatory. In fact, I
> > have a paper with Jeremy Carroll on a different matching algorithm that
> > an OWL implementation might use. Read this section of the draft carefully.
>I note that Frank Ellerman has raised some issues, but as yet I
>haven't seen any response.  The existence of multiple mechanisms,
>coupled with issues regarding the one proposed in the draft, is
>a strong indication that the matching algorithm should be split
>into a separate document (possibly as one of multiple Experimental
>RFCs, or as a Standards Track or Informational RFC).
> > If one specifies "en-FR", then one should not expect to receive
> > anything less specific than "en-FR".
>Are you referring to use in Accept-Language fields or in Content-
>Language fields (or equivalent accept/send dichotomy)?
> > In software resources generally one specifies the *most specific*
> > (granular) tag that one will accept and may receive less specific content
> > (which may include the default content).
>Indeed; hence the question above. [I also note in passing that
>IETF deals with the Internet in particular, not with "software
>resources generally".]
> > In language tag matching one specifies the *least specific* tag that
> > one will accept and won't receive anything less specific (although you
> > might receive something more specific).
>I'm not sure; if one indicates acceptance of Franglais (en-FR),
>receiving plain en is probably acceptable.  Receipt of en-FR-<Brittany>
>for whatever mechanism is used to indicate the variant of English
>spoken in the region of Brittany (where Breton is a Celtic language,
>rather than one derived from Latin, like French, or of Germanic root,
>like English) in the country of France, might well be incomprehensible
>to an English-speaking Frenchman from Alsace. [Let's not confuse the
>specific example with the general principle which it illustrates.]
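
The "least specific tag" behaviour debated above corresponds to the prefix
rule of RFC 3066 section 2.5: a language-range matches a tag if it equals
the tag, or equals a prefix of the tag that is immediately followed by "-".
A minimal sketch of that rule follows; the function name range_matches is
hypothetical and the "*" wildcard is deliberately left out.

```python
def range_matches(language_range, tag):
    """Return True if the range equals the tag or is a whole-subtag prefix of it."""
    r, t = language_range.lower(), tag.lower()
    # The "-" guard keeps "en" from matching an unrelated tag such as "eng".
    return t == r or t.startswith(r + "-")
```

Under this rule range_matches("en-FR", "en-FR-x-brittany") is True, whereas
range_matches("en-FR", "en") is False, which is exactly the asymmetry in
question: the more specific tag is delivered, the less specific one is not.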
> > The language tag syntax from RFC 3066 itself cannot be changed.
> > draft-langtags carefully adds restrictions to the ABNF and grammar of the
> > tags to ensure that this is so.
>Again, the implementation falls short of the promise.
> > Changing the sources for existing subtags or the interpretation of any
> > particular existing language tag is not permitted if we are to maintain
> > backwards compatibility.
>Agreed that there would be a backwards compatibility problem with
>changing the source.  Which is why there is an issue with "CS" being
>defined in the ISO lists by reference as is currently the case with
>RFC 3066, vs. the proposal to change the source to a separate IANA
>registry which handles "CS" specially (i.e. differently from many
>other ISO-derived codes).
> > To be perfectly blunt: we've worked over a year on this project. If you
> > have specific comments on this draft, with suggestions for improvements,
> > please send those to the list so that they can be viewed by the community
> > and so that Mark and I can address them. Your suggestions for additional
> > changes to the syntax of language tags we find to be incompatible (to the
> > extent that we understand them) with RFC 3066 and our own work on
> > draft-langtags. You will note that draft-langtags can accommodate your
> > requirements using the mechanisms spelled out above and in the draft...
> > so I fail to see what we should change. If you can express that, we'll
> > consider it. Otherwise you are free to do as we did and write your own
> > draft. Internet-Drafts are a volunteer effort and do not write
> > themselves. Neither is there a Star Chamber of people who create them in
> > the dead of night. If you see a need, fill it. I would suggest: wait for
> > draft-langtags to be an RFC and write an extension that does what you want.
>See RFC 2418; specifically section 2.3 and the comment about consensus
>about a wrong design.  See also the RFC 2026 process requirements and
>RFC 2418 procedures; a group which has no charter or equivalent
>document, no written record of meetings, etc. might very well be
>described as "a Star Chamber of people".
>One doesn't write "extensions" to BCP RFCs (that's one of the problems
>with the agglomeration of specifications in the current document); a
>BCP is replaced wholesale (although in theory it might be possible to
>have two related BCPs coexist per the details in RFC 2026 section 6.3;
>but that is unlikely, and in any event the current draft does not
>contain the sort of statement required to coexist with RFC 3066).