Language tags and IETF/W3C liaison

JFC (Jefsey) Morfin jefsey at jefsey.com
Thu Dec 30 20:27:50 CET 2004


Dear Misha,
stating your own feelings ad hominem instead of documenting them
does not necessarily help. Nor does quoting our resumes. There are
facts.

I documented precisely:

1. items which are missing from the proposed language tags and
     which IMHO they need in order to adequately support the needs of
     the Internet standard process as of today (not four years ago).

2. discrepancies with the Internet architecture principles.

3. that these points do not prevent the use of the proposed draft
     in the context where RFC 3066 is currently used, all the more
     so as many have documented its urgent use in that context.

These are points you may want to contest, by showing me that
I have misread the RFCs and IANA procedures, or that I am wrong in
supporting draft-phillips-langtags-08. The rest is
inappropriate.

4. the correction of the current discrepancies, the support of
     the missing points and of chartered work, and the global
     evolution of the Internet architecture towards full
     multilingualism call for a liaison with the convergence of
     application level WGs relying on the Internet architecture.
     This should be under IAB guidance, to be requested in line
     with the Internet standard process rules.

This is a point you introduced and I approved. Maybe you
now feel you were wrong. I don't. This is a correct procedure.

Now, since you want to introduce a personal touch and asked, I went to 
http://www.unicode.org/iuc/iuc20/b001.html (your personal biography) to 
understand your own locale. I searched in vain for the string "IETF". This 
only shows that we are considering different layers. I fully understand and 
accept your concerns and I still do not see why I would need to climb
Kilimanjaro to support the same proposition as you (or are you already
there?). I also fully understand that you have difficulty understanding my
language, since my language tag is rfc-1591-1958-2026-3869, with a user
dialect and a Franglish flavor (maybe documented in ISO 639-6 sometime?).

But it is precisely for that reason that I answered you. There is no need
to liaise wrong people from the same side. It is by correcting one
another that we can progress together.

Take care.
jfc

On 30/12/2004, Misha Wolf wrote:

>JFC,
>
>Reading your mails, I have a strong feeling of unreality.  As they lack
>any language tag, I am unable to check the language they are written in,
>but Klingon seems a distinct possibility.  In any event, you seem to
>happily declare that black is white and that the Sun rises in the West.
>
>Below, you criticised RFC 3066 and the draft RFC 3066bis for not using
>your own pet scheme for language tags.  When I responded, telling you
>that this would break the Web, you answered with:
>
> > I fail to see in what they would be incompatible with RFC 3066.
>
>It has long been clear to me that you don't bother digesting the mails
>written by other contributors before launching into yet another
>mind-boggling essay.  I now see that you also don't bother digesting
>your own mails.
>
>You seem to totally disregard clear statements from people who have
>worked in this field for many years, some of them in leading positions
>in industry consortia, such as the W3C and Unicode.
>
>I strongly urge you to retreat to a mountain top somewhere and carefully
>study the way in which language tags are used in HTTP, HTML and XML,
>before writing another word on this subject.
>
>Misha Wolf
>Standards Manager
>Chief Architecture Office
>Reuters
>
>
>-----Original Message-----
>From: JFC (Jefsey) Morfin [mailto:jefsey at jefsey.com]
>Sent: 30 December 2004 17:52
>To: Misha Wolf
>Cc: ietf-languages at alvestrand.no; ietf at ietf.org; WWW International
>Subject: Re: Language tags and IETF/W3C liaison
>
>At 17:51 30/12/2004, Misha Wolf wrote:
>
> >JFC,
> >
> >Your proposals would, as Martin has written, break millions
> >of deployed Web pages.  You have suggested that the IETF
> >should not concern itself with the use of language tags in
> >W3C standards.  It would be quite unacceptable for the IETF
> >to go for a solution which breaks the widespread use of
> >language tags on the Web (eg, as per your proposals below).
>
>I fail to see in what they would be incompatible with RFC 3066. In any
>case it would be quite unacceptable for the IETF to go for a solution
>which would contradict its architecture or limit it to the time of
>RFC 3066.
>
> >I am increasingly thinking that this matter may need to be
> >referred to the IETF/W3C liaison group.
>
>This is what I think reasonable. It might be the occasion of a major
>leap forward for both, based upon multilingualism/vernacular support.
>So it has to be prepared. The best way would be to mention the
>framework of the present draft (I would say an improved support of the
>RFC 3066 compatible usage) in its introduction, to speed its approval,
>and to start working on a possible common replacement or a different
>coordinated format.
>
>jfc
>
>
>
> >Misha Wolf
> >Standards Manager
> >Chief Architecture Office
> >Reuters
> >
> >
> >-----Original Message-----
> >From: ietf-languages-bounces at alvestrand.no
> >[mailto:ietf-languages-bounces at alvestrand.no] On Behalf Of JFC (Jefsey)
> >Morfin
> >Sent: 30 December 2004 15:46
> >To: Peter Constable; ietf-languages at alvestrand.no; ietf at ietf.org
> >Subject: RE: draft-phillips-langtags-08, process, specifications,
> >"stability", and extensions
> >
> >Dear Peter,
> >please let us focus on the discussion of the draft to be approved by
> >the IESG and on its role. This document intends to replace RFC 3066
> >but does not want to take into account the RFCs published since
> >RFC 3066, the current IANA procedures, the work chartered in some WGs,
> >or the Internet architectural principles (RFC 1958).
> >
> >There is no problem in having it accepted as Informational or
> >Experimental. There are serious objections to getting it approved
> >otherwise.
> >
> >At 13:26 30/12/2004, Peter Constable wrote:
> > > > So why not then also throw in the closely linked specification of
> > > > the Content-Language field, which has historically been in the
> > > > same document (RFC 1766)?
> > >
> > >It was removed in the development of RFC 3066, which was appropriate
> > >because it was a particular application involving language tags;
> > >other applications exist, and other applications may use different
> > >approaches for how matching should be done.
> >
> >This means it is only one occurrence of a yet-to-be-defined standard.
> >
> > >The *meaning* of any given language tag would be no more or less a
> > >problem under the proposed revision than it was for RFC 3066 or
> > >RFC 1766.
> >
> >(a) RFC 3066 was published without considering different usages of the
> >proposed language tag format.
> >(b) nor which authority would document their meanings (plural)
> >
> > >I think we can all agree that there's no much less likelihood of
> > >someone .... I suggest that we not dwell on pathological cases that
> > >we aren't really likely to encounter.
> >
> >This kind of thinking is not appropriate when standardizing a format.
> >Julius Caesar would have thought it a pathological case to propose
> >that Romans should speak Londinium's language.
> >
> > >Of course it would not be clear if you don't have a conceptual model
> > >of what "language" tags are identifiers *of*. When RFC 3066 was being
> > >developed, there was a suggestion that script IDs be incorporated,
> > >but some were reluctant, raising the same question you have here. I
> > >was one of those. But I didn't remain obstructionist over the issue;
> > >instead, I gave a fair amount of thought to the ontology that
> > >underlies "language" tags, and subsequently published a white paper
> > >and presented on the topic at two conferences in the spring and fall
> > >of 2002. (Paper is available online at
> > >http://www.sil.org/silewp/abstract.asp?ref=2002-003 -- my thinking
> > >has evolved since then, but some key results remain valid, I think.)
> >
> >May we know which ones?
> >
> > >At this point, I feel confident that it is not a problem to combine
> > >script IDs into "language" tags, and this is the consensus of the
> > >domain experts that have been discussing this proposed revision for
> > >the past year and more.
> >
> >This may mean that the current reluctance to incorporate originating
> >source authority, destination, format conformance,
> >internationalization and icon support (and maybe additional needs)
> >could likewise give way to a further consensus. I suggest that we
> >save time this time.
> >
> > >Not a problem: the proposed revision *allows* for the use of script
> > >IDs but does not require them. In the case of audio content, one
> > >simply would never include a script ID.
> >
> >Accents and types of voice have been documented as necessary items.
> >Could they use the script and font fields?
> >
> > >The bigger problem you're pointing out is the limitations of using
> > >suffix-truncation alone as a matching algorithm. In the discussion
> > >following the registration request for de-1996, etc., there was some
> > >discussion as to whether de-1996-DE format or de-DE-1996 format was
> > >preferable, and in the course of that discussion it was mentioned
> > >that sometimes the 1901 vs 1996 spelling differences would be more
> > >important than the regional dialect differences, but in other
> > >situations the regional differences would be more important than the
> > >spelling. But the problem with prefix matching used e.g. for
> > >Accept-Language is that only one of these two can be supported. That
> > >is a shortcoming in that application.
> >
> >This shows that the language matching algorithm should not be
> >addressed in the same document. I also submit that this kind of
> >matching policy should be a possible decision of the user. Obviously
> >IA rules should be mentioned.
> >
> > >Note that there is nothing that prevents other applications from
> > >using other matching algorithms, including perhaps something that is
> > >able to recognize in "az-AZ" and "az-Latn-AZ" that both involve Azeri
> > >and used in Azerbaijan.
> >
> >And that, as a user, you may find one form more readable than the
> >other.
> >
> > > > Surely some types
> > > > of script is indicated by the charset; in situations where that
> > > > is not the case, a separate mechanism could be used for that
> > > > orthogonal parameter without breaking compatibility with
> > > > existing parsers of language tags.
> > >
> > >This is all a discussion we on the IETF-languages list went through
> > >five years ago, and in the intervening five years I think we have
> > >reached a consensus on these issues, that consensus being reflected
> > >in the proposed revision to RFC 3066. (Note that we made the relevant
> > >decisions over a year and a half ago when we reached a consensus to
> > >register az-Latn etc. The precedent was established then; the
> > >proposed revision adds nothing new in this regard.)
> >
> >Are we sure that this "others have reached a consensus without your
> >objections, so we will not consider them" is a valid form of consensus?
> >
> > > > Please see RFC 2026 sections 7.1, 7.1.1, 7.1.3, and 10.1.
> > > > Note that RFC 3066 strictly complies with those sections, while
> > > > the draft under discussion, by cherry-picking from ISO lists
> > > > for which change control has not been transferred to the IESG,
> > > > does not.
> > >
> > >7.1 says,
> > >
> > ><quote>
> > >To avoid conflict between competing versions of a specification, the
> > >    Internet community will not standardize a specification that is
> > >    simply an "Internet version" of an existing external
> > >    specification unless an explicit cooperative arrangement to do so
> > >    has been made. However, there are several ways in which an
> > >    external specification that is important for the operation and/or
> > >    evolution of the Internet may be adopted for Internet use.
> > ></quote>
> > >
> > >The proposed revision does not create Internet-specific versions of
> > >ISO standards; it uses IDs drawn from ISO standards with semantics
> > >defined in those source standards at the time they were adopted for
> > >use in language tags -- the source for the IDs, the symbols and their
> > >meanings all reside in the ISO standards. The fact that not all are
> > >used, or that some are used as they were specified in dated version
> > >of the ISO standard is not in contradiction with 7.1 -- it's just one
> > >of "several ways in which an external specification... may be
> > >adopted."
> >
> >I am sorry but this does not stand. The proposed revision directly
> >refers to ISO standards while there is Internet documentation of the
> >way they should be used.
> >
> >Examples:
> >1. ISO 3166 is referred to. RFC 1591 should be. RFC 1591 introduces
> >differences (we all live with) from ISO 3166, which is taken as a
> >reference to know what is a country.
> >2. ISO 639 scripting fr-FR is used, while RFC 1958 leads to fr-fr or
> >FR-FR or FR-fr indifferently and calls for fra-fr to avoid confusion.
> >
> >In RFC 1591 and RFC 1958 parlance "en-GB" should therefore be "eng-uk".
> >
> > >Thus, I see no difference between RFC 3066 and this proposed revision
> > >in relation to compliance with the sections of RFC 2026 you referred
> > >to.
> >
> >Full agreement. So there is no need for it - except to enhance
> >RFC 3066 for its specific applications.  This is OK as long as this is
> >clearly stated.
> >
> > >RFC 3066 was developed in exactly the same manner as this proposed
> > >revision has been developed -- as an internet draft prepared by a
> > >member of the IETF-languages list and processed among members of that
> > >list until it was submitted for last call and subsequent IESG action.
> >
> >This certainly raises the question of a dedicated WG. This is a
> >question to ask the IAB, because the support of multilingualism by
> >the Internet standard process is very complex and has many more
> >implications for the whole Internet architecture than anything else.
> >This is clearly shown in RFC 3869, where the IAB does not even
> >mention the issue, as if IAB/IRTF/IESG/IETF were just for a
> >monolingual technology having to support some limited
> >internationalization.
> >
> >IMHO the lack of IAB guidance in that area is the real source of
> >confusion here. I started working on a Draft concerning the support
> >of multilingualization by the Internet standard process. This seems
> >to show that the Internet standard process can cope with it, as it is
> >today, but that some IAB guidance is required and some external,
> >specialized inter-technology assistance (due to digital convergence)
> >is necessary. The real problem, as far as the IETF is concerned, is to
> >work out the best scope for the requests to the IAB.
> >
> >The danger is a confusion created by non-chartered WGs, patching some
> >but not all of the currently existing needs. The simple answer is
> >just to tell which needs they want to address, and not to standardize
> >their use anywhere else without great care.
> >jfc
> >
> >
> >
> >
> >
> >_______________________________________________
> >Ietf-languages mailing list
> >Ietf-languages at alvestrand.no
> >http://www.alvestrand.no/mailman/listinfo/ietf-languages
> >
> >
> >
> >-----------------------------------------------------------------
> >         Visit our Internet site at http://www.reuters.com
> >
> >Get closer to the financial markets with Reuters Messaging - for more
> >information and to register, visit http://www.reuters.com/messaging
> >
> >Any views expressed in this message are those of  the  individual
> >sender,  except  where  the sender specifically states them to be
> >the views of Reuters Ltd.
>
>
>
>


