RE: draft-phillips-langtags-08, process, specifications, "stability", and extensions

Thu Dec 30 21:19:15 CET 2004

> From: JFC (Jefsey) Morfin [mailto:jefsey at jefsey.com]

> Dear Peter,
> please let focus on the discussion of draft to be approved by the IESG and
> on its role.

Eh???!! I can't imagine what on earth you think I was talking about if not that.

> This document intends to replace RFC 3066 but does not want to
> take into account RFC published since the RFC 3006, the current IANA
> procedures, the work chartered in some WG, the internet architectural
> principle (RFC 1958).
> 
> There is no problem in having it been accepted for information or
> experimental. There are serious objections to get it approved otherwise.

(RFC 1958 was published since RFC 3066??!!)

Look, the IESG chair is the list administrator for the IETF-languages list, and a participant in its deliberations. If there had been a serious lacuna in the process for moving this draft toward BCP, I think he would have mentioned it on the IETF-languages list a *long* time ago. It was the IESG that issued the Last Call announcement, not me, not the authors of the draft, not anybody else on the IETF-languages list. It appears that the IESG *is* moving it through a process toward BCP, and one can only assume they feel their process is adequate and appropriate.

> >The *meaning* of any given language tag would be no more or less a problem
> >under the proposed revision than it was for RFC 3066 or RFC 1766.
> 
> (a) RFC 3066 was published without considering different usages of the
> proposed language tag format.

Eh???!! That is simply not true. RFC 3066 was developed with full awareness and consideration of all the usage scenarios for RFC 1766, which it replaced. RFC 3066 discusses various IETF and W3C protocols that use language tags. (Have you actually read RFC 3066?)

> (b) nor which authority would document their meanings (plural)

Eh???!! Section 2.2 clearly identifies the authorities that document the meanings of subtags. I have pointed out that there are aspects of meaning that it does not address (which, btw, are not easily resolvable), but that does not imply that RFC 3066 was published without consideration of what authorities document meanings.

> >I think we can all agree that there's no much less likelihood of someone
> >.... I suggest that we not dwell on pathological cases that we aren't
> >really likely to encounter.
> 
> This kind of thinking is not appropriate when standardizing a format.
> Julius Caesar would have though a pathological case to propose that Roman
> should speak Londinium's language.

If Romans had started speaking a variant of Londinium's language, the proposed draft could easily accommodate that situation. That is not pathological. A tag like "sr-Latn-CS-gaulish-boont-guoyu-i-enochian" is pathological. It most certainly *is* appropriate to identify what kinds of examples are or are not valid, as we need to design for *valid* usage scenarios. For any given character set encoding standard, the fact that nonsense character sequences can be devised is not a determining factor in development of that encoding; the same is true here.

> >At this point, I feel confident that it is not a problem to combine script
> >IDs into "language" tags, and this is the consensus of the domain experts
> >that have been discussing this proposed revision for the past year and more.
> 
> This may mean that current reluctances to incorporate originating source
> authority, destination, format conformance, internationalization, icons
> support (and may be additional needs) could be a further consensus. I
> suggest that we save time this time.

??? You want to incorporate these things into the draft, or into language tags themselves? The latter is either not necessary or not appropriate (language tags should *not* include anything to indicate destination). As for inclusion in the draft, the proposed draft is quite clear about source authorities for subtags and about conformance; destination is out of scope and irrelevant. Internationalization? These are symbolic identifiers; they are intrinsically not localizable. Icon support??? I haven't a clue what you're talking about!

> >Not a problem: the proposed revision *allows* for the use of script IDs
> >but does not require them. In the case of audio content, one simply would
> >never include a script ID.
> 
> Accents and types of voice have been documented as necessary items. They
> could use the script and police fields ?

??? If someone needs to tag content to distinguish a particular dialect, the proposed draft can easily accommodate that. If one wants to tag content for minor linguistic details ("this utterance was spoken by someone who has a cleft palate, who was intoxicated at the time, and uses tag-question intonation"), it is a *non-goal* of the proposed draft to accommodate that, as it is not appropriate to try to capture that level of ad hoc detail in a general-purpose metadata element.

> >The bigger problem you're pointing out is the limitations of using
> >suffix-truncation alone as a matching algorithm...

> This shows that language matching algorithm should not be addressed in the
> same document. I also submit that this kind of matching policy should be a
> possible decision of the user. Obviously IA rules should be mentionned.

It doesn't show that matching should not be addressed in the same document; it merely shows that one particular algorithm doesn't meet all needs. It would be possible to move all discussion of matching to another document, but I don't see any reason why that must be done. The draft discusses some general considerations and leaves plenty of room for separate specifications for particular matching algorithms for use in particular applications.
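
To make the limitation concrete, here is a minimal sketch of truncation-style matching, assuming the prefix rule described in RFC 3066 (a language range matches a tag when it equals the tag, or equals a prefix of the tag that ends at a hyphen). The function name is hypothetical and chosen only for illustration; this is not text from the draft.

```python
def matches_by_truncation(language_range: str, tag: str) -> bool:
    """Return True if the range equals the tag or a hyphen-delimited prefix of it.

    Comparison is case-insensitive, since language tags are treated as
    case-insensitive. Illustrative sketch only (hypothetical helper).
    """
    range_parts = language_range.lower().split("-")
    tag_parts = tag.lower().split("-")
    # Truncate the tag to the length of the range and compare subtag by subtag.
    return tag_parts[:len(range_parts)] == range_parts


print(matches_by_truncation("az", "az-Latn-AZ"))       # True
print(matches_by_truncation("az-Latn", "az-Latn-AZ"))  # True
print(matches_by_truncation("az-AZ", "az-Latn-AZ"))    # False
```

The last case is the kind of limitation at issue: a range of "az-AZ" does not match "az-Latn-AZ" because the script subtag sits between the language and the region. That is a property of this one algorithm, and an application with different needs could specify a different one.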

> > > Surely some types
> > > of script is indicated by the charset; in situations where that
> > > is not the case, a separate mechanism could be used for that
> > > orthogonal parameter without breaking compatibility with
> > > existing parsers of language tags.
> >
> >This is all a discussion we on the IETF-languages list went through five
> >years ago, and in the intervening five years I think we have reached a
> >consensus on these issues, that consensus being reflected in the proposed
> >revision to RFC 3066. (Note that we made the relevant decisions over a
> >year and a half ago when we reached a consensus to register az-Latn etc.
> >The precedent was established then; the proposed revision adds nothing new
> >in this regard.)
> 
> Are we sure that this "others have reached a consensus without your
> objections, so we will not consider them" is a valid form of consensus?

I was merely trying to point out that the questions you are asking are not new, that decisions *have* been taken, and that the results are now part of the Internet legacy. You are certainly welcome to consider whether there's a better way and to propose some entirely new infrastructure for the Internet, but that should not prevent those of us who have been working on the evolution of the existing infrastructure for the past several years from continuing to move forward in that evolution.

Or were you suggesting that at any time anybody should be able to question whether standards that have been in use for some time were formed with adequate consensus?

> > > Please see RFC 2026 sections 7.1, 7.1.1, 7.1.3, and 10.1...

> >7.1 says... The fact that not all are used, or that some are
> >used as they were specified in dated version of the ISO standard is not in
> >contradiction with 7.1 -- it's just one of "several ways in which an
> >external specification... may be adopted."
> 
> I am sorry but this does not stand. The proposed revision directly refers
> to ISO standards while there are Internet documentation of the way they
> should be used.
> 
> Examples/
> 1. OSI 3166 is refered to. RFC 1591 should. RFC 1591 introduces differences
> (we all live with) with OSI 3166 which is taken as a reference to know what
> is a country.
> 2. OSI 639 scripting fr-FR is used while RFC 1958 leads to fr-fr or FR-FR
> or FR-fr indifferently and calls for fra-fr to avoid confusion.
> 
> In RFC 1591and RFC 1958 parlance "en-GB" should therefore be "eng-uk"

RFC 1591 and RFC 1958 are specifications for completely separate protocols. Not only is it completely inappropriate to suggest that RFC 1766 or its successors must be subject to these unrelated specifications, to do so would break a large number of existing implementations of RFC 1766 and RFC 3066 (let alone the proposed revision). This is truly nonsense.

> >Thus, I see no difference between RFC 3066 and this proposed revision in
> >relation to compliance with the sections of RFC 2026 you referred to.
> 
> Full agreement. So there is no need for it - except to enhance the RFC 3066
> for its specific applications.  This is OK as long as this is clearly stated.

The goals for the proposed revision in enhancing RFC 3066 are clearly stated in the draft.

Peter Constable