Message-Id: <6.2.3.4.2.20050828234319.0597ed50@mail.afrac.org>
Date: Mon, 29 Aug 2005 02:22:49 +0200
To: "Peter Constable", "LTRU Working Group"
From: "JFC (Jefsey) Morfin"
Cc: iesg@iesg.org, ietf@ietf.org
Subject: STD (was: Last Call: 'Tags for Identifying Languages'toBCP)

An exchange on WG-ltru documents (I do not say "supports": the reader
will judge) the positions I support. It involves:

- Peter Constable: one of the initiators of the project and author of
  ISO 639-3, which lists 7500 languages and is used in building langtags
- Doug Ewell: author of the Draft concerning the initial content of the
  registry
- Debbie Garside: the author of ISO 639-6

At 22:20 28/08/2005, Peter Constable wrote:
>[I'll preface this reply by saying that we don't want to spend too much
>time discussing issues that are not of immediate concern while we've got
>the matching draft and IETF last call on the registry drafts to deal
>with. So, I won't pursue this thread much longer.]

The proposed Draft is based upon the ISO 639-1, -2 and -3 lists of
language names. ISO 639-6 is a list of language use names and IDs. The
proposed langtag is an arbitrary, limited compound of three pieces of
information: language name, script and country. A language
identification MAY call for far more elements, and deliver much more
information. However, these three basic elements are necessary to sell
lingually related products (contracts, ads, documentation, bills) and
to identify the current state-of-the-art "locales" (CLDR project).

The alternative seems to be:

- GO for an e-commercial only multilingual internet, forever.
- NO we do not want the Multilingual Internet to be only commercial.

The decision is NOW. And we understand that Peter and the authors want
to win now, because they have real needs to address now. But I do not
think there is a need for anyone to "win". There is a third response:

- GO for e-commercial multilingual internet support now, as the
  default/immediate solution
- YES to a generalised Multilingual Internet hooked to the RFC 3066
  Draft, however poor it is, using its reserved ABNF hooks.

This means that:

- "fr-Latn-fr" is the default tag based upon ISO 639-1/2/3
- "x-fran" is a private use tag based upon ISO 639-6
- "0-jefsey.com:franver" is my vision of the French of the Palace of
  Versailles, documented by an ISO 11179 conformant system (see below)

> > From:Doug Ewell
> > I'm a bit surprised that a work characterized as a work-in-progress
> > and not yet ready for public review is nevertheless deemed ready
> > to be considered as a draft international standard.
>
>Debbie at no point said that it was -- and it is not. It will be
>December at the earliest that it can be registered as a CD, and it must
>successfully complete a three-month ballot as CD before it can be
>registered as a Draft International Standard. So last spring of 2006 at
>the earliest.

This means that the only purpose of this debate is to lock in a _final_
ABNF via an accepted RFC and a loaded operational IANA registry
_before_ a simpler solution is available three months from now....

> > > In other words, in the system as proposed, you could
> > > use either the alpha-4 representation or the unique DI to find the
> > > closest 639-1,-2,-3 or -5 tags should you so wish.
> >
> > But in language tags, either one value needs to be canonical -- sorry,
> > "preferred" -- over the others, or else the duplicative values should
> > not be added at all.
>
>Your statement doesn't contradict anything that Debbie has said,
>provided the context is ISO 639-6 alone.
>If we were to talk about incorporation of ISO 639-6 into a revision of
>RFC 3066, however, then duplication would become an issue for
>consideration.

It is in the WG-ltru Charter that all the ISO codes be included. As a
user I am not much interested in mixing four formats only to please
Peter Constable and/or Debbie Garside. All the more so since the issue
is the addition of script information to document ... oral expression,
while they miss computer(ised?) languages (definition?), and all this
is done through computers.

>For clarification of Debbie's statement, in the model of ISO 11179, we
>have metadata elements that consist of a data element concept, such as
>'English', and a representation for that, such as "en" or "eng" (these
>would be distinct representations belonging to different value domains).
>Within an metadata registry, a registry item corresponding to 'English'
>can have a Data Identifier (DI), which is a unique identifier *within
>the registry* for that administered item; in this example, that DI could
>be any number of strings, though "English" would be among the better
>choices.

Nice to see that ISO 11179 is accepted now. Peter Constable and the
WG-ltru have opposed any reference to the ISO 11179 model. This model
makes it possible to conceptualise languages and to include in their
description an unlimited number of additional elements. Roughly, it
means that ISO 639-3 is a table of codes (names) related to
undocumented languages, while ISO 639-6 wants to be a root to a base of
objects describing languages. The Draft proposes a very limited version
of that base, with three columns only. This is enough in many cases,
but in an increasing number of cases it is not. Hence the possibility
of using the Draft as a default, since the three elements of the
Draft's langtag are also in the language object base.

CLDR (the Unicode locale project) is a langtag related base. But ISO
11179 totally opens the concept, like C structures: data may expand
indefinitely as metadata.
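
As a rough illustration of the registry item described in the quoted
paragraph above (a concept, its representations in distinct value
domains, and a DI unique within the registry), here is a minimal Python
sketch. The class name RegistryItem, its fields and the describe()
method are hypothetical names chosen for illustration; none of this is
taken from ISO 11179 itself or from the Draft, and the value domain
labels are only an assumption about how the domains might be named.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class RegistryItem:
    """Hypothetical registry item; not an actual ISO 11179 structure."""
    di: str        # Data Identifier, unique within its registry
    concept: str   # data element concept, e.g. 'English'
    # value domain name -> representation in that domain
    representations: Dict[str, str] = field(default_factory=dict)
    # items this one describes when it is used as metadata, keyed by DI
    children: Dict[str, "RegistryItem"] = field(default_factory=dict)

    def describe(self, item: "RegistryItem") -> None:
        """Attach `item` under this one, refusing a duplicate DI."""
        if item.di in self.children:
            raise ValueError(f"DI already registered: {item.di}")
        self.children[item.di] = item


# 'English' with two representations from different value domains.
english = RegistryItem(
    di="English",
    concept="English",
    representations={"ISO 639-1": "en", "ISO 639-2": "eng"},
)

# A data item reused as metadata: "fr" describing a narrower variety.
french = RegistryItem(di="French", concept="French",
                      representations={"ISO 639-1": "fr"})
versailles = RegistryItem(di="franver",
                          concept="French at the Palace of Versailles")
french.describe(versailles)
```

The nesting through `children` is one possible reading of "data
expanding as metadata"; a real registry would presumably need much more
(administration status, versioning, and so on).
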
Each data item can become metadata describing a new dataset. For
example, "fr-latn-FR" can use the three codes as data. This is the case
of the Draft. Or it can use the ISO 639-1 "fr" data as the metadata
describing all the French dialects and many new names to be listed
(which are in ISO 639-6). The "Latn" element of ISO 15924 for Latin
script can become metadata introducing a French charset within a (to be
defined) Latn concept, and also include elements such as fonts,
thickness, color, etc.

This is the approach I support: it is simple enough in extending the DI
concept in a network, but this has to be discussed within ISO/TC 32 (or
in a dedicated WG on reference centers?)

> > Will Linguasphere provide the mapping between the new alpha-4 codes
> > and ISO 639-1/2/3, or is that something a group like this would have
> > to do?
>
>They will be (and must) providing that.

This ensures full compatibility between all the visions using ISO 639-X
codes (documented by the ISO 639-4 guidelines, currently in progress).

> > I agree that the broad question of "what is a language" is out of our
> > scope. The more specific question "what is a taggable language
> > distinction" is perhaps more germane.
>
>Not an unreasonable suggestion.

This is a major step ahead for common understanding! My request, plea,
begging for a definition of what we intend to mean by a "language" in
the Draft context might after all be listened to.

I would then advise that the Draft be sent back to the WG-ltru, with
the suggestion that a lexicon be provided which would define what is a
"language", a "script", a "country", and the purpose (informative,
descriptive, normative?) of a langtag. This might be a big step ahead.

jfc