Message-Id: <6.1.2.0.2.20050302123803.035bc1f0@mail.jefsey.com>
Date: Wed, 02 Mar 2005 17:39:17 +0100
To: iesg@ietf.org
From: "JFC (Jefsey) Morfin"
Cc: randy_presuhn@mindspring.com
Subject: WG-ltru proposition by Randy Presuhn

I have reviewed Randy Presuhn's proposition to create a WG-ltru and the current state of the RFC 3066 revision. As one of the leading opponents of the confusion I expected to result from the bundled propositions of RFC 3066bis as a new BCP 047:

1. I support the proposition as it is proposed. It represents major steps ahead:

- the Draft is split into two documents;
- the Draft will be issued by an IETF WG with all the necessary exposure, under an IESG-approved and IAB-reviewed charter;
- the private ietf-languages@alvestrand.no mailing list, which has linguistics-oriented expertise, will be adequately used.

I am sure the resulting process will issue propositions respecting all the needs, including the ones fully supported by the current BCP 047.

2. I would add comments to tune or extend the proposed charter.

- The role of the IANA as the clearinghouse for Internet protocol parameters should be restated. RFC 3066bis confused the needs of users with the needs of computers: having the IANA support documentary information is OK with me as long as this is documented as an _additional_ task with appropriate rules. This will avoid confusion between an IDN Table tabtag used to filter IDNs and a langtag used to tell users which language they should know to read a page/mail.
- Two different kinds of needs should be clearly differentiated:

(a) the need to extend RFC 3066/BCP 047 to offer a consistent doctrine on the way the IETF/Internet standards process uses external tables, making the IANA the reference over the external source, with possible additions to the IANA, which will then manage conflicts. This is the case for ccTLDs. Examples which could be documented are ISO 639, ISO 3166-1 and 3166-3, ISO 7000, etc., but also other tables which could be proposed (I work on a national French reference center and on tables to document the proximity Internet - GIX, Cities, etc. - http://afrac.org/wdf.org). This document should address the table owners' copyright, the fundamental concepts of tags as variable names (for example in forms, for OPES, etc.), numbering in nomenclatures, synonyms, semantics, registration of grouped and individual codes, etc. This is why I called for a WG-Tags.

(b) the specific needs for a language tag RFC. This RFC should permit three different types of usage:

b.1 - to continue and stabilize the current usage, which works fine in IDN. The addition of the script descriptor is a good point, since languages like Chinese, Japanese and even French and American (to support the personal names of Hispanic or other communities) use different scripts.

b.2 - to support the specific needs of applications such as those of the W3C and, maybe, of other application standardization groups (I am thinking of domotics?).

b.3 - to support a generalized description of the language used in lingual interoperation (simple web services), interintelligibility (mail lists and OPES - which I support with the notion of authoritative reference) and interusability (common procedures of usage - which I support with the notion of style).

NB. I consider that the first two types can be default descriptions in a clearly identified context: b.1 when the context is structural (like IDNs); b.2 when the context can be negotiated/assigned.
For example, an HTML page will use a langtag-based menu using standardized language names (what the Draft 3066bis is about).

I note that the very end-to-end nature of the Internet makes the whole system a single, unique context. By default the Internet is an ASCII system. Interoperability is guaranteed that way. Interintelligibility is OK as long as the language is Basic American (I know, with my Franglish and the resulting problems). User interusability is more or less supported at the application level (we all know the variations in Javascript, browsers, cookies, etc.).

So far, the IETF has only worked on internationalization (the capacity to support non-ASCII items), and it has not gone very far into it yet (the IDN Tables are a sort of locale; the concept of charset is similar to scripts, but I am not sure about the exact relations): langtags are another step ahead. But we should not confuse the three layers of a "Multilingual Internet":

- internationalization: the capacity to support all the languages
- multilingualization: the capacity to support each language in a separate way
- vernacularization: the capacity to support each language in specific users' contexts.

Considering what has been achieved to date to be a support of languages is a real layer violation and a source of confusion and conflicts.

To support various contexts for the entire Internet or parts of it is more complex. To respect end-to-end interoperability, we have to consider network restrictions where end to end is between "everyone with the same language" in order to get full interintelligibility. In the same way, end-to-end interoperability has to be restricted to "everyone following a given set of procedures" in order to get full interusability.
This is not that complex, and I believe that the Internet and the DNS do have most of the necessary built-in tools (the complexity of IDNs IMHO comes from a lack of normal use of the DNS zones: if you accept that ASCII is the default IDN table and that therefore .com should actually read ".ascii.com", and respect the DNS zones, you address the problem in a different way).

Therefore defining the langtags properly is a major step ahead towards a Multilingual Internet. So far the analysis carried out in common seems to show that:

- internationalization calls for three descriptors: language, script, geopolitical location.
- multilingualization calls for one descriptor: the authority (in terms equivalent to the authoritative reference for a DNS zone). Cf. infra.
- vernacularization can call for one descriptor, which I name the "style", and which may actually want to use a list of descriptors.

This leads to a very fine granularity: there are 7260 languages in ISO 639, 240 country codes, 100 scripts in ISO 15924, 20000 dialects in other tables, thousands and thousands of locations in ISO 3166-3. Styles and authorities may amount to hundreds of styles and millions of authorities (I actually think all the users). This cannot be registered. We therefore need a semantic that everyone and every process can use and understand, with registration only once it is needed for a particular reason. This is why this semantic must support inheritance, synonyms, aliases, etc.

Another important point to keep in mind is that the language tags will not only indicate the language a reader must be familiar with to understand a Web page. They will also be used to designate the whole set of charset tables, semantic and grammar rules, dictionaries, equivalences, etc. in a word processor (this is in Randy's proposed charter, and this conforms to reality).
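
To make the granularity argument above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the class name LangTag, its field names, the fallback order and the function names are hypothetical and are not taken from RFC 3066, the 3066bis Draft or the proposed charter. The counts are the ones quoted in this message and are not checked against the standards. The sketch multiplies out the three internationalization descriptors, then shows one possible reading of "inheritance": a tag that is not registered falls back to a less specific tag that is.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional

# Counts as quoted in the message (not verified against the standards).
LANGUAGES = 7260
SCRIPTS = 100
COUNTRY_CODES = 240


def internationalization_combinations() -> int:
    """Language/script/country triples if every combination were registered."""
    return LANGUAGES * SCRIPTS * COUNTRY_CODES


@dataclass(frozen=True)
class LangTag:
    """Hypothetical five-descriptor tag; the field names are illustrative only."""
    language: str
    script: Optional[str] = None
    region: Optional[str] = None
    authority: Optional[str] = None
    style: Optional[str] = None

    def inheritance_chain(self) -> List["LangTag"]:
        """Return this tag followed by progressively less specific tags.

        Assumed fallback order: drop style, then authority, region, script.
        """
        chain = [self]
        current = self
        for name in ("style", "authority", "region", "script"):
            if getattr(current, name) is not None:
                current = replace(current, **{name: None})
                chain.append(current)
        return chain


def resolve(tag: LangTag, registry: Dict[LangTag, str]) -> Optional[str]:
    """Look a tag up in a sparse registry, falling back along its chain."""
    for candidate in tag.inheritance_chain():
        if candidate in registry:
            return registry[candidate]
    return None


if __name__ == "__main__":
    # 7260 * 100 * 240 = 174240000 triples before authorities and styles.
    print(internationalization_combinations())

    # Only one entry is registered; the fully qualified tag inherits from it.
    registry = {LangTag("fr", script="Latn"): "generic French in Latin script"}
    tag = LangTag("fr", script="Latn", region="FR",
                  authority="example.org", style="formal")
    print(resolve(tag, registry))
```

With a fallback rule of this kind, only the tags that really need their own definition have to be registered; every other tag resolves through a less specific one.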
Designating that whole set of resources through the langtag defines _an_ authority for the language used: if the langtag does not permit an authority descriptor, the language will become exclusive to whoever registered it (the pending registrations at the IANA come from only two industry leaders). If the web and the word processors, hence the scanners, use the same language CD, everyone using the language will have to conform to that CD to be read, scanned, processed, printed. This cannot be. And none of us does it.

When I speak of individual authority granularity and style, I only conform to the most used word processor. Open your Word, go to Tools and Options and click on Grammar. You will be asked what you want to add to the "language, script, country" inheritance you defined through the chosen "Language" (Tools, Language): you can choose your dictionaries, making yourself or the source you chose the authoritative reference, and you can choose the style (using all the parameters, you can define thousands of different styles). Yet Word is more widely used throughout the world than XML, and it has worked this way for decades now.

I thank you for your attention.
jfc