Message-Id: <6.1.2.0.2.20050410013833.03f10490@mail.jefsey.com>
Date: Sun, 10 Apr 2005 02:07:29 +0200
To: "Addison Phillips"
From: "JFC (Jefsey) Morfin"
Subject: Re: [Ltru] seeking resolution of the Great Script Debate
In-Reply-To: <634978A7DF025A40BFEF33EB191E13BC0AFA3A97@irvmbxw01.quest.com>
References: <634978A7DF025A40BFEF33EB191E13BC0AFA3A97@irvmbxw01.quest.com>
Cc: ltru Working Group
List-Id: Language Tag Registry Update working group discussion list

Addison,
I may be dumb, but I see that you honestly document a problem that results from changing the input format of a function. I have some difficulty understanding why you want to do that.

What you document is that the current functions work well with the current tags. But if you change them to work well with the new tags, they will no longer work well with the old tags, which seems to make sense.

Why not just do as Jesus said: have new functions for new tags and keep old functions for old tags?

Would it be a very big problem for the W3C to say that xml:lang will keep using the old tag format and that xml:nlang will use a new format built from experience? Since you say libraries will have to be updated and no one is using your new format yet, would the update not simply add nlangtag support alongside the existing langtag support?

By the same token, you could also announce support for xml:xlang to carry the extended language tag we need.

Questions (I am not at all an XML person):
- would it be a big problem if the xml:*lang tag were a URL?
- would it be a big problem if the language parameters were documented in another, more extended way?
jfc

On 00:20 10/04/2005, Addison Phillips said:
>It seems to me that in order to achieve a resolution of this issue, we
>need to take a step back and look concretely at the problem.
>
>This message is going to be somewhat long, since I'm going to look at both
>sides of the problem in detail.
>
>Let's start with the compatibility problem. The claim of Ned, Ira, and
>others is that 3066bis "breaks" existing implementations. I think that
>word "breaks" is problematic, because it does not describe accurately what
>happens. No implementations actually crash when you send them tags that
>they don't recognize (those that do are beyond consideration here). What
>we mean is that the results produced are either not what the user expected
>or are different than what the user previously received for the same request.
>
>There are two kinds of implementation that we've been holding up for
>examination. I'll take them in turn.
>
>First are pure RFC 2616 matching implementations which don't care about
>the contents of a particular subtag. This is equivalent to what 3066bis
>calls a "well-formed" processor. Common examples of these include
>xml:lang, CSS 2.1, Apache's language negotiation mechanism and Kurt's LDAP
>RFC. In these implementations, it doesn't matter if you send them a tag
>like "foo-bar-baz-gleep" or "zh-Hant-CN". All that matters is the matching
>of each token or subtag in order.
>
>For these implementations, script subtags pose a problem if they are
>inconsistently used for a given language prefix. That is, if I sometimes
>request "xx-Latn-CC" and sometimes request "xx-CC" or if I sometimes tag
>content using "xx-Latn-CC" and sometimes as "xx-CC", then I will
>experience problems with matching the "xx-Latn" prefix to "xx-CC" and vice
>versa. The approach 3066bis takes here is to suggest that content authors
>and requesters be systematic in using or not using script subtags.
>
>Let us pause and recognize that there are many implementations of this
>nature and also recognize that this is not by any means "all"
>implementations either.
>
>Second are the kinds of implementations Ned Freed has described, which DO
>care about the contents of specific subtags. In an RFC 3066bis
>implementation, these are validating processors. A well-known example that
>I'll use as a proof-of-concept is the ServletRequest.getLocale method of
>J2EE. This method takes Accept-language tags in the HTTP header and
>attempts to match them to java.util.Locale objects predefined in the Java
>runtime environment. There are roughly 150 of these in a JRE (usually
>fewer, and the number varies by JRE) in the form language_region_variant.
>
>There are four ways that these implementations may react to a tag in the
>form "xx-Latn-CC". First, they may find the language and region code and
>ignore interstitial subtags (producing xx-CC in our example). Second, they
>may find as many subtags in order until they meet one they don't recognize
>(producing xx in our example). Third, they may reject the whole tag as
>unrecognized. Fourth, they may assign wrong values to the wrong fields
>(which in many ways produces the same results as the second option).
>
>Any of these is a valid reaction.
>
>It is certainly possible in the J2EE case for a user to set up all of
>their content and resource bundles to work "correctly" with script
>subtags, but it is significant work to do so. In addition, Locale objects
>created with a script in either the region or variant slot will not work
>correctly and probably return the default language for the particular
>configuration or JRE.
>
>This is not an insignificant problem.
>
>In fact, ironically, I have a demo of this on my own website here:
>
>http://www.inter-locale.com/LocalesDemo.jsp
>
>Only... I wrote the AcceptLanguageBean class that powers this demo around
>my RFC 3066bis implementation and it gets the right answer for
>"zh-Hant-TW" (try it yourself).
>
>So I wrote one that uses pure J2EE here:
>
>http://www.inter-locale.com/bis.jsp
>
>It produces:
>
>Lang = zh
>Region = Hant
>Country = TW
>
>This is essentially the same thing as saying "zh" in terms of results.
>
>There isn't much that we can tell users to do with their content tags or
>requests that will ameliorate this mismatch, since the implementation is
>using values in the subtags to perform some kind of mapping or processing.
>
>This problem (in this example, but not, please note, all possible
>examples) could be addressed by moving the script subtag down in the order.
>
>The problem is the mismatch between RFC 2616 matching's needs (see Misha's
>email) and this J2EE-style matching's needs.
>
>Addison
>
>Addison P. Phillips
>Globalization Architect, Quest Software
>http://www.quest.com
>
>Chair, W3C Internationalization Core Working Group
>http://www.w3.org/International
>
>Internationalization is not a feature.
>It is an architecture.
>
>_______________________________________________
>Ltru mailing list
>Ltru@lists.ietf.org
>https://www1.ietf.org/mailman/listinfo/ltru

_______________________________________________
Ltru mailing list
Ltru@lists.ietf.org
https://www1.ietf.org/mailman/listinfo/ltru
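
For readers following the thread, the two matching styles contrasted in the quoted message can be sketched in a few lines. This is only an illustration under stated assumptions, not the code of any implementation named above: all four function names are hypothetical, and "region" is simplified here to mean any two-letter alphabetic subtag. The first function follows the prefix rule of RFC 2616 section 14.4; the others imitate the positional language_region_variant reading and two of the four reactions to "xx-Latn-CC" listed in the message.

```python
def prefix_match(language_range: str, tag: str) -> bool:
    """RFC 2616 style match: the range equals the tag, or is a prefix of it
    that is immediately followed by "-". Subtag meaning is never inspected."""
    r, t = language_range.lower(), tag.lower()
    return t == r or t.startswith(r + "-")


def positional_split(tag: str) -> dict:
    """Hypothetical positional reading: slot 1 is the language, slot 2 the
    region, slot 3 the variant, regardless of what each subtag really is."""
    return dict(zip(["language", "region", "variant"], tag.split("-")))


def _looks_like_region(subtag: str) -> bool:
    # Simplifying assumption: a region is exactly two ASCII letters.
    return len(subtag) == 2 and subtag.isalpha()


def ignore_interstitial(tag: str) -> str:
    """First reaction: keep the language and the first region-like subtag,
    skipping anything in between."""
    language, *rest = tag.split("-")
    for subtag in rest:
        if _looks_like_region(subtag):
            return f"{language}-{subtag}"
    return language


def truncate_at_unrecognized(tag: str) -> str:
    """Second reaction: stop at the first subtag that does not fit the
    expected slot, keeping only what was recognized before it."""
    language, *rest = tag.split("-")
    if rest and _looks_like_region(rest[0]):
        return f"{language}-{rest[0]}"
    return language


if __name__ == "__main__":
    print(prefix_match("xx-Latn", "xx-CC"))        # inconsistent script use
    print(positional_split("zh-Hant-TW"))          # script lands in region slot
    print(ignore_interstitial("xx-Latn-CC"))
    print(truncate_at_unrecognized("xx-Latn-CC"))
```

Under these assumptions, the prefix matcher fails to relate "xx-Latn" and "xx-CC" in either direction, which is the inconsistency problem described for the first kind of implementation, while the positional reading puts "Hant" in the region slot, as in the bis.jsp output quoted above.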