Message-Id: <6.1.2.0.2.20050417140533.042117c0@mail.jefsey.com>
Date: Sun, 17 Apr 2005 17:54:36 +0200
To: "Doug Ewell", "LTRU Working Group"
From: "JFC (Jefsey) Morfin"
Subject: Re: [Ltru] Re: Proposed Text for Moving Forward
In-Reply-To: <00d201c54324$350179e0$030aa8c0@DEWELL>
References:
 <20050416214852.QWXX4543.mta4.adelphia.net@megatron.ietf.org>
 <00d201c54324$350179e0$030aa8c0@DEWELL>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
List-Id: Language Tag Registry Update working group discussion list
Sender: ltru-bounces@lists.ietf.org

I am embarrassed: I have just found out that I only need to support a
point to raise a clever opposition.

I will note, however, that this seems to confirm that this whole
discussion results from a conflict between trying to document general
semantic rules (providing information on script, language, region,
referent and style) and wanting to adapt them to the legacy of several
particular applications. I understand these past-centric motivations,
but they should not be our leading ones.

I can only repeat my proposal, which is to word the Draft as a framework
for specific solutions, documenting the content of the core Registries
in yearly RFCs, and to jointly address, on a case-by-case
(application-by-application) basis, the format/filtering issues using
these Registries in an ad hoc manner, including some online IANA
services when needed.

jfc

At 10:05 17/04/2005, Doug Ewell wrote:
>Addison Phillips wrote:
>
> > I agree, Mark, that the full effect can be achieved with only one
> > field and that your proposal is superior in a number of regards (fewer
> > moving parts, ease of maintenance, ease of application).
> >
> > I proposed two, though, for a reason. One of the objections was that
> > we didn't document when a particular script really ought to be used
> > (i.e. that you really should start to use zh-HanX-XX in preference to
> > zh-XX).
>
>I know I said I would back off and not get involved in the
>default-script issue. OK, so I lied. Sorry about that.
>
>AFAICT, this whole issue started with the concern that people would use
>a script subtag in cases where it was generally thought to be (a)
>unnecessary, because the intended script would be obvious, and (b)
>undesirable, because it would interfere with left-prefix (RFR) matching.
>
>The standard example was "en-Latn-US." The case was made that the
>overwhelming majority of written U.S. English text is written in the
>Latin script, so the added flexibility of being able to specify the
>script would be largely unnecessary, and in particular it would be
>overshadowed by the inability of existing left-prefix matching
>algorithms to match "en-Latn-US" with "en-US" (sometimes generalized to
>"broken backward compatibility").
>
>This was the foundation of "default script": certain languages like
>English could be listed as having a default script of Latin, so that tag
>generators could avoid creating tags like "en-Latn" or "en-Latn-US"
>whose disadvantages would outweigh their advantages.
>
>Of course, in certain circumstances you might have English written in
>Braille, or even in Cyrillic, and most if not all seemed to agree that
>in these rare circumstances it would be acceptable to generate
>"en-Brai-whatever" or "en-Cyrl-whatever."
>
>The standard counterexamples were "zh-Hans" and "zh-Hant." The case was
>made that Chinese is commonly written in both of these script variants,
>and it would often be beneficial to include script information, to the
>point of perhaps being more important than strict compatibility with
>left-prefix matching algorithms.
>Languages like Chinese and Azerbaijani
>and Serbian, after all, were the major use cases for the introduction of
>script subtags in the first place.
>
>So unlike "en-Latn", it would not be discouraged to write "zh-Hans" or
>"zh-Hant", or either of these followed by a region subtag. Of course,
>just like English, Chinese could also be written in a less obvious
>script like Braille or Cyrillic, and so "zh-Brai" and "zh-Cyrl" ought to
>be allowed as well.
>
>I concede that because of the stated predominance of processes that use
>left-prefix matching, it might be beneficial to define a default script
>for common languages that are written in a single script 99.9% of the
>time. I still don't know where the authority comes from to decide which
>languages and which scripts get marked in this way -- definitely not
>from ISO or documented registrations or deterministic rules, like
>everything else in the registry -- but I assume that would be worked out
>in due course.
>
>What I do NOT understand is how this has expanded to telling people when
>they SHOULD use script subtags, and how the set of allowable subtags
>should be limited in some way.
>
>There may well be cases where "zh" or "zh-CN" or "zh-TW" is all that is
>needed, and there is certainly existing data that uses such tags. I
>don't see any justification for discouraging such usage, even if we have
>defined a way to tag Chinese data more precisely. Likewise, if there is
>no prohibition against writing "en-Brai" or "en-Cyrl", then I see no
>reason to prohibit or discourage "zh-Brai" or "zh-Cyrl" either. A
>"required-script" field would do exactly this, by listing 'Hans' and
>'Hant' but not others.
>
>This is too prescriptive. It tells people how they SHOULD tag data, not
>just in terms of "tag content wisely" or "don't be excessively precise,"
>but on a specific language-by-language basis. It assumes, implicitly,
>that this group or ietf-languages has the expertise and authority to
>make this judgment.
>Unlike default-script, required-script does nothing
>to solve the left-prefix matching problem, and as such, I don't think
>it's within the scope of the charter.
>
>I propose the following:
>
>1. An optional, informative default-script field that would suggest to
>tag generators that they not use that particular script subtag together
>with that particular language subtag. This field could be added,
>changed, or removed at any time. (It doesn't matter much what the field
>is called, and I renew my suggestion that we not try to inject too much
>deep meaning into the names of fields, or assume that users will derive
>deep meaning from them.)
>
>2. NO requirement within the draft that tag generators "must not" use
>script subtags in any given scenario. The text in draft-01 that
>discourages the use of a script subtag "unless it conveys additional
>information" should be adequate.
>
>3. NO mechanism to tell tag generators that they "should" use a script
>subtag together with any particular language subtag, and *especially*
>not one that lists the "expected" script subtags while excluding others.
>If tag generators opt to create a tag such as "zh-TW" that "may be
>ambiguous without script information," that should be up to them.
>
>-Doug Ewell
> Fullerton, California
> http://users.adelphia.net/~dewell/
>
>_______________________________________________
>Ltru mailing list
>Ltru@lists.ietf.org
>https://www1.ietf.org/mailman/listinfo/ltru
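
For readers following the thread, the left-prefix matching problem and the informative default-script suggestion described in the quoted message can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration only: the function names, the DEFAULT_SCRIPT table, and its single entry are hypothetical and are not taken from the draft or from any registry.

```python
from typing import Optional

# Hypothetical, informative table: language subtag -> script a generator may omit.
# The single entry mirrors the "en"/"Latn" example from the message; it is not
# real registry data.
DEFAULT_SCRIPT = {"en": "Latn"}


def left_prefix_match(language_range: str, tag: str) -> bool:
    """Left-prefix matching: the range matches when it equals the tag, or when
    it is a prefix of the tag that ends on a subtag boundary ('-')."""
    r = language_range.lower()
    t = tag.lower()
    return t == r or t.startswith(r + "-")


def generate_tag(language: str,
                 script: Optional[str] = None,
                 region: Optional[str] = None) -> str:
    """Build a tag, leaving out the script subtag when it is the (hypothetical)
    default script for the language. Any other script is kept as given."""
    parts = [language]
    if script and DEFAULT_SCRIPT.get(language.lower()) != script.capitalize():
        parts.append(script)
    if region:
        parts.append(region)
    return "-".join(parts)


if __name__ == "__main__":
    # A range of "en-US" does not left-prefix match a tag carrying a script.
    print(left_prefix_match("en-US", "en-Latn-US"))
    # Omitting the default script keeps the tag matchable by "en-US".
    print(generate_tag("en", "Latn", "US"))
    # Scripts without a default entry, or non-default scripts, are preserved.
    print(generate_tag("zh", "Hans", "CN"))
    print(generate_tag("en", "Brai"))
```

In this sketch the table only suppresses a redundant script subtag; it never forbids or requires one, which is the behaviour point 1 of the quoted proposal describes. A generator remains free to emit "zh-TW" with no script at all.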