Date: Wed, 06 Jul 2005 01:18:59 +0200
To: "Dylan N. Pierce", ltru@ietf.org
From: r&d afrac
Subject: Re: [Ltru] Private Use Tags
Message-Id: <6.2.1.2.2.20050706001224.05117b90@mail.afrac.org>
In-Reply-To: <42CB03D4.20801@megared.net.mx>

Dear Dylan,

this request makes a lot of sense. It raises many issues; here are a
few of them.

1. In a way, you characterise the use of a document. I am not sure this
directly fits with the characterisation of a language, but it does
characterise a relation channel. By that I mean the way the document is
intended to be received, sent or exchanged, and from there classified.
Today the proposed Draft leaves this undefined in at least two ways:

- First paragraph. It says "Human beings on our planet have, past and
present, used a number of languages. There are many reasons why one
would want to identify the language used when presenting or requesting
information." One could say that "used" may relate to exchanged,
"presenting" to sent and "requesting" to received, with some variation,
because, for example, requesting does not mean that something was
received.

- Part 2.
"The language tag always defines a language as used (which includes
being spoken, written, signed, or otherwise signaled) by human beings
for communication of information to other human beings. Computer
languages such as programming languages are explicitly excluded."

The problem here is that a language is not defined (what is it? how is
it identified? etc.), yet the langtag is normative of that something.
The usage of the proposed langtags can only be subjective (the
perception of the users), and the languages discussed remain rather
undefined concepts. Your proposition creates language _values_ (the
instantiation of the Kentucky press). I am not sure you can really
qualify it within the Draft framework, because it is filtered by a
medium (the Kentucky press) and not by a community of speakers (unless
you mean the readers - or the authors? - of the Kentucky press). Please
recall that they do not want to accept man/computer and
computer/computer languages. This obviously creates a classification
problem with Star Wars: what is H2D2 speaking? And Yoda, who is not a
human being? This year's Japanese robotics fair showed droids
interacting in Japanese or in English. There is also a vacuum for
computer-generated texts, alarms, etc. Would you introduce "r" for
robots? But would that not oppose the spirit of the Draft?

2. You want to permit organisations and persons to define their
personal name space (cf. John Klensin's recent Draft on IANA) and
define their own format. This was recently proposed and denied. The
first problem you meet with this is the size of the namespace you need
and its structure. You consider that Microsoft would register
"mcrsoft". Why? Mr. Sungil Yoon, who owns McrSoft.com, has the right to
use it. You will say that "Microsoft" is longer than 8alpha. Right, but
RFC 1766 said that you cannot change that, and it is to be respected by
consensus of this WG.
Obviously you can object to this consensus; I will too, and others
probably will too, and then it will no longer be a consensus. But
short of that, only a user owning universal rights (every class, every
country) can claim a tag (like "mercedes" or "cocacola"); otherwise we
have conflicts.

I note that RFC 2860 can also create a problem for the Draft. Your
naming part is in essence an ICANN part of the IANA: the Registrar and
Examiner must be designated by the ICANN BoD, and appeals are probably
subject to the GAC. This should be reflected in the Draft. Is this
really what you want?

Another problem you may not have investigated is that Microsoft could
have different branches and needs, for example "microsoft.corp",
"microsoft.us", etc., the languages spoken in its different branches
certainly being different. This is why we have introduced three
warnings, which you can find on http://rfc3066.org:

- there is no subtag size limit in the x-tags part
- the "." and the ":" are accepted characters, "." introducing a
  comment or an additional part and ":" possibly permitting the use of
  URNs. For us, "x-en.microsoft.us-Latn-de" qualifies the language of a
  Mr. Gates visiting Germany.

But please note that, although this proposition, initially presented by
F. Charles, cannot conflict with any previous format since it is a
private area, it is not supported by this WG, which is deemed to have
opposed it by consensus (one or two objections).

3. You consider the notions of referent (ex.: commonly accepted reading
level -L-6) and context (to know about the Kentucky cities and life).
These are two levels which are very important to the support of a
relation (together with their dates, as per ISO 11179, to know which
version is to be used). Other referents can be dictionaries, grammars,
publishers, etc. Other contexts can be style, mimics, accents, etc.
These notions are most probably too complex for the Draft; they can be
multiplied, and they need priorities in case of conflict, when two
referential systems have different descriptions. Please accept that
the Draft only supports one single mode (script) and has no provision
(yet) to support other modes.

All this can and should certainly be supported. But this would call
for the introduction of a general framework for language support (BCP
47) within the Internet architecture, as a continuation/extension of
RFC 3066. In this case the Draft would be an application of this
framework. The sentence "This document replaces RFC 3066" should then
be replaced by "this document complements RFC 3066": this is a part of
the debate over the Charter, this WG consensus does seem to want to
engage.

4. You say you do not consider that using domain names would be
adequate, but you do not document it. This is one of the solutions to
support individual/avatars and contexts grids. I would therefore be
interested in seeing you document your position. This is a point which
is hotly debated in some ISO committees, and belongs to what is
qualified as the "pulverisation" of a user-centric Internet (i.e. its
ultimate granularity). Work currently carried on coreboxes and OPES
(WG-OPES) goes down to this degree and even below (the individual
relation level and context: the way you speak when you are with
someone specific, under some identified circumstances; ex.: the
language you use with a cop who stopped you on the road).

Thank you for this interesting thinking.
jfc

At 00:04 06/07/2005, Dylan N. Pierce wrote:
>(This is a re-send of an e-mail I originally sent to the authors of a
>previous draft; I have since been educated as to the proper way to
>comment.)
>
>Dear Mr. Phillips and Mr. Davis,
>
>First, please forgive me if I'm not following proper procedure in
>commenting on this draft; while I do have a strong programmer's
>interest in this standard, I admit that I'm not typically a
>participant in these procedures and haven't thoroughly educated myself
>on the policies for submitting comments.
>
>I would like to recommend an addition to this draft, for which I think
>I can make a rather compelling case based on hypothetical but quite
>reasonable scenarios. Personally, I hope very much that your draft
>becomes a standard, as the problems with a canonical parsing of
>current RFC 3066 language tags are well-known and bothersome to
>developers everywhere. Your draft strikes me as an excellent way to
>finally standardize the practice in a way which will be accessible to
>all developers without having to investigate thirty different
>standards and documents from ten different organizations.
>
>Regarding Section 3.4 on extensions and extension namespace: You
>already have here a mechanism in place for extending this
>specification. I would like to suggest an extension which should
>probably be incorporated into the main specification. I believe you
>should define an "organization convention" extension for use by
>private companies and organizations for their own purposes.
>
>I realize that a "private use" extension is already defined in section
>2.2.7. However, I maintain that the private use extension is not
>sufficient for potential development and interdevelopment among
>important organizations, as there is no way a parsing agent could
>assume anything significant about the tags which follow. And yet, the
>registration of 3.4 extensions is also insufficient because, frankly,
>you'll rapidly run out of letters if you make a sincere effort to
>define namespace for private companies and organizations.
>
>Let's take a concrete example.
>Let's say that the American Library Association (ALA) decides to
>define an extension to help them classify books by reading level. As
>your specification stands, they have two choices: they can register a
>3.4 extension (we'll say they register "L") and then use their subtags
>as follows:
>
>en-US-L-g6: A book written in English as spoken in the United States
>at the sixth-grade reading level.
>
>The ALA would have excellent reasons for wanting such a tag, as it
>would greatly facilitate the database querying and transfer of
>material to public schools.
>
>However, we see the first problem: the ALA has their tag, which many
>schools would use. Then, Associated Press would want their tag to
>indicate regional assumptions. We'll give them "P" (for "press"):
>
>en-US-P-ky: An article written in English as spoken in the United
>States which assumes readers are already familiar with names, cities,
>politics, etc., in Kentucky. (They would use this to distribute
>versions to Kentucky press where they don't have to explain that
>Frankfort is the capital, distinguishing them from national or
>international versions which would make no such assumption and
>explicitly specify that Frankfort is the capital.)
>
>If we keep up like this, as I mentioned, we'll rapidly run out of
>singleton letters. Everyone will want one, some for valid reasons,
>others for silly reasons, and then your registration authority would
>be in the unenviable position of having to make value judgments
>regarding what is valid and what is silly, given such limited real
>estate.
>
>Furthermore, you'll be putting the organizations themselves in a
>difficult position. For example, if the ALA decides to modify their
>convention, this is something that is only of interest to them and the
>people who use their specification.
>However, in order to make their own internal changes, they will
>technically have to go through the entire process of revising a stable
>specification through the registration authority (according to 3.4,
>which requires stability and canonical representation), something
>which is never recommendable.
>
>And finally, parsing agents which have no interest in the ALA's tag
>(which will be most of them) will nonetheless have the burden of
>checking conformance.
>
>If we take the other approach, and say, "We have the 'x' tag for
>private use. The ALA and AP can take that tag and follow it up however
>they want," then we're creating another problem. All of the parsing
>agents which do have an interest in those tags cannot be guaranteed
>that they mean what they think they mean.
>
>For example, if the ALA decides to go with:
>
>en-US-x-ala-g6
>
>But subsequently the Associated Press decides that their private tag
>"x-ala" means articles of interest to Alabamans, then what's the ALA
>to do when they want to classify articles written by the AP? The
>problem is that parsing user agents will be unable to assume anything
>about the tag that follows, and once a conflict occurs, both tags
>become either useless, or subject to the type of interpretation that a
>human might perform easily but a machine cannot.
>
>The solution is simply to define an organizational namespace. We take
>a random tag--we'll say "P" for private--and then allow companies and
>organizations to register their own namespace. Everything that follows
>their namespace tag is then interpreted according to their standard,
>whatever that may be. For example, the ALA would register "ala," the
>AP would register "ap," Microsoft would register "mcrsoft," Adobe
>would register "adobe" and so on.
>
>Then, anyone seeing a tag like this:
>
>en-US-P-ala-g6
>
>could know unambiguously that whatever follows the P-ala is to be
>interpreted by the ALA's own convention, whatever that might be. Each
>registering organization could then be responsible for the stability
>and canonical representations of their own namespace without affecting
>the stability of the specification as a whole.
>
>Parsing agents which are not interested in the AP's tags simply know
>to ignore anything after the "P" tag that isn't an organization in
>which they have an interest. Parsing agents that are interested can
>now know with assurance that the information is what they're looking
>for. Companies and organizations can establish their own standards
>which can easily evolve to suit their needs. Private companies can
>establish compatibility standards between themselves which won't
>affect the specification as a whole.
>
>This could be infinitely extensible merely by setting aside one of the
>organizational tags to mean "check the next set." For example, if the
>American Library Association registers "ala" as above, and then later
>the Association of Libertarians and Anarchists shows up, finds that
>all the mnemonic representations of their name are already used and
>there's not much space left on the registry (and with 36^8
>alphanumeric possibilities, that's not likely, but let's pretend),
>they could define their namespace as "set2-ala" (assuming we've
>already decided that "set2" is the tag which means "check the next
>set").
>
>This allows all companies and organizations which have a need to
>define their own namespaces and then use them as the needs of their
>particular domain indicate in a way that is nonetheless unambiguously
>established for parsing agents which can then make error-free
>decisions about whether or not the information which follows is useful
>to their needs, all done without sacrificing the stability of the main
>specification.
>
>This is the extent of my speculation on the issue. I did consider the
>possibility of using Java-package-name-like identifiers tied to domain
>registration, so that Microsoft could have the "com-microsoft" tag and
>the ALA could have the "org-ala" tag, but this would end up violating
>the eight-character rule and allow just any yahoo with a website to
>include whatever he sees fit (en-US-com-sexychicks-38D comes to mind),
>which I don't think is a desirable solution at all.
>
>If you have found this comment at all useful, I would appreciate
>hearing back.
>
>Sincerely,
>Dylan N. Pierce
>IT Coordinator, TykeTek
>
>TykeTek/Diapositivas Gloria
>Salvador Quevedo y Zubieta #821 Int. 6
>Col. la Perla
>C.P. 44360 Guadalajara, Jal.
>MEXICO
>
>E-Mail: dylanpierce@megared.net.mx
>Telephone: +52 (33) 3617.3660
>Cellular: +52 (33) 1149.7057
>
>_______________________________________________
>Ltru mailing list
>Ltru@lists.ietf.org
>https://www1.ietf.org/mailman/listinfo/ltru
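
For readers who want to see the quoted proposal in concrete terms, here
is a minimal Python sketch of how a parsing agent might split a tag
under the suggested organizational namespace. It is only an
illustration under stated assumptions: the "P" singleton and the "set2"
marker are the hypothetical values used in the quoted message, they are
not part of RFC 3066 or of the Draft, and the names split_org_tag,
OrgTag and ala_reading_level are invented for this example.

```python
from typing import NamedTuple, Optional, Tuple

# Hypothetical values taken from the quoted proposal, not from any RFC.
ORG_SINGLETON = "p"   # singleton that would introduce an organization namespace
NEXT_SET = "set2"     # marker that would mean "check the next set"


class OrgTag(NamedTuple):
    base: Tuple[str, ...]       # subtags before the singleton, e.g. ("en", "us")
    namespace: Tuple[str, ...]  # e.g. ("ala",) or ("set2", "ala")
    private: Tuple[str, ...]    # subtags governed by the organization's convention


def split_org_tag(tag: str) -> Optional[OrgTag]:
    """Split a tag at the hypothetical organization singleton, if present."""
    subtags = tag.lower().split("-")
    # The singleton cannot be the first subtag, so start scanning at index 1.
    for i in range(1, len(subtags)):
        if subtags[i] == "x":
            # Private-use section: nothing after it can be assumed.
            return None
        if subtags[i] == ORG_SINGLETON:
            rest = subtags[i + 1:]
            # A leading "set2" widens the namespace by one subtag.
            width = 2 if rest[:1] == [NEXT_SET] else 1
            if len(rest) < width:
                return None  # singleton with no organization name after it
            return OrgTag(tuple(subtags[:i]),
                          tuple(rest[:width]),
                          tuple(rest[width:]))
    return None


def ala_reading_level(tag: str) -> Optional[str]:
    """An agent that only cares about the (hypothetical) 'ala' namespace."""
    parsed = split_org_tag(tag)
    if parsed is None or parsed.namespace != ("ala",):
        return None  # some other organization, or no namespace at all
    return parsed.private[0] if parsed.private else None


print(split_org_tag("en-US-P-ala-g6"))
print(ala_reading_level("en-US-P-ap-ky"))   # None: not the ALA's namespace
print(split_org_tag("en-US-x-ala-g6"))      # None: plain private use
```

The point of the sketch is the one made in the quoted message: an agent
can decide from the namespace subtag alone whether the remainder
concerns it, which the plain "x" form does not allow.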