Message-Id: <6.2.1.2.2.20050706232617.05468ae0@mail.afrac.org>
Date: Thu, 07 Jul 2005 03:36:04 +0200
To: "Dylan N. Pierce", ltru@ietf.org
From: r&d afrac
Subject: Re: [Ltru] Private Use Tags
In-Reply-To: <42CC0B62.9030802@megared.net.mx>
References: <42CB03D4.20801@megared.net.mx> <6.2.1.2.2.20050706001224.05117b90@mail.afrac.org> <42CC0B62.9030802@megared.net.mx>
List-Id: Language Tag Registry Update working group discussion list

Dear Dylan,
thank you for your response. I think we agree on many things. But you are a programmer: you know that getting from a concept to a program takes analysis and development. There is also an added complexity, which is that we are the IETF and not Open Source or Unicode. This means that we have to deal with networks, which means asynchronous interactions in real time with unknown final agents. There are simple rules drawn from experience which help through this complexity; they have been summarized by the present IETF Chair in RFC 1958. It is really worth reading, trying to think that way and, if possible, improving on it. Obviously you add your own experience. I see you are following this.

There are three rules which are basic for me:
(a) everything can change, except the rule which says so
(b) KISS: keep it simple, stable, stupid, etc.
(c) scalability, which means that you must be able to apply the theory of a rule everywhere it can apply, and it will work. A corollary is that you should not solve in two different ways two things which may become similar.

In a nutshell this means open consistency. Open: everything is possible. Consistency: one single idea should apply, because sometime, somewhere, you will have a conflict. This is basically what you say everywhere.

There is however a difference between what you say about programming and what I say about architecture: just think a step further, because here we are setting norms. So you _are_ to ask "why?", and to say not "are you sure?" but "if we want that, we all have to do it that way for greatest efficiency": you are like the client of your client. You build the environment he will use to ask you to develop an application.

At 18:48 06/07/2005, Dylan N. Pierce wrote:
>In response to jfc (forgive me if I don't yet understand how to ensure
>that my e-mail appears in the thread below the appropriate post; I freely
>admit this is my first time participating in a procedure such as this).
>
>Whenever I read a standard, for better or for worse, the question I am
>asking myself isn't, "What does this mean?" or "What purpose does this
>serve?" I am always asking myself, "How do I write a program that does
>this?" This is why RFC 3066, for all that it's a BCP, simply is
>inadequate; I am interested enough in this working group to come here and
>express a solid support for the current direction because having a
>language tag which is parsable according to constructable rules greatly
>reduces the amount of work any programmer has to do when developing for
>compliance.

The problem specific to this group was described by Addison. He says, and we all agree: RFC 3066 is too imprecise. There are two responses to that:
1. people like me who say (for my applications) "great, I am free"
2. people like Addison (for his XML or Unicode, etc.)
and you (for your applications) who say "gee, I am lost".

The role of the Draft is to address (2) while protecting (1). Now you say: (3) "great, but I want more", and the role of the Draft should be to provide more. So you see we have a problem: the Draft is expected to provide (1), (2) and (3), and it provides only (2).

What the Internet community (the charter) expects from BCP 47 is to address the problems of RFC 3066, which only provided (1). This can be done in two ways:
1. by writing a language tag framework for the Internet, within which the Draft will support (2) and which will be ready to support as many (3) as you need.
2. because much work has been spent on (2), and because x-tags make it possible to support other formats and explore (3), by keeping (1) and adding (2), simply changing "replace RFC 3066" to "complement RFC 3066". Then we can all work freely.

>As such, I read your first point, regarding characterizing the use of a
>document, and it tells me something interesting about myself. The fact is,
>for all its irony, I'm not typically even remotely interested in how a
>document is used; I tend to focus on how a document is /filed/ by the
>people who use it.

Then you are more a programmer (an author) than a networker. However, please note that you use the words "people who use it". You do not say "people it is intended for". So you do consider some action by the "non-authors".

>If a client tells me, "I want a list of all documents sorted
>alphabetically by the third sentence of the second chapter," I might ask,
>"Why? Are you sure?" but if the client insists, I'll dutifully begin
>writing the appropriate algorithms.
>
>I admit that issues of defining "What is a language?" and its
>philosophical correlates are perhaps of coffee-table interest to me, but
>not of professional interest as a programmer.

I am not sure of that!!! Identifying a language is a very complex bit of programming. The language will be defined by statistical rules, etc.
Look, for example, at the spelling correction in Word, which supports multilingual text and considers the language of each paragraph. It takes a very interesting piece of code to decide that one paragraph is French and the next one is English.

But mainly, as a networker you see that a language is an interaction protocol. There is no basic conceptual difference between http, English and C. But they can be used in many different modes and over many different media. Now, as a programmer you become all of a sudden very interested when, because of that, the language is not going to be the same depending on the mode or the medium, because the mode or the medium is going to interfere with the rendering, or because you will be able to compact exchanges (which is what you do, for example, with an error message).

>Instead I merely want to know how I serve to any random end-user the
>appropriate document following whatever language /he/ thinks he's
>speaking. This means I need my language tags to be /descriptive,/ not
>proscriptive, and they have to be extensible in a logical way.

Descriptive and prescriptive have a meaning only in a context of use: it depends on which partner in the exchange uses them, and at which point of the exchange. If I describe what is necessary, it becomes prescriptive. Full agreement with the extensibility requirement.

>For better or for worse, if we are describing human languages, we must
>deal with the reality that human beings /do/ invent languages.

Yes, all the time. And the worst of it is that they invent them for machines too. If I tell you printf("you %s\n", "right"), what is it?

> It /is/ possible to find websites written in Klingon and poetry written
> in "Yodish." A bit disconcerting, to be sure, but possible. Certainly, if
> someone in my living room was trying to speak to me in Klingon, I'd
> probably request that he get a life, but /personally,/ I can do that.
> Professionally I can't;

Why? If your customer is Lucas, I bet you will do what he wants and you will send your bill.
Open consistency, or scalability.

>I can't use the fact that a man who speaks Esperanto, an equally
>artificial language, is more likely to have a girlfriend than a man who
>speaks Klingon as a justification for design limitations. Again, human
>beings /do/ invent languages and any tagging standard which does not
>account for this reality is inadequate for the task of classifying human
>languages.

Right!

>Effectively, I look at the language tag in this fashion: if, for any two
>random given people, I must use two different syntaxes to say the same
>thing, then I need a different tag. en-US and en-GB are different not just
>because the British like throwing in superfluous u's, but also because the
>word "fanny" gets me in more trouble in one place than in the other.

Right! But this may also be true of people who have lived through something good or bad together that a word alludes to.

>Same with the word "mantequilla" in es-MX versus es-AR. Anyone interested
>in providing global content must be able to navigate these differences
>/and/ similar unpredictable differences which arise in the future. What
>happens, for example, when significant language-use differences are based
>on social class in the exact same region in the exact same tongue?

100% agreement.

>No existing tag accounts for this; can we be sure that 35 possibilities
>for extension protect us against future social and creative inventions?

We can be sure they do not. And why would pentatrigesimal be a limit (why 35, BTW?).

>Extensibility and modularity must be incorporated into the system from its
>inception or the system will fail. Human creativity and pop culture move
>altogether faster than specification revision committees.

100% agreement. But remember you are a programmer, so you must consider how to implement it. The work we carry out is the following:
1. two or more people in dialogue (or polylogue) together establish a space of exchanges.
This space has common references, which we name referents (never mind at this stage what they are).
2. in so doing they establish relations. Each relation has its own context, which modifies (enriches, reduces, alters, etc.) the referents.
3. the exchanges are carried out using person-to-person interintelligibility protocols named "languages", supported by the end-to-end interoperation.
4. the referents and contexts are supported by "CRCs" (common reference centers) where all the protocol elements (which extend far beyond the pure language discussed here, and will include the DNS, the LDAP directories being used, etc.) can be looked up, for example through the DNS, OPES, etc.
5. one of the first things people will do is negotiate the environment of the exchange and of the relation. For example, they will start by negotiating the protocol: http or ftp? English, French, Spanish? Which is the most adequate for the general exchange and for specific relations?

You see that, for example, I use English here, but French would be better for me and I could try Spanish with you. At a given time you negotiate a Spanish context, when you quote "mantequilla" (to me, as a French speaker, es-MX and es-AR are the same). You see that depending on the language we have negotiated at a given time in the exchange or relation, if we quote a Web site it will be different, and if we call an LDAP directory the result may be different if it is multilingual.

>Ultimately, these tags are not, and /cannot/ be, proscriptive for how
>people are /allowed/ to classify languages; you can't program humans like
>a computer. Instead, they need to be descriptive of how human beings /use/
>language--"use" it here in both senses, of how they actually speak, write
>or signal it, and also in how they already classify it for their own
>purposes of transmission, and that descriptiveness must share the same
>capacity for growth as the objects it describes. In other words, the tags
>used to describe languages must themselves be like languages: if the
>language changes from region to region, so must the tags. If languages
>divide or combine over time, so must the tags. And if languages can spring
>wholesale from the minds of hack science-fiction screenwriters, so must
>the tags.

Right. And the way this is done on a distributed network is subsidiarity. This means that one semantic (or several?) is to be defined (for mutual understanding, parsing, filtering, etc.), as this Draft does for one, and people can use it the way they want. When you registered "megared" or "dylanpierce", the rule was first come, first served. You were free.

>Further, if languages can be analyzed for factors important to one
>organization but irrelevant to another, so must the tags. The
>reading-level example, for instance, is intrinsically part of how language
>is used within a culture; the very educational institutions teaching the
>language divide material in this fashion. The regional press example
>points to how material is requested and provided, still however analyzed
>based on sheer linguistic--word choice and level of abstractness--factors.
>And since it would be daunting for a registration body to make any attempt
>at trying to track and describe the myriad of human possibilities for
>interpreting a language, best we put that charge directly in the hands of
>the people who do it for a living.

"who do it for a living" or "who live with it"?

>Certainly this means that corporations will also use their namespace for
>less germane reasons. And fortunately, parsing agents can ignore their
>tags and still remain completely in compliance: a small price to pay for
>effectively making the entire world a de facto but organized registration
>authority for how language is used worldwide.

People empowerment is usually the best solution when you deal with people relations.
However, you run into conflict easily, because people tend to want to do the same thing. You usually have three ways to address this:
- the centralised hierarchy, which quickly leads to control (this is the system the Draft creates with its strict reference to ISO);
- the decentralised system, which needs some kind of technical oligarchy and also has some loose ends (this is the current RFC 3066 system);
- the distributed system, where you do not have hierarchical registration trees but forests (everyone can create their own registry).

Dealing with centralisation is quite simplifying. And a temptation. But it does not work because, as you describe it, the world is centralised, decentralised and distributed at the same time.

>I've been informed both here and privately that perhaps a more appropriate
>approach would be to wait until this document becomes a standard and then
>propose organizational namespace as a new Internet Draft.

This is not possible as long as this Draft wants to replace RFC 3066. It is no problem if it complements it. Then there are two solutions:
- either your proposition is general, and you propose a framework for Internet language identification which will replace RFC 3066 as BCP 47;
- or your proposition only specifies one additional semantic in parallel with the current Draft semantic, and you refer to RFC 3066.

> Certainly, you guys know better than I do what we're up against and I'll
> defer to your best judgment.

Up to you to decide. If the Draft replaces RFC 3066, your extensions will have to obey the Draft, not RFC 3066. Look at what that implies. IMHO if you want to introduce "p-" you are far better off doing it now than starting a new document. Simply because they want the document to go through, and you could block it without "p-" if your "p-" gathers support. Afterwards, everyone will tell you "wait for experience", "we have decided no", etc. We have the IDN experience.
IDNA (Internationalizing Domain Names in Applications) was accepted by WG Members on the ground that it would be experimental and that, if it did not work, we could change it. It does not work. M$ does not intend to implement it yet, after two years, and China has split from the main Internet with its own co-root over the issue in order to correct it. IMHO (and we dispute this, the same as we disputed over IDNs) this Draft, if accepted, will do the same. So it is very important to give it all the possible flexibility, to give it a chance.

>But the entire reason I so strongly support this project is because we
>undeniably /need/ a parsable internationalization architecture (as a
>programmer, /I/ need it, and I have yet to speak to a colleague who accuses
>RFC 3066 of being sufficient) and it needs to speak to all the ways in
>which languages are used, distributed, selected, and even (perhaps I'm
>making enemies here) invented.

Full agreement. But I do not think anyone will say that RFC 3066 is sufficient: RFC 3066 is less restrictive. Actually what is wrong is the "langtag" by itself, instead of supporting all the tags people really need without tying them into rigid supertags.

>Extensibility can be anything from a lifesaver to a mere buzzword. For any
>descriptor to be extensible in a way which has value, it must be
>extensible in exactly the same ways in which the object it describes is
>extensible. If it is not, time and human creativity will obsolete it.

Amen.
jfc

_______________________________________________
Ltru mailing list
Ltru@lists.ietf.org
https://www1.ietf.org/mailman/listinfo/ltru