Date: Mon, 27 Jun 2005 18:28:32 +0200
To: ltru@ietf.org
From: r&d afrac
Subject: Re: [Ltru] additional changes...
List-Id: Language Tag Registry Update working group discussion list

Dear Erkki,

You are totally right in your comment, both in what you support and in
what you object to in your final part. What I ask the authors to do is to
address the difficulty by using the solution adopted everywhere
(standards, everyday life, etc.): to define what they mean when they use
the terms. This is just an initial recital.

There are two main classes to start with: concept or data. This means
that "Latn" can mean either a general concept, which has to be defined by
examples, versions and dates, or an actual list of characters. You can use
the first one to classify subjectively (no problem for oneself or within
a common culture). You can use the second for a charset. But you may meet
a lot of problems due to subjective conflicts. Is French written in Latn?
I asked on the Unicode list and got no satisfactory response: if I am
correct, there is at least one character used in French which is not
defined in Unicode (there are better experts than me in here). BTW our
reference is ISO, so this is not Unicode but ISO 10646.

Then there are the problems you raise about spoken/read texts. Only the
written mode (the simplest one) is supported, while most languages are
not written. Obviously this is the same for countries and regions.
Authors tend to treat ISO 3166 codes as if they represented regions. They
do not: they represent the names of countries. Let us take a simple
example: an English person who lives in the USA writes a text. Will she
use "en-us" or "en-gb"? There are good reasons for both. It simply calls
for a definition and a rule. A standard is not meant to impose the ideas
of its authors through a small-group consensus by exhaustion, but to
produce a document that everyone on earth will consensually understand,
even if not approve.

And the same goes for languages. What is a language? What is Basic
English? Shall I register Franglish?

I am sorry, but I still do not understand what the Draft is about. Most
of the readers I know agree it can fit documents that use the same
criteria. They only find legal contracts (because what is good for the
defence is good per se) and commercial catalogues (because the nature of
advertising is to make people feel they understand). Nothing from
people's real life. When I write your name in a French page, is it French
or Finnish? I do not know. No one tells me ....

But when asked to describe the "conventional" semantics of the words he
uses (in ways that differ from their definition), he responds: IMHO it is
not the role of this WG (to say what it is talking about). This leaves
room only for common sense. It is quite conventional to say that
common-sense misunderstandings have something in common: they are more
than common.

I am afraid Peter confuses ISO, where he is to make a list purposely
without practical application in mind, and IETF, where he works on a
relational protocol between the author of a page, who documents it with a
langtag, and the reader of the same page, who should understand it the
same way.

jfc

On 16:14 27/06/2005, Erkki Kolehmainen said:
>Dear Mr. Morfin,
>
>You are so right with your statement "I am not sure anyone knows what a
>script can exactly be."
>E.g., there are people who insist that
>Phoenician is not a script, but rather a glyph variation of Hebrew, and
>consequently they would like to prohibit the encoding of Phoenician
>characters in ISO/IEC 10646 and Unicode. In spite (and because) of
>quarrels like this, defining the script used for the encoding is often
>useful. Also, a text may be English, but if it is written and encoded in
>Shavian instead of the Latin script, it would be totally useless for me -
>among many others - to even open the file. One could argue that if the
>text would be rendered e.g. as generated voice, the identification of the
>chosen script together with the language is redundant - in principle, but
>not quite in practice. I suspect that no voice generator would work on
>encodings in both Shavian and Latin scripts or, to be more practical, e.g.
>in Cyrillic and Latin scripts, both of which are used for a number of
>languages even in the same countries (often together with other
>orthographic differences, though).
>
>The countries and regions of this world are not defined with absolute
>precision or optimal granularity, either. Yet, since there is considerable
>local variation in nearly all languages, it is often useful to define the
>applicable region using the most suitable code.
>
>The fact that there can be no absolute precision in the definitions of
>either the languages, scripts or regions, should not prevent us from
>coming up with a practical solution to the problems at hand.
>
>Sincerely,
>
>Erkki I. Kolehmainen
>
>r&d afrac wrote:
>
>>At 21:37 23/06/2005, Addison Phillips wrote:
>>
>>>Finally, I added a short description for each subtag type in the section
>>>on syntax, as pointed out in a recent thread. These probably bear a look
>>>as innovations.
>>
>>Certainly a good thing. But I am afraid this does not address the lack of
>>definition of what all this is about and what is a langtag. Let me try to
>>clarify by using ISO 3166.
>>ISO 3166 is seven different ways to code the
>>name of the countries. You use ISO 3166 for something else (you name
>>it a region and you mix in M.49). The same as Jon Postel used it to
>>designate ccTLDs. You must define somewhere what you are defining with
>>that code.
>>For example, nothing could prevent us in this WG from calling one another
>>by our organisation's name. I would call you Quest. etc. But if we do not
>>define it, no one will know if I refer to you or to your organisation or
>>to its boss, etc.
>>If I am correct (but I did not look in detail), ISO 3166 codes define the
>>name of the countries. ISO 639-3 entries are just codes, to which autonyms
>>or English names are attached. ISO 15924 is a list and I am not sure anyone
>>knows what a script can exactly be (not clear in the text and in
>>Michael's references to it: usually discussed as a reference to the
>>Unicode Scripts.txt file). All this is not ISO 11179 conformant because a
>>metamodel must be homogeneous and ... clear. ISO 639-4 tries to relate
>>them. This would be a good thing. But we do not know yet if it will, and how.
>>Why is this important? Because it makes it possible to describe what a
>>langtag is about. The charter says that the Draft must follow ISO. If you
>>define something which looks homogeneous (you did not do it yet, but you
>>certainly could), but which is not in tune with ISO, we will have a
>>conflict sometime, somewhere. This may seem remote and unimportant to
>>you today. The same as it was remote and unimportant for Harald to define
>>scripts. Some day this will be a big conflict. All the more so as you do
>>not protect yourself in the introduction by specifying a
>>restricted/defined scope. And want to be a BCP 47.
>>One of the major discrepancies you face is about "script", because you are
>>specifying "langtag" for a multimedia system and use two general
>>attributes (language and country) and a specialised one (script), making
>>yourself incompatible with all the non-written modes.
>>To correct that, it
>>is enough to coin an open description of "script" as the descriptor of
>>the mode, using ISO 15924 when it is a written mode.
>>All this is not complex, but must be done precisely, and in an
>>ISO-consistent way, because the charter says so.
>>jfc
>>
>>_______________________________________________
>>Ltru mailing list
>>Ltru@lists.ietf.org
>>https://www1.ietf.org/mailman/listinfo/ltru