Message-Id: <6.2.1.2.2.20050519013832.04254060@mail.jefsey.com>
Date: Thu, 19 May 2005 03:03:10 +0200
To: "Peter Constable" ,
From: "JFC (Jefsey) Morfin"
Subject: RE: what is Latn?

At 17:56 18/05/2005, Peter Constable wrote:
> > But when you have orthogonal things to relate, as you want to in
> > several documents, you need to have a relational system.
>
>I don't disagree; I was only objecting to the critique that ISO 15924 is
>faulty because it isn't relational in the sense you refer to.

??? Here I am lost. I do not see how ISO 15924 could be faulty. It is a list. There are billions of lists. They are what they are: lists. They can be piles of names, numbers, poems, etc. The problem arises when you want to use some of their items without first having given them a meaning, that is, an "item=definition" pair.

> > As a computer you go by binary stuff. If a computer is to relate French and
> > Latn, it must have binary element it can compare using a program.
> >
> > Now, a person with a bit of logic will do the same.
> >
> > As long as you do not tell me what is in Latn, I cannot tell you if it Latn
> > applies to French.
>
>If you want to dumb down to a level of not assuming anything that's not
>stated explicitly, then you are right. I doubt there are many here who
>operate that way on a regular basis.

This is the difference between poets and programmers. We both live in the same world, but the poet believes his dream is enough, while the programmer knows that he has to declare things before using them - and that gives him a lot of possibilities.

You are a smart guy, but you are being used: your document quoted in ISO 639-4 is used to support an erroneous proposition that I do not think you would really support if you analysed it.

Please consider carefully what happens in real life. Your case and mine.

1. Your case. You think you can assume things which have not been stated explicitly. I have no problem with that, but in programming (or physics, or mathematics) this is called a constant. It is not even a default: it is something that (even if you do not realise it) you assume to be universal, built in. This is good for a few universal constants like the speed of light. But other assumptions are actually common understandings, i.e. part of a culture. The more of them there are, the less likely it is that everyone assumes all of them in the same way.

You say I do not need to define "Latn". Maybe not when you are talking with Mark, Phillip and Michael. But when I ask a Unicode list "what is Latn" (this is what I did), the responses are numerous and confused. You believed you could assume there is one obvious response (subjective, precise, intuitive, etc., I do not know). There is none: there is a controversy. Otherwise the thread would be closed. The result is that you are mired in trying to define the single meaning you assumed.

2. My case. I know that Michael did his homework and that Latn is a good name for a script. I know that a script is a "set of graphic characters used for the written form of one or several languages" (even if I have some problems with that definition). This does not tell me what the set for "Latn" is. So I can ask the Unicode list about Latn and French, and get from some people the components which should be in the set; there are quite knowledgeable people here.

What is interesting is that I can express the list as a partition of the ISO 10646 global character set (actually I cannot, but nearly). This is interesting because I can now work on several defined alternatives. Where you fought to define a concept, I have different named lists, perfectly clear to others, stable and workable. I can give them variant numbers, discuss them, tune them ... and say whether or not one of them supports French, and whether the same ones or others support other languages.

To do that I am not going to argue for hours about the particular case of that letter in that language, etc. I am going to look at the norms associated with that language and at the associated alphabet. If it matches one of the variants, that variant is correct.

You are going to ask me "which norms?" This is the main problem in the whole Davis Addison / Constable logic (quotes of the ISO 639-4 draft): you also assume that you do not need to consider the norms (except orthography (why?)). That norms may not be documented does not mean that they do not exist (grammar, semantics, styles, level of complexity, etc.). I accept this is less worked out in English than in French (actually it is quite worked out in English/American, but you do not realise it: consider the lingual obligations the DoD puts on its contractors, consider the very concept of "Basic English") - but you confuse a language that has a norm set with "computer languages". (By the way, I cannot find the English mathematical word equivalent to "normé", meaning that the norms have been declared or identified - not decided/invented.)

The key element that is missing, and that for the time being kills all the logic of your ISO/IETF proposition, is the way people speak: their best common relational practices, that is, the normative rule set associated with a language, which structures its reference system (which I abbreviate as "referent"). This is what permits a computer to understand, correct, and speak languages.

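As a minimal sketch of the variant approach described above (named candidate lists for "Latn", each checked against the alphabet of a language), assuming Python: the variant names, the character lists, and the helper functions `chars`, `missing` and `supports` are illustrative assumptions only. None of them is defined by ISO 15924 or ISO 10646, and a real list would have to come from the norms of the language.

```python
import unicodedata


def chars(text):
    """Return the set of characters in text, normalized to NFC."""
    return frozenset(unicodedata.normalize("NFC", text))


# Unaccented A-Z and a-z.
BASIC = chars("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz")

# Assumption: letters treated here as part of the French alphabet.
# This is an illustrative list, not an authoritative one.
FRENCH_ACCENTED = chars("àâçéèêëîïôùûüÿÀÂÇÉÈÊËÎÏÔÙÛÜŸ")
FRENCH_LIGATURES = chars("æœÆŒ")
FRENCH_ALPHABET = BASIC | FRENCH_ACCENTED | FRENCH_LIGATURES

# Hypothetical numbered variants of "Latn", each an explicit set.
LATN_VARIANTS = {
    "Latn-1": BASIC,
    "Latn-2": BASIC | FRENCH_ACCENTED,
    "Latn-3": BASIC | FRENCH_ACCENTED | FRENCH_LIGATURES,
}


def missing(variant, alphabet):
    """Characters the alphabet needs that the variant does not contain."""
    return alphabet - variant


def supports(variant, alphabet):
    """A variant supports a language if it contains its whole alphabet."""
    return not missing(variant, alphabet)


if __name__ == "__main__":
    for name, variant in LATN_VARIANTS.items():
        gap = missing(variant, FRENCH_ALPHABET)
        print(name, "supports French:", supports(variant, FRENCH_ALPHABET),
              "- missing characters:", len(gap))
```

With explicit sets, "does Latn apply to French?" becomes a subset test on a declared variant, and a disagreement becomes a discussion about which characters belong in which variant.
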
Then you have the "style", the way people use the language, to fully qualify it - with possible iterative/reciprocal influences.

Now, I fully accept that you can document a language/page using intuitive/subjective/assumed descriptors only, but only for humans, who will post-assume (most probably with occasional misunderstandings) what you meant to say - what you have yourself assumed. You cannot do so for applications. And the risk of confusion/conflict will probably be very high if those humans are from a different culture.

This is why there is a major difference between an informative and a normative description, the very first point to discuss about terms and definitions/purposes. It is the very first question to raise on point ISO 639-4/4.8 and in the xx.txt Drafts. Even your "end users" (which I understand as "out of the reach of SDOs"), and maybe they first, can understand that: I tend to observe it is more difficult for experts, who are more involved in their own stuff. But this is a discussion we already had.

> > And please do not quote Unicode Character Set as a middle reference.
> > It is not an ISO Standard, and it does not fully support French.
>
>Not fully support French? Funny thing, then, that no national body of a
>country with a significant Francophone population has been bringing
>their request to WG2, and that no member of the Unicode Consortium
>selling products to Francophone markets has been requesting changes
>needed to meet the requirements of those markets. I'm curious to know
>what the lacuna might be.

Funny thing that you are so unaware of Microsoft products and clients. You did not know about non-ASCII programming environments, and now this. Please ask your people from Word about the compromise they found to support question marks at the end of a sentence, which obliges all of us to rephrase if Word wants to put the question mark at the beginning of the next line :-) It is the best one can do with what Unicode lacks, but it is not perfect.

The story about the horses' asses, and the comment about a different origin, were interesting: the author of the negative comment did not realise that his more modern origin was the ponies' asses in British mines. Unicode or ISO or ECMA, etc. is not the origin: the origin is the characters and the people. But the story also shows the hysteresis. We will not rebuild Unicode; it is a step ahead which will stay. But there will be other steps.

You give the response yourself: "members of the Unicode Consortium selling". We take Unicode as a good commercial effort by a private company cartel with due commercial motivations. Even if the IAB lacks understanding of languages, it understands R&D funding and results. It has perfectly characterised Unicode (like most of the current efforts) in RFC 3869. We/I share that view.

What is interesting, however, is that the work I do on CRCs shows how we can canonically, in a language- and koine/autonym-independent way, use Unicode and correct what it lacks ... provided a few more common-sense practices and concepts are included in ISO 639-4, the IETF Drafts and global network culture.

Please consider OSI, if you know it. It was the second generation of international networking: it was specified in four or six languages, and the technology was totally multilingual because it was bit oriented. What OSI did, we should be able to do better.

Take care.
jfc