Message-Id: <6.2.1.2.2.20050627182802.03d3dcf0@mail.afrac.org>
Date: Mon, 27 Jun 2005 18:28:32 +0200
To: ltru@ietf.org
From: r&d afrac <rd@afrac.org>
Subject: Re: [Ltru] additional changes...
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Cc: 
Precedence: list
Sender: ltru-bounces@lists.ietf.org
Errors-To: ltru-bounces@lists.ietf.org

Dear Erkki,
You are entirely right in your comment, both in what you support and in
what you object to in your final part.

What I ask the authors to do is to address the difficulty by using the
solution adopted everywhere (standards, everyday life, etc.): to define what
they mean when they use the terms. This is just an initial recital. There
are two main classes to start with: concept or data. That is, "Latn" can
mean either a general concept, which has to be defined by examples,
versions and dates, or an actual list of characters. You can use the first
one to classify subjectively (no problem for oneself or within a common
culture). You can use the second for a charset. But you may meet a lot of
problems due to subjective conflicts. Is French written in Latn? I asked on
the Unicode list and got no satisfactory response: if I am correct, at
least one character used in French is not defined in Unicode (there are
better experts than me here). BTW, we follow ISO, so this is not Unicode
but ISO 10646. There are also the problems you raise about spoken/read
texts. Only the written mode (the simplest one) is supported, while most
languages are not written.
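
To make the "data" reading concrete, here is a minimal Python sketch. It
assumes, purely for illustration, that "Latn as data" means "every letter
whose Unicode character name begins with LATIN". That rule and the function
names are assumptions made for this example, not a definition taken from
ISO 15924, ISO 10646 or the Draft.

```python
import unicodedata


def is_latin_letter(ch):
    """True if ch is a letter whose Unicode name starts with 'LATIN'.

    Assumption for illustration only: this name-based rule stands in for
    an explicit list of characters.
    """
    return ch.isalpha() and unicodedata.name(ch, "").startswith("LATIN")


def non_latin_letters(text):
    """Return the sorted, de-duplicated letters of text outside the list."""
    return sorted({ch for ch in text if ch.isalpha() and not is_latin_letter(ch)})
```

With this rule, non_latin_letters("Où est le cœur ?") returns an empty
list, so the sentence would count as "Latn" under the data reading. A
different list of characters could give a different answer, which is why
the definition has to be stated.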

Obviously this is the same for countries and regions. Authors tend to deal
with ISO 3166 codes as if they represented regions. They do not: they
represent the names of countries. Let us take a simple example: an English
person who lives in the USA writes a text. Will she use "en-us" or "en-gb"?
There are good reasons for both. It simply calls for a definition and a
rule. The purpose of a standard is not to impose the ideas of its authors
through a small group's consensus by exhaustion, but to produce a document
that everyone on earth will consensually - not approve, but - understand.

The same goes for languages. What is a language? What is Basic English?
Shall I register Franglish?

I am sorry, but I still do not understand what the Draft is about. Most of
the readers I know agree that it can fit documents that use the same
criteria. The only examples they find are legal contracts (because what is
good for the defence is good per se) and commercial catalogues (because
the very nature of advertising is to make people feel they understand).
Nothing from people's real life. When I write your name in a French page,
is it French or Finnish? I do not know. No one tells me ....

But when asked to describe the "conventional" semantics of the words he
uses (in ways that differ from their definition), he responds: IMHO it is
not the role of this WG (to say what it is talking about). This leaves room
only for common sense.

It is quite conventional to say that common-sense misunderstandings have
something in common: they are more than common.
I am afraid Peter confuses ISO, where his job is to make a list purposely
without any practical application in mind, with the IETF, where he works on
a relational protocol between the author of a page, who documents it with a
langtag, and the reader of the same page, who should understand it the same
way.

jfc





On 16:14 27/06/2005, Erkki Kolehmainen said:
>Dear Mr. Morfin,
>
>You are so right with your statement "I am not sure anyone knows what a 
>script can exactly be." - E.g., there are people who insist that 
>Phoenician is not a script, but rather a glyph variation of Hebrew, and 
>consequently they would like to prohibit the encoding of Phoenician 
>characters in ISO/IEC 10646 and Unicode. In spite (and because) of 
>quarrels like this, defining the script used for the encoding is often 
>useful. Also, a text may be English, but if it is written and encoded in 
>Shavian instead of the Latin script, it would be totally useless for me - 
>among many others - to even open the file. One could argue that if the 
>text would be rendered e.g. as generated voice, the identification of the 
>chosen script together with the language is redundant - in principle, but 
>not quite in practice. I suspect that no voice generator would work on 
>encodings in both Shavian and Latin scripts or, to be more practical, e.g. 
>in Cyrillic and Latin scripts, both of which are used for a number of 
>languages even in the same countries (often together with other 
>orthographic differences, though).
>
>The countries and regions of this world are not defined with absolute 
>precision or optimal granularity, either. Yet, since there is considerable 
>local variation in nearly all languages, it is often useful to define the 
>applicable region using the most suitable code.
>
>The fact that there can be no absolute precision in the definitions of 
>either the languages, scripts or regions, should not prevent us from 
>coming up with a practical solution to the problems at hand.
>
>Sincerely,
>
>Erkki I. Kolehmainen
>
>r&d afrac wrote:
>
>>At 21:37 23/06/2005, Addison Phillips wrote:
>>
>>>Finally, I added a short description for each subtag type in the section 
>>>on syntax, as pointed out in a recent thread. These probably bear a look 
>>>as innovations.
>>
>>Certainly a good thing. But I am afraid this does not address the lack of
>>definition of what all this is about and what a langtag is. Let me try to
>>clarify using ISO 3166. ISO 3166 is seven different ways to code the
>>names of countries. You use ISO 3166 for something else (you call
>>it a region and you mix in M.49), just as Jon Postel used it to
>>designate ccTLDs. You must define somewhere what you are defining with
>>that code.
>>For example, nothing could prevent us in this WG from calling one another
>>by our organisation's name. I would call you Quest, etc. But if we do not
>>define it, no one will know whether I refer to you, to your organisation
>>or to its boss, etc.
>>If I am correct (but I did not look in detail), ISO 3166 codes define the
>>names of the countries. ISO 639-3 entries are just codes, and autonyms or
>>English names are attached to them. ISO 15924 is a list and I am not sure
>>anyone knows what a script can exactly be (not clear in the text and in
>>Michael's references to it: usually discussed as a reference to the
>>Unicode script.txt file). All this is not ISO 11179 conformant because a
>>metamodel must be homogeneous and ... clear. ISO 639-4 tries to relate
>>them. This would be a good thing. But we do not know yet if it will, and how.
>>Why is this important? Because it makes it possible to describe what a
>>langtag is about. The charter says that the Draft must follow ISO. If you
>>define something which looks homogeneous (you have not done so yet, but
>>you certainly could) but which is not in tune with ISO, we will sometime,
>>somewhere, have a conflict. This may seem remote and unimportant to you
>>today, just as it was remote and unimportant for Harald to define
>>scripts. Sometimes this will be a big conflict. All the more so as you do
>>not protect yourself in the introduction by specifying a
>>restricted/defined scope, and want to be BCP 47.
>>One of the major discrepancies you face is about "script": you are
>>specifying "langtag" for a multimedia system and use two general
>>attributes (language and country) and a specialised one (script), making
>>yourself incompatible with all the non-written modes. To correct that, it
>>is enough to coin an open description of "script" as the descriptor of
>>the mode, using ISO 15924 when it is a written mode.
>>All this is not complex, but must be done precisely. In an ISO consistent 
>>way, because the charter says so.
>>jfc
>>
>>_______________________________________________
>>Ltru mailing list
>>Ltru@lists.ietf.org
>>https://www1.ietf.org/mailman/listinfo/ltru

