language tag structure

Mon Jan 17 13:45:39 CET 2005

At 04:59 17/01/2005, John Cowan wrote:
>JFC (Jefsey) Morfin scripsit:
> >     - the authoritative source/reference
>
>What is the purpose of this "authoritative source"?  The RFC 1766 tradition
>uses authorities in order to clearly discriminate one language from another,
>and to make it clear, in the case of multiple languages known by the same
>name, which one is meant.  But those sources are provided in registrations,
>they are not encoded in tags.

Dear John,
The fact that you can accommodate this lack in the tag does not mean that 
others can. I will document that when responding to the M$ issue.

>Your other four components, language, script (not "scripting", please),

Thank you for the "scripting" correction. However, I have a question about 
this. I want to indicate the way the text is written, as in 
"handwriting". I feel that one of the problems of this list (well defined 
in BPC 025 § 2.3) is that it takes an internal view of the problem at hand 
and not a view of its global external impact. Here I am not considering 
only the ISO scripts list, but the way networked life will really treat 
scripts as vernacular vehicles, including barcodes, RFIDs, voice, menus, 
scanned handwriting, etc.

Does "script" cover all this? Thank you.

>geographical location, and variant, are provided for informally in RFC 3066
>and are formalized in RFC 3066bis.

The same problem again. RFC 3066bis does not exist yet and will likely 
never exist. Or it will exist and will be used by the people who drafted it 
but not by those who should use it. I am one of them and I cannot use it. 
This leads to confusion and delays. The reason is that the need is _wider_ 
than what it proposes. So there is no problem in including support for it 
in a response to the real needs. This way, you will get the solution you 
want without hurting others' needs.

> >     - the authoritative source/reference is Microsoft (and they miss a
> > _lot_ of words)
>
>Microsoft is not an authoritative source of language definition.
>They are a company that sells word-processing software.

The same problem again. Word is a language vehicle. French letters written 
with the same initial intent on different word processors, with different 
spelling dictionaries and different grammar checkers, will come out 
different. This is no problem for you, as your typical need is to tell 
the reader in which language a document is written. You suppose that the 
reader's understanding will be a superset of the language version used in 
the document: the reader knows more words than are present in the text, and 
he can use his intelligence to work out the meaning of the ones he does not 
know, or he can use a dictionary.

Now, in a computer network this is not the same. If I have three web 
services built on Apple, UNIX and M$ technologies, they will not speak 
the same French. They will therefore not be 100% compatible. Even two web 
services built on different versions of the same technology will not be 
100% compatible. They need to use or refer to the same language reference 
file, identified by its tag.
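
As a rough illustration of that point, here is a minimal sketch in Python. 
Everything in it is an assumption made for the example: the LanguageTag 
class, its "reference" field, the same_reference function and the 
"dict-a" / "dict-b" values are hypothetical names, not part of RFC 3066 
or of any existing registry.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LanguageTag:
    """Hypothetical tag layout: the usual components plus a reference."""
    language: str
    script: Optional[str] = None
    region: Optional[str] = None
    variant: Optional[str] = None
    # Assumed extra component: identifier of the authoritative
    # source/reference (dictionary, grammar) the service relies on.
    reference: Optional[str] = None


def same_reference(a: LanguageTag, b: LanguageTag) -> bool:
    """True only if both tags match and both name the same reference."""
    return a.reference is not None and a == b


# Two services that both claim "French as used in France" but rely on
# different (hypothetical) reference files.
service_a = LanguageTag("fr", region="FR", reference="dict-a")
service_b = LanguageTag("fr", region="FR", reference="dict-b")

print(same_reference(service_a, service_b))
```

Without the reference component the two tags would compare equal, which 
is exactly the case where the services would wrongly assume they speak 
the same French.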

I will take an example. There is a major linguistic change in France (not 
in French) in the way to address a female civil servant. Up to now the 
function word was masculine and the title indicated the sex, e.g. Madame le 
Ministre. The former government decided that the function would be feminine 
if held by a woman, e.g. Madame la Ministre. This led to a lot of 
controversy, because the change could/should have been Madame la 
Ministresse. In some cases this started being applied, with a firewoman 
being named "Madame la pompière ...". This became a political issue, with 
dictionaries taking sides or not.

When I contact a web service or purchase a dictionary on a CD, I need to 
know its reference through its tag. I will use this tag to go to the 
language authority's site and decide whether I want to use its linguistic 
options.

This is a very politically sensitive example. There are thousands of them 
in every language, for example language for kids versus adults. My job is 
to organize the storing and the retrieval of their authoritative 
references. My users will use my system for their applications and 
exchanges. If there is a single root for both tagging systems, there will 
be a limited number of language tag variants. If there is no common root, 
there will be more systems built by clever people, and confusion.

This is your decision. In any case, since I cannot use your system, my 
system will exist. Up to you to decide if we join forces.

> > 5. ISO 7000 oriented ICONs
>
>There will never be enough icons even to represent the 7000 languages of the
>world, never mind their subdivisions.  The best indicator of a language is
>usually the name of the language in that very language, as English or Deutsch.

I have just documented how there can be, and you partly repeat it. 
Standardization is not about creating brilliant new ideas; it is just 
trying to stabilize what people would intuitively do or accept.

> >     Martin, I have carefully read your IRI draft (10.txt ?) several times.
> > I am not sure I understand everything. This is certainly due to my low IQ.
>
>And Pilate asked [Jesus], Art thou the King of the Jews? And he
>answering said unto them, Thou sayest it.

And Jesus also said we are all brothers and sisters. And if you want to 
lead others, be their servant. Being the servant of others is the glory of 
the standardizers.

This is why I ask for your input. From what you responded, I see that you 
see no flaw in my position from your point of view, only that you do not 
understand the relevance to your case of what I need in mine. Since my 
position on these points is that contextual defaults are to be permitted, 
I understand that you have no opposition.

Thank you.
jfc