Message-Id: <6.2.1.2.2.20050706001224.05117b90@mail.afrac.org>
Date: Wed, 06 Jul 2005 01:18:59 +0200
To: "Dylan N. Pierce" <dylanpierce@megared.net.mx>, ltru@ietf.org
From: r&d afrac <rd@afrac.org>
Subject: Re: [Ltru] Private Use Tags
In-Reply-To: <42CB03D4.20801@megared.net.mx>
References: <42CB03D4.20801@megared.net.mx>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"; format=flowed
Content-Transfer-Encoding: quoted-printable
Cc: 
Precedence: list
Sender: ltru-bounces@lists.ietf.org
Errors-To: ltru-bounces@lists.ietf.org

Dear Dylan,
this request makes a lot of sense. There are many issues there. Here are
a few.

1. You characterise, in a way, the use of a document. I am not sure this
directly fits with the characterisation of a language, but it does
characterise a relation channel. By that I mean the way the document is
intended to be received, sent or exchanged, and from there classified.
Today the proposed Draft leaves this undefined in at least two ways:

- first paragraph. It says "Human beings on our planet have, past and
present, used a number of languages.  There are many reasons why one would
want to identify the language used when presenting or requesting
information.". One could say that "used" may relate to exchanged,
"presenting" to sent and "requesting" to received, with some variation
because, for example, requesting does not mean that it was received.

- part 2. "The language tag always defines a language as used (which
includes being spoken, written, signed, or otherwise signaled) by
human beings for communication of information to other human beings.
Computer languages such as programming languages are explicitly excluded."

The problem here is that a language is not defined (what is it? how is it
identified? etc.), yet the langtag is normative of that something. The
usage of the proposed langtags can only be subjective (the perception of
the users), and the languages discussed remain rather undefined concepts.
Your proposition creates language _values_ (the instantiation of the
Kentucky press). I am not sure you can really qualify it within the Draft
framework, because it is filtered by a medium (the Kentucky press) and not
by a community of speakers (unless you mean the readers, or the authors,
of the Kentucky press). Please recall that they do not want to accept
man/computer and computer/computer languages. This obviously creates a
classification problem with Star Wars: what is R2-D2 speaking? And Yoda,
who is not a human being? The Japanese Fair of Robotics this year showed
droids interacting in Japanese or in English. There is also a vacuum for
computer-generated texts, alarms, etc. Would you introduce "r" for robots?
But would that not oppose the spirit of the Draft?

2. You want to permit organisations and persons to define their personal
name space (cf. John Klensin's recent Draft on IANA) and define their own
format. This was recently proposed and denied. The first problem you meet
with this is the size of the namespace you need and its structure. You
consider that Microsoft would register "mcrsoft"; why that? Mr. Sungil
Yoon, who owns McrSoft.com, has the right to use it. You will say that
"Microsoft" is longer than 8alpha. Right, but RFC 1766 said that you
cannot change that, and it is to be respected by consensus of this WG.
Obviously you can object to this consensus; I will too, others probably
will too, and then it will no longer be a consensus. But short of that,
only a user owning universal rights (every class, every country) can
claim a tag (like "mercedes" or "cocacola"); otherwise we have conflicts.
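
The length constraint mentioned above can be checked mechanically. Here is
a minimal sketch in Python, assuming only the limit discussed in this
thread (a subtag of at most eight ASCII letters or digits). The helper name
fits_subtag_limit is hypothetical and does not come from any draft.

```python
import re

# Assumption: a subtag is 1 to 8 ASCII letters or digits (the "8alpha"
# limit discussed in this thread). The helper name is hypothetical.
SUBTAG = re.compile(r"[A-Za-z0-9]{1,8}")


def fits_subtag_limit(name: str) -> bool:
    """Return True when `name` could be used as a single subtag."""
    return SUBTAG.fullmatch(name) is not None


for candidate in ("microsoft", "mcrsoft", "mercedes", "cocacola"):
    # Print the candidate, its length, and whether it fits in one subtag.
    print(candidate, len(candidate), fits_subtag_limit(candidate))
```

"microsoft" has nine characters and is rejected, which is why a shortened
form such as "mcrsoft" comes up at all. The check says nothing about who
has the right to a given name.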

I note that RFC 2860 can also create a problem for the Draft. Your naming
part is in essence an ICANN part of the IANA: the Registrar and Examiner
must be designated by the ICANN BoD, and appeals are probably subject to
the GAC. This should be reflected in the Draft. Is this really what you
want?

Another problem you may not have investigated is that Microsoft could have
different branches and needs, for example "microsoft.corp",
"microsoft.us", etc., the languages spoken in its different branches
certainly being different. This is why we have introduced three warnings
you can find on http://rfc3066.org:

- there is no subtag size limit in the x-tags part
- the "." and the ":" are accepted characters, "." introducing a comment or
an additional part and ":" permitting the use of URNs. For us
"x-en.microsoft.us-Latn-de" qualifies the language of a Mr. Gates visiting
Germany.

But please note that although this proposition, initially presented by F.
Charles, cannot conflict with any previous format since it is a private
area, it is not supported by this WG, which is deemed to have opposed it
by consensus (with one or two objections).

3. You consider the notions of referent (e.g. a commonly accepted reading
level, -L-6) and context (knowing about Kentucky cities and life). These
are two levels which are very important to the support of a relation
(together with their dates, as per ISO 11179, to know which version is to
be used). Other referents can be dictionaries, grammars, publishers, etc.
Other contexts can be style, mimics, accents, etc. These notions are most
probably too complex for the Draft; they can be multiplied, and they need
priorities in case of conflict when two referential systems have
different descriptions. Please accept that the Draft only supports one
single mode (script) and has no provision (yet) to support other modes.

All this can/should certainly be supported. But this would call for the
introduction of a general framework for language support (BCP 47) within
the Internet architecture, as a continuation/extension of RFC 3066. In
this case the Draft would be an application of this framework. The sentence
"This document replaces RFC 3066" should then be replaced by "this document
complements RFC 3066": this is a part of the debate over the Charter this
WG consensus does seem to want to engage in.

4. You say you do not consider that using domain names would be adequate,
but you do not document it. This is one of the solutions to support
individual/avatar and context grids. I would therefore be interested in
you documenting your position. This is a point which is hotly debated in
some ISO committees, and belongs to what is qualified as the
"pulverisation" of a user-centric Internet (i.e. its ultimate
granularity). Work currently carried out on coreboxes and OPES (WG-OPES)
goes down to this degree and even below (the individual relation level and
context: the way you speak when you are with a specific person, under some
identified circumstances, e.g. the language you use with a cop who stopped
you on the road).

Thank you for this interesting thinking.
jfc



At 00:04 06/07/2005, Dylan N. Pierce wrote:
>(This is a re-send of an e-mail I originally sent to the authors of a
>previous draft; I have since been educated as to the proper way to comment.)
>
>Dear Mr. Phillips and Mr. Davis,
>
>First, please forgive me if I'm not following proper procedure in
>commenting on this draft; while I do have a strong programmer's interest
>in this standard, I admit that I'm not typically a participant in these
>procedures and haven't thoroughly educated myself on the policies for
>submitting comments.
>
>I would like to recommend an addition to this draft, for which I think I
>can make a rather compelling case based on hypothetical but quite
>reasonable scenarios. Personally, I hope very much that your draft becomes
>a standard, as the problems with a canonical parsing of current RFC 3066
>language tags are well-known and bothersome to developers everywhere. Your
>draft strikes me as an excellent way to finally standardize the practice
>in a way which will be accessible to all developers without having to
>investigate thirty different standards and documents from ten different
>organizations.
>
>Regarding Section 3.4 on extensions and extension namespace: You already
>have here a mechanism in place for extending this specification. I would
>like to suggest an extension which should probably be incorporated into
>the main specification. I believe you should define an "organization
>convention" extension for use by private companies and organizations for
>their own purposes.
>
>I realize that a "private use" extension is already defined in section
>2.2.7. However, I maintain that the private use extension is not
>sufficient for potential development and interdevelopment among important
>organizations, as there is no way a parsing agent could assume anything
>significant about the tags which follow. And yet, the registration of 3.4
>extensions is also insufficient because, frankly, you'll rapidly run out
>of letters if you make a sincere effort to define namespace for private
>companies and organizations.
>
>Let's take a concrete example. Let's say that the American Library
>Association (ALA) decides to define an extension to help them classify
>books by reading level. As your specification stands, they have two
>choices: they can register a 3.4 extension (we'll say they register "L")
>and then use their subtags as follows:
>
>en-US-L-g6: A book written in English as spoken in the United States at
>the sixth-grade reading level.
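
A tag of this shape can be taken apart with very little code. The sketch
below is in Python and rests on a simplifying assumption that is mine, not
the draft's full grammar: any one-character subtag opens an extension, and
after "x" everything that follows is private use. The function name
split_extensions is hypothetical.

```python
def split_extensions(tag: str):
    """Split a tag into its leading subtags and its singleton-introduced parts.

    Assumption: a one-character subtag starts a new extension, except that
    once "x" has been seen every remaining subtag belongs to "x".
    """
    base, extensions, current = [], {}, None
    for subtag in tag.split("-"):
        if len(subtag) == 1 and current != "x":
            # A singleton opens a new extension keyed by its lowercase letter.
            current = subtag.lower()
            extensions[current] = []
        elif current is None:
            # Still in the language/region part of the tag.
            base.append(subtag)
        else:
            extensions[current].append(subtag)
    return base, extensions


print(split_extensions("en-US-L-g6"))      # (['en', 'US'], {'l': ['g6']})
print(split_extensions("en-US-x-ala-g6"))  # (['en', 'US'], {'x': ['ala', 'g6']})
```

The split only shows where the ALA part starts. What "g6" means is still
up to whoever owns the "L" extension.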
>
>The ALA would have excellent reasons for wanting such a tag, as it would
>greatly facilitate the database querying and transfer of material to
>public schools.
>
>However, we see the first problem: the ALA has their tag, which many
>schools would use. Then, Associated Press would want their tag to indicate
>regional assumptions. We'll give them "P" (for "press"):
>
>en-US-P-ky: An article written in English as spoken in the United States
>which assumes readers are already familiar with names, cities, politics,
>etc., in Kentucky. (They would use this to distribute versions to Kentucky
>press where they don't have to explain that Frankfort is the capital,
>distinguishing them from national or international versions which would
>make no such assumption and explicitly specify that Frankfort is the
>capital.)
>
>If we keep up like this, as I mentioned, we'll rapidly run out of
>singleton letters. Everyone will want one, some for valid reasons, others
>for silly reasons, and then your registration authority would be in the
>unenviable position of having to make value judgments regarding what is
>valid and what is silly, given such limited real estate.
>
>Furthermore, you'll be putting the organizations themselves in a difficult
>position. For example, if the ALA decides to modify their convention, this
>is something that is only of interest to them and the people who use their
>specification. However, in order to make their own internal changes, they
>will technically have to go through the entire process of revising a
>stable specification through the registration authority (according to 3.4,
>which requires stability and canonical representation), something which is
>never recommendable.
>
>And finally, parsing agents which have no interest in the ALA's tag (which
>will be most of them) will nonetheless have the burden of checking
>conformance.
>
>If we take the other approach, and say, "We have the 'x' tag for private
>use. The ALA and AP can take that tag and follow it up however they want,"
>then we're creating another problem. All of the parsing agents which do
>have an interest in those tags cannot be guaranteed that they mean what
>they think they mean.
>
>For example, if the ALA decides to go with:
>
>en-US-x-ala-g6
>
>But if the Associated Press subsequently decides that their private tag
>"x-ala" means articles of interest to Alabamans, then what's the ALA to do
>when they want to classify articles written by the AP? The problem is that
>parsing user agents will be unable to assume anything about the tag that
>follows, and once a conflict occurs, both tags become either useless, or
>subject to the type of interpretation that a human might perform easily
>but a machine cannot.
>
>The solution is simply to define an organizational namespace. We take a
>random tag--we'll say "P" for private--and then allow companies and
>organizations to register their own namespace. Everything that follows
>their namespace tag is then interpreted according to their standard,
>whatever that may be. For example, the ALA would register "ala," the AP
>would register "ap," Microsoft would register "mcrsoft," Adobe would
>register "adobe" and so on.
>
>Then, anyone seeing a tag like this:
>
>en-US-P-ala-g6
>
>could know unambiguously that whatever follows the P-ala is to be
>interpreted by the ALA's own convention, whatever that might be. Each
>registering organization could then be responsible for the stability and
>canonical representations of their own namespace without affecting the
>stability of the specification as a whole.
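
The dispatch described above can be sketched as follows. This is only an
illustration in Python: the registry contents, the ALA handler and the
function names are hypothetical, and "P" is the placeholder singleton used
in this message, not an assigned extension.

```python
from typing import Callable, Dict, List, Optional


def ala_reading_level(subtags: List[str]) -> str:
    # Hypothetical ALA convention: "g6" means sixth-grade reading level.
    level = subtags[0]
    if level.startswith("g") and level[1:].isdigit():
        return f"grade {int(level[1:])} reading level"
    return "unknown ALA subtag"


# Hypothetical registry: namespace subtag -> interpreter owned by that
# organization. Only namespaces this agent cares about are listed.
REGISTRY: Dict[str, Callable[[List[str]], str]] = {"ala": ala_reading_level}


def interpret_org_extension(tag: str) -> Optional[str]:
    """Interpret the part after "P-<namespace>", or return None to ignore it."""
    subtags = tag.lower().split("-")
    if "p" not in subtags:
        return None
    rest = subtags[subtags.index("p") + 1:]
    if len(rest) < 2 or rest[0] not in REGISTRY:
        return None  # not a namespace this agent is interested in
    return REGISTRY[rest[0]](rest[1:])


print(interpret_org_extension("en-US-P-ala-g6"))  # grade 6 reading level
print(interpret_org_extension("en-US-P-ap-ky"))   # None
```

An agent with no interest in the AP simply has no "ap" entry and gets None
back, while the meaning of "g6" stays entirely inside the ALA's handler.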
>
>Parsing agents which are not interested in the AP's tags simply know to
>ignore anything after the "P" tag that isn't an organization in which they
>have an interest. Parsing agents that are interested can now know with
>assurance that the information is what they're looking for. Companies and
>organizations can establish their own standards which can easily evolve to
>suit their needs. Private companies can establish compatibility standards
>between themselves which won't affect the specification as a whole.
>
>This could be infinitely extensible merely by setting aside one of the
>organizational tags to mean "check the next set." For example, if the
>American Library Association registers "ala" as above, and then later the
>Association of Libertarians and Anarchists shows up, finds that all the
>mnemonic representations of their name are already used and there's not
>much space left on the registry (and with 36^8 alphanumeric possibilities,
>that's not likely, but let's pretend), they could define their namespace
>as "set2-ala" (assuming we've already decided that "set2" is the tag which
>means "check the next set").
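
One way to read the "check the next set" idea is as an escape subtag that
moves the lookup to another table. The Python sketch below follows that
reading. The tables, the function name and the rule that repeating "set2"
moves one set further each time are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

ESCAPE = "set2"  # assumed escape subtag meaning "check the next set"

# Hypothetical registries, one table per set.
SETS: List[Dict[str, str]] = [
    {"ala": "American Library Association"},
    {"ala": "Association of Libertarians and Anarchists"},
]


def resolve_namespace(subtags: List[str]) -> Tuple[str, List[str]]:
    """Return (organization, remaining subtags) for the subtags following "P"."""
    level = 0
    # Each leading escape subtag moves the lookup one set further.
    while subtags and subtags[0] == ESCAPE:
        level += 1
        subtags = subtags[1:]
    if not subtags or level >= len(SETS) or subtags[0] not in SETS[level]:
        raise KeyError("unregistered namespace")
    return SETS[level][subtags[0]], subtags[1:]


print(resolve_namespace(["ala", "g6"]))
# ('American Library Association', ['g6'])
print(resolve_namespace(["set2", "ala", "x1"]))
# ('Association of Libertarians and Anarchists', ['x1'])
```

The same mnemonic "ala" then resolves to two different organizations
without conflict, at the cost of one extra subtag for the later registrant.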
>
>This allows all companies and organizations which have a need to define
>their own namespaces and then use them as the needs of their particular
>domain indicate in a way that is nonetheless unambiguously established for
>parsing agents which can then make error-free decisions about whether or
>not the information which follows is useful to their needs, all done
>without sacrificing the stability of the main specification.
>
>This is the extent of my speculation on the issue. I did consider the
>possibility of using Java-package-name-like identifiers tied to domain
>registration, so that Microsoft could have the "com-microsoft" tag and the
>ALA could have the "org-ala" tag, but this would end up violating the
>eight-character rule and allow just any yahoo with a website to include
>whatever he sees fit (en-US-com-sexychicks-38D comes to mind), which I
>don't think is a desirable solution at all.
>
>If you have found this comment at all useful, I would appreciate hearing
>back.
>
>Sincerely,
>Dylan N. Pierce
>IT Coordinator, TykeTek
>
>TykeTek/Diapositivas Gloria
>Salvador Quevedo y Zubieta #821 Int. 6
>Col. la Perla
>C.P. 44360 Guadalajara, Jal.
>MEXICO
>
>E-Mail: dylanpierce@megared.net.mx
>Telephone: +52 (33) 3617.3660
>Cellular: +52 (33) 1149.7057
>
>_______________________________________________
>Ltru mailing list
>Ltru@lists.ietf.org
>https://www1.ietf.org/mailman/listinfo/ltru
>

