Local Internet Communities support & Internet standard process

JFC (Jefsey) Morfin jefsey at jefsey.com
Tue Dec 28 20:45:55 CET 2004

Dear Doug,
you raised several questions which relate to our difference of vision of 
the intended RFC 3066bis deliverable.
I have tried to address them in one mail.

1. the purpose and consistency of the Internet standard process.

I consider that an RFC is an Internet building block with a precise purpose, 
which must be perfectly consistent with the other RFCs in the four areas of 
usage of network services: technical, sociocultural, economic and 
political. These blocks obey architectural principles which are defined 
in RFC 1958 (ftp://ftp.rfc-editor.org/in-notes/rfc1958.txt). They are of 
two kinds: (a) the general principles which hold true whatever the network 
generation, (b) the principles related to the Internet second network 
generation vision. The central general principle is the principle of 
constant change. The "KIS", "one single way to achieve the same thing" and 
"scalability" principles are of the essence, together with the 
"international" principle: "5.4 Designs should be fully international, with 
support for localization (adaptation to local character sets). In 
particular, there should be a uniform approach to character set tagging for 
information content."

This explains why:

- I do not mind changing a four-year-old undetermined practice as long as 
the transition is supported.
- I do not accept inconsistency between what an RFC says and what a 
current practice does.
- I do not accept limited solutions and want to define the general 
algorithm they should result from.
- I do not support solutions railroaded by one particular historical 
architectural, cultural, economic or political nexus.

I must also indicate that my perspective is that of technical, societal, 
economic and political usage.

2. Draft RFC 3066bis (I refer to -08, as the -09 alluded to is not online) 
intends to "describe the structure, content, construction, and semantics 
of language tags for use in cases where it is desirable to indicate the 
language used in an information object. It also describes how to 
register values for use in language tags and a construct for matching such 
language tags, including user defined extensions for private interchange.".

However, this draft says "Rendering of characters based on the content of a 
language tag is not addressed in this memo". This is an English (ASCII) 
nexus (as if a language used a unique character set). We are therefore left 
with a way to partly tag content, as if a language used a unique character 
set.

Frankly, I do not understand where these tags can be used, except in 
document statistical databases where each tag has its own single field (the 
tag structure does not permit sorting, because the core information has a 
variable length).

The remark is made that the "industry" widely uses this system. I do not 
object to the "industry" using it, but we are talking about Internet 
architecture components here. RFC 1958 says it is advisable to use existing 
external solutions when available; it also calls for one single solution to 
the same problem. RFC 3490 documents IDNA. IANA is to register character 
set tables using language tags as per RFC 3066. Yet RFC 3066bis does not 
want to consider character sets. Either RFC 3490 or RFC 3066 has a bug.

3. Proposed suggestions

3.1. The intent of RFC 3066 is declared to be the fulfillment of 
requirement 5.4 of RFC 1958. If it does not do so, I will submit a draft to 
that end, as we need it.

3.2. Once and for all, we stop using long definitions in establishing the 
-0z numbering (algebraic 0F extended to F) as a basis for the sortable tags 
and protocol names. We also open a global interprotocol 
error/escape/information list that will be multilingualisable. Maybe we 
also establish the difference between international, multilingual and 
vernacular or locale and context, and accept ISO 15924 as the Internet 
script code list.

3.3. The resulting tag must permit, among others, IANA registrations. This 
means identifying ccTLD-approved, language-related character sets. I have 
asked the ccTLDs for their position on being language authoritative 
for the Internet. Responses will take time. As a ccTLD myself, I consider 
that there are pros and cons depending on the language. When the language 
is not controverted or lax, there may be advantages in simplifying many 
aspects. In other cases it involves ccTLD Managers in domains they do not 
want to be involved in, the first one being Industrial Property Protection. 
There is therefore a need for an additional mandatory parameter indicating 
the functional area of authority (DNS, protocols, IP, legal, etc.).

3.4. WTO fair reciprocity rules most probably require every TLD to support 
every language registration and therefore to support DNS subzones in every 
language. Users will obviously want the same ability for the RHS (URL) and 
the LHS (e-mail address left-hand side). This calls for authoritative (for 
which function the tag is authoritative) language (which language) 
community (each TLD) tags. This defines a first level of Internet standard 
process language tags. One can imagine that the name of this tag will 
include a function initial, an ISO 639-2 language, a community code 
(country code or TLD code, to be defined) and (cf. RFC 1958) the script 
code (ISO 15924).

3.5. RFC 3066 is NOT concerned with the application level (such as W3C, 
WebTV, etc.). It is however concerned with network level management. 
Browser information, domain name entries, registration terms and 
conditions, support, etc. are to be multilingual. This means that each 
Internet language tag should be easily sortable (to save space in network 
related functions) using a fixed format: lan (3), cc/TLD (2), function (1), 
ISO 15924 (4, or 3 if numbers are used).
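
The fixed format above can be sketched as follows. This is only an 
illustration under stated assumptions: the field order follows the list in 
3.5, the four-letter form of ISO 15924 is used, and the names FixedTag, 
pack and unpack as well as the function letter "d" (for DNS) are 
hypothetical, not part of any RFC or draft.

```python
from typing import NamedTuple

# Field widths follow the layout in 3.5: language (3), cc/TLD (2),
# function (1), ISO 15924 script (4 letters). The field order and the
# function letter used below are assumptions made for illustration only.
WIDTHS = (3, 2, 1, 4)


class FixedTag(NamedTuple):
    language: str   # ISO 639-2 code, e.g. "fra"
    community: str  # country code or TLD code, e.g. "fr"
    function: str   # one-letter area of authority (hypothetical values)
    script: str     # ISO 15924 four-letter code, e.g. "Latn"


def pack(tag: FixedTag) -> str:
    """Concatenate the fields into a 10-character, lowercased string."""
    for value, width in zip(tag, WIDTHS):
        if len(value) != width:
            raise ValueError(f"{value!r} must be exactly {width} characters")
    return "".join(tag).lower()


def unpack(text: str) -> FixedTag:
    """Split a packed tag back into its fields by column position."""
    if len(text) != sum(WIDTHS):
        raise ValueError("packed tag must be exactly 10 characters")
    fields, start = [], 0
    for width in WIDTHS:
        fields.append(text[start:start + width])
        start += width
    return FixedTag(*fields)


tags = [
    FixedTag("fra", "fr", "d", "Latn"),
    FixedTag("ara", "eg", "d", "Arab"),
    FixedTag("fra", "ca", "d", "Latn"),
]
# Because every field sits at a fixed column, a plain string sort groups
# the tags by language first, then by community.
packed = sorted(pack(t) for t in tags)
print(packed)
```

Since every packed tag has the same length, each field can be recovered by 
position alone, with no separator to parse.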

3.6. The information associated with the tags, and tag cross-referencing, 
should be considered in order to define the most adequate default, storage 
and presentation formats. Among the entries we can consider, there could be:

- authoritative comments, explaining the possible selections made
- usage information
- technicocultural information on man/machine interface
- related glyph
- related sound announcement
- name of the language in the script considered
- authoritative source data (address, mail, web site, etc.)
- default language tag if not supported
- error/service messages matrix for this community, language, script, function
- date
- context information URI: foul words, famous names, directory, anti-spam 
filters, legal rules
- legally approved TLD translation
- etc.
- MD5
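
A few of the entries above could be held in a record such as the one 
sketched below. This is a minimal illustration, not a proposed format: the 
class name TagRecord, the choice of fields, the sample values and the 
reading of the "MD5" entry as a checksum computed over the rest of the 
record are all assumptions.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class TagRecord:
    # Hypothetical subset of the entries listed in 3.6.
    tag: str            # the language tag this record describes
    comments: str       # authoritative comments
    native_name: str    # name of the language in the script considered
    source: str         # authoritative source data
    default_tag: str    # default language tag if not supported
    date: str           # date of the record

    def md5(self) -> str:
        """Return the MD5 hex digest of a canonical JSON form of the record."""
        # Sorting the keys makes the serialization, and so the digest, stable.
        payload = json.dumps(asdict(self), sort_keys=True, ensure_ascii=False)
        return hashlib.md5(payload.encode("utf-8")).hexdigest()


record = TagRecord(
    tag="frafrdlatn",
    comments="illustrative entry only",
    native_name="français",
    source="registry@example.org",
    default_tag="engusdlatn",
    date="2004-12-28",
)
digest = record.md5()

# Any change to an entry yields a different digest, so a stale or altered
# copy of the record can be detected by comparing digests.
changed = replace(record, date="2004-12-29")
print(digest != changed.md5())
```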

3.7. This structure should seamlessly scale to additional communities 
identified by their domain name.

3.8. It should document the semantic web system needed to support all this 
information through CRCs (community reference centers), an update and 
mutual trust based verification system, and its intergovernance principles 
(secretariat, participants, sovereignty, granularity, subsidiarity, IP, 
mutual information, road map, etc.).

3.9. Then it should revisit the various existing standards (like the ETSI 
document I quoted and the ISO list) to determine which links should be 
entered in the Internet language tag structure, so that it could be used as 
a backbone for an initial multilingual Internet review.

