Local Internet Communities support & Internet standard process
JFC (Jefsey) Morfin
jefsey at jefsey.com
Tue Dec 28 20:45:55 CET 2004
You raised several questions which relate to our difference of vision of the
intended RFC 3066bis deliverable.
I have tried to address them in one mail.
1. the purpose and consistency of the Internet standard process.
I consider that an RFC is an Internet building block with a precise purpose,
which must be perfectly consistent with the other RFCs in the four technical,
societocultural, economic and political areas of usage of network
services. These blocks obey architectural principles which are defined
in RFC 1958 (ftp://ftp.rfc-editor.org/in-notes/rfc1958.txt). They are of
two kinds: (a) the general principles which are true whatever the network
generation, (b) the principles related to the Internet second network
generation vision. The central general principle is the principle of
constant change. The "KIS", "one single way to achieve the same thing" and
"scalability" principles are of the essence together with the
"international" principle: "5.4 Designs should be fully international, with
support for localization (adaptation to local character sets). In
particular, there should be a uniform approach to character set tagging for
This addresses the reasons why:
- I do not mind changing a four-year-old undetermined practice as long as
transition is supported.
- I do not accept inconsistency between what an RFC says and what
current practice does.
- I do not accept limited solutions and want to define the general
algorithm they should result from.
- I do not support solutions railroaded by one particular historical
architectural, cultural, economical or political nexus.
I must also indicate that my perspective is the technical, societal,
economical and political usage.
2. Draft RFC 3066bis (I refer to -08, as the -09 alluded to is not on-line)
intends to "describe the structure, content, construction, and semantics
of language tags for use in cases where it is desirable to indicate the
language used in an information object. It also describes how to
register values for use in language tags and a construct for matching such
language tags, including user defined extensions for private interchange.".
However this draft says "Rendering of characters based on the content of a
language tag is not addressed in this memo". This is an English (ASCII)
nexus (as if a language used a unique character set). We are therefore left
with a way to partly tag a content as if a language used a unique character
Frankly, I do not understand where these tags can be used, except in
document statistical bases where each tag has its own single field (the tag
structure does not permit sorting, due to variable-length core information).
The remark is made that the "industry" widely uses this system. I do not
object to the "industry" using it, but we are talking about Internet
architecture components here. RFC 1958 says it is advisable to use existing
external solutions when available; it also says there should be one single
solution for the same problem. RFC 3490 documents IDNA. IANA is to register
character set tables using language tags as per RFC 3066. Yet RFC 3066bis
does not want to consider character sets. Either RFC 3490 or RFC 3066 has a
bug.
3. Proposed suggestions
3.1. The intent of RFC 3066 is declared to be to fulfill the 5.4 requirement
of RFC 1958. If it does not do it, I will submit a draft to that end as we
3.2. Once and for all, we stop using long definitions in establishing the -0z
numbering (algebraic 0F extended to F) as a basis for the sortable tags and
protocol names. We also open a global interprotocol
error/escape/information list that will be multilingualisable. Maybe we
also establish the difference between international, multilingual and
vernacular or locale and context, and accept ISO 15924 as the Internet
script code list.
3.3. The resulting tag must permit, among others, IANA registrations. This
means identifying ccTLD-approved language-related character sets. I have
asked the ccTLDs for their position on being language authoritative
for the Internet. Responses will take time. As a ccTLD myself, I consider
that there are pros and cons depending on the language. When the language
is not controverted or lax there may be advantages in simplifying many
aspects. In other cases it involves ccTLD Managers in domains they do not
want to be involved in, the first one being Industrial Property Protection.
There is therefore a need for an additional mandatory parameter indicating
the functional area of authority (DNS, protocols, IP, legal, etc.).
3.4. WTO fair reciprocity rules most probably require every TLD to support
every language registration and therefore to support DNS subzones in every
language. Users will obviously want the same ability for RHS (URL) and LHS
(e-mail address left hand side). This calls for authoritative (for which
function this tag is authoritative) language (which language) community
(each TLD) tags. This defines a first level of Internet standard process
language tags. One can imagine that the name of this tag will include a
function initial, an ISO 639-2 language, a community code (country code or
TLD code to define) and (cf. RFC 1958) the script code (ISO 15924).
3.5. RFC 3066 is NOT concerned with the application level (such as W3C,
WebTV, etc.). It is however concerned with network level management.
Browser information, domain name entries, registration terms and
conditions, support, etc. are to be multilingual. This means that each
Internet language tag should be easily sortable (for network-related
functions, saving space) using a fixed format: lan (3), cc/TLD (2),
function (1), ISO 15924 (4, or 3 if numbers are used).
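
To make the fixed format above concrete, here is a minimal sketch in Python.
It is only an illustration under assumptions of mine: the class name
FixedTag, the decode helper, the lower-casing, and the function letter "d"
(standing for DNS) are all hypothetical, and only the 4-letter form of the
ISO 15924 code is handled, not the 3-digit numeric form.

```python
from dataclasses import dataclass

# Field widths taken from the fixed format suggested in 3.5:
# language (3), cc/TLD (2), function (1), ISO 15924 script (4).
WIDTHS = {"language": 3, "community": 2, "function": 1, "script": 4}
TOTAL_WIDTH = sum(WIDTHS.values())  # 10 characters


@dataclass(frozen=True)
class FixedTag:
    language: str   # ISO 639-2 code, e.g. "fra"
    community: str  # country code or TLD, e.g. "fr"
    function: str   # hypothetical one-letter function initial, e.g. "d" for DNS
    script: str     # ISO 15924 four-letter code, e.g. "Latn"

    def __post_init__(self):
        # Reject any field that is not exactly the expected width.
        for name, width in WIDTHS.items():
            value = getattr(self, name)
            if len(value) != width or not (value.isascii() and value.isalnum()):
                raise ValueError(f"{name} must be {width} ASCII letters/digits: {value!r}")

    def encode(self) -> str:
        # Concatenate the fields in a fixed order; lower-casing is an
        # assumption made here so that plain string sorting is consistent.
        return (self.language + self.community + self.function + self.script).lower()


def decode(text: str) -> FixedTag:
    # Fixed widths mean each field sits at a known offset.
    if len(text) != TOTAL_WIDTH:
        raise ValueError(f"expected {TOTAL_WIDTH} characters, got {len(text)}")
    return FixedTag(text[0:3], text[3:5], text[5:6], text[6:10])


print(FixedTag("fra", "fr", "d", "Latn").encode())  # frafrdlatn
```

Because every encoded tag has the same length and field offsets, a plain
string sort groups the tags by language first, then community, then function.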
3.6. Information associated with the tags and tag cross referencing should
be considered to define the most adequate default, storing and presentation
formats. Among entries we can consider, there could be:
- authoritative comments, explaining the possible selections made
- usage information
- technicocultural information on man/machine interface
- related glyph
- related sound announcement
- name of the language in the considered scripting
- authoritative source data (address, mail, web site, etc.)
- default language tag if not supported
- error/service messages matrix for this community, language, script, function
- context information URI: foul words, famous names, directory, anti-spam
filters, legal rules
- legally approved TLD translation
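
As a rough sketch of how the entries listed above might be stored, one record
per tag could look like the following Python. All field names, the TagRecord
class, the resolve helper and the example tag strings are hypothetical; the
tags simply follow the fixed format suggested in 3.5. The helper illustrates
the "default language tag if not supported" entry by following the chain of
defaults.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class TagRecord:
    tag: str                                   # the tag this record describes
    comments: str = ""                         # authoritative comments
    usage: str = ""                            # usage information
    interface_notes: str = ""                  # man/machine interface information
    glyph: Optional[str] = None                # related glyph
    sound_uri: Optional[str] = None            # related sound announcement
    native_name: str = ""                      # name of the language in its script
    source: Dict[str, str] = field(default_factory=dict)    # address, mail, web site
    default_tag: Optional[str] = None          # default language tag if not supported
    messages: Dict[str, str] = field(default_factory=dict)  # error/service messages
    context_uris: List[str] = field(default_factory=list)   # context information URIs
    tld_translation: Optional[str] = None      # legally approved TLD translation


def resolve(registry: Dict[str, TagRecord], tag: str, supported: Set[str]) -> Optional[str]:
    """Return the first supported tag reached by following default_tag links."""
    seen: Set[str] = set()
    current: Optional[str] = tag
    while current is not None and current not in seen:
        if current in supported:
            return current
        seen.add(current)  # guards against a cycle of defaults
        record = registry.get(current)
        current = record.default_tag if record else None
    return None


# Hypothetical registry: a Breton tag falling back to a French one.
registry = {
    "brefrdlatn": TagRecord(tag="brefrdlatn", default_tag="frafrdlatn"),
    "frafrdlatn": TagRecord(tag="frafrdlatn"),
}
print(resolve(registry, "brefrdlatn", {"frafrdlatn"}))  # frafrdlatn
```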
3.7. this structure should seamlessly scale to additional communities
identified by their domain name.
3.8. it should document the semantic web system to support all this
information through CRC (community reference centers), an update and
mutual trust based verification system and its intergovernance principles
(secretariat, participants, sovereignty, granularity, subsidiarity, IP,
mutual information, road map, etc.)
3.9. Then it should revisit the various existing standards (like the ETSI
document I quoted and ISO list) to determine which links should be entered
in the Internet language tag structure so it could be used as a backbone
for a multilingual Internet initial review.