IETF language tags list

JFC (Jefsey) Morfin jefsey at
Tue Jun 14 00:35:42 CEST 2005

At 18:50 13/06/2005, Michael Everson wrote:
>For the record, I consider the attack about "layer violation" to be yet 
>another example of the venom I referred to earlier.

I know you are not a programmer, so I understand that you do not see what 
this really means. It is the same as saying that the Internet is an 
unreliable technology: it is just a description. But since you feel hurt 
and are yourself hurting, I will explain. Sorry if it is technical; after 
all, this is the real core of our disagreement, so maybe detailing it in 
that depth will help.

When you consider a language in ISO 639-1, 2, 3 (what this list works on), 
it is a concept (you and I both understand what it is; a computer does 
not). That concept can be documented, for explanation, by one or two 
references (books, articles, etc.). Again, this is what this list does.

Now, when you start considering a substantial number of books (as advised 
to Karen), the purpose is to verify whether there are several 
instantiations of the language, whether it is a dialect, etc. (please let 
us not confuse this with the application "I had a dream", quoted by Mark, 
of Luther King's instantiation of the American language).

When you directly relate concepts with values, you have by nature a layer 
violation (as if I entered "Michael Family-Name" in a database). It may not 
be very apparent when you say "one French in a Latin script from France", 
but if you think "French" instead of "one French" it is a layer violation, 
and sooner or later the attached relations will openly conflict. This is 
the flaw in generalizing the debate on this list. Let me detail:

There is a difference of nature between English in ISO 639-X (concept) and 
English in ISO 639-Y (value) if the ISO 639-4 guidelines that are retained 
are similar to the ones we retained in complying with ISO 11179 and ISO 
12620. Why? Because there may be many instantiations of English in ISO 
639-X, but what is taken as English in ISO 639-Y might be whatever has been 
positively retained by a filter saying "if there are more than % 'the' 
tokens and % 'and' tokens, etc., this is English". As long as the ISO 639-4 
guidelines are not finalised, we frankly do not know where we are going.
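
To make the "filter" idea concrete, here is a minimal sketch in Python of 
such a token-share test. It is only an illustration, not anything taken 
from ISO 639-4: the function name `looks_like_english` and its two 
threshold parameters are hypothetical, and the thresholds are deliberately 
left for the caller to supply, since no actual percentages are given above.

```python
import re


def looks_like_english(text, min_the_share, min_and_share):
    """Return True if the shares of 'the' and 'and' tokens exceed the thresholds.

    The thresholds are fractions between 0 and 1. They are hypothetical
    parameters: no real values are specified in the discussion above.
    """
    # Crude tokenisation: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return False
    the_share = tokens.count("the") / len(tokens)
    and_share = tokens.count("and") / len(tokens)
    return the_share > min_the_share and and_share > min_and_share


# Illustrative thresholds only, chosen arbitrarily for the demonstration.
sample = "the cat and the dog and the bird"
print(looks_like_english(sample, 0.05, 0.02))
print(looks_like_english("le chat et le chien", 0.05, 0.02))
```

Whatever passes such a test is "English" as a value produced by the 
filter, which is not the same thing as English as a concept.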

This means that "gsw" as a future ISO 639-3 code and "gsw" as registered 
today in ISO 639-2 may turn out to be different by nature. This is what I 
expressed in my mail telling Peter to give a degree of liberty to the 
installation of the language (referent) and to the user's usage (style). 
This degree of liberty is a different level (as in the DNS) and possibly a 
different layer (because the preceding level is metadata to the next one).

This may look silly, but if you do not conduct that analysis carefully you 
get yourself trapped into very complex situations. The Internet technology 
is made of many layer violations due to its current use of default 
architecture parameters (one single name system, one single addressing, one 
single IANA, one single class, one single character space, etc.). This 
hides them. The complexity of a Multilingual Internet broadly lies in this. 
Just consider classes. One of the reasons I am tough on langtags is that 
langtags should form Multilingual Internet classes: the DNS can support 
65,000+ classes, so there is room, but not enough for all the langtags this 
list _could_ register. There will also probably be many demands for 
non-lingual classes (security, priorities, public services, corporate, 
cultures, family protection, etc.). We are going to negotiate class 
allocation: this will be up to this list and we will have to take into 
account what has been registered. A possible nightmare.
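
The size of the class space follows from the DNS wire format: the CLASS 
field of a question or resource record is 16 bits wide (RFC 1035), so at 
most 65,536 values exist, and some of them are reserved. The sketch below 
only illustrates that upper bound; the helper names `pack_question_tail` 
and `classes_suffice` are hypothetical, not part of any DNS library.

```python
import struct

CLASS_FIELD_BITS = 16                 # width of the DNS CLASS field (RFC 1035)
MAX_CLASSES = 2 ** CLASS_FIELD_BITS   # upper bound, before reserved values


def pack_question_tail(qtype, qclass):
    """Pack the 16-bit QTYPE and QCLASS fields that end a DNS question entry."""
    # "!HH" = network byte order, two unsigned 16-bit integers.
    return struct.pack("!HH", qtype, qclass)


def classes_suffice(number_of_tags):
    """True if one class per tag would fit in the 16-bit class space."""
    return number_of_tags <= MAX_CLASSES


print(MAX_CLASSES)
print(pack_question_tail(1, 1))       # type A, class IN
print(classes_suffice(100_000))       # a hypothetical count of langtags
```

A class number that does not fit in 16 bits cannot be encoded at all 
(`struct.pack` raises `struct.error`), which is why allocation would have 
to be negotiated rather than granted one class per registered tag.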

Let us say I register "i-gsw". You will probably have to agree. Now I will 
ask you for its registration number; you will be puzzled, but I will 
eventually get one from IANA. I will then write a BCP informing the 
Internet community that I will run a Swiss German externet based upon that 
class number, to avoid conflicts with the Chinese externets that CNNIC 
could start using the tags you registered for Mike (who will be worried, as 
he may lose a bigger opportunity than he thought). Anyway, I would have 
carried out everything according to every RFC, present and proposed. So 
would have others. Yet we would have created a mess, because the underlying 
concepts are in layer violation. No one has considered classes ... yet. 
Except John Klensin (but only one class for all the langtags ...) and ICANN 
four years ago, and Bob Tréhin and Joe Rinde when they established the 
first international system, which OSI copied as CUGs and which we are 
working on now.

I know this is complex. But it is not hurting, it is not "venom"; it is the 
very core of this list's mission and the reason why I say that the list 
should be presented on the IANA site with the name and the exposure 
resulting from RFC 3066, and with a network architect to advise you (not 
from the IESG, as for the time being they have totally overlooked the 
problem). To analyse this better you can read

