FW: Language Identifier List Comments, updated

Misha Wolf Misha.Wolf at reuters.com
Thu Dec 30 12:06:36 CET 2004


Martin appears to have sent the mail below to 
www-international at w3.org only.  As some of the 
participants in this debate aren't on that list, 
I'm forwarding his mail.

Misha


-----Original Message-----
From: www-international-request at w3.org
[mailto:www-international-request at w3.org] On Behalf Of Martin Duerst
Sent: 29 December 2004 07:23
To: JFC (Jefsey) Morfin; www-international at w3.org
Subject: Re: Language Identifier List Comments, updated


At 15:18 04/12/27, JFC (Jefsey) Morfin wrote:

 >I gave some thought to all this and reviewed the documents that W3C
also prepares. I am afraid we want to put too many unrelated things
into the same debate, due to a confusion between the three
internationalization, multilingualization and vernacularization layers
which are not identified and documented yet, while some attempt to
discuss what belongs to lingual authoritative sources.

This discussion is about language identifiers for content. And on this
list (www-international at w3.org) in particular, about language
identifiers for Web content.

Language issues for content and language issues for domain name
registrations are quite different.

 >This is only an IETF document,

The document that Tex put up is not an IETF document, just
a Web page put up in the hope of helping people quickly make
a good selection for tagging their Web content
(in my opinion, that Web page still has some way to go
to reach that goal, but that's a separate issue).

 >talking only about network interoperability. It must be consistent
with other RFCs. Other RFCs have defined the Internet language/country
authorities: RFC 3066bis cannot say otherwise.

RFC 3066 and RFC 3066bis don't define language authority. They just
define ways to generate or register tags for existing languages.

And I am not aware of an RFC (as opposed to an ICANN document) that
defines language authority. (I may have missed one.)

 >As for naming, languages are chosen and documented by the local
internet communities, represented by their Trustees, the ccTLD Managers
(the SLD Manager for privately defined tags).

No, what some ccTLDs are doing is just to document the set of characters
that they accept for a given language. Some ccTLDs (such as .de and .ch)
have carefully avoided doing even that; the set of characters they
accept for IDNs is mostly based on system considerations. (The reason
they have done that may also to some extent be because they don't
think that language is or should be a major determinant for domain
name registry operation; I would agree that script is much more
important).

 >Just as IANA is not in the business of defining countries (RFC 1591),
IANA is not in the business of defining the languages of the countries.

Neither are ccTLDs. In many countries, they would get into
problems if they tried to do that. Language is much more
than just a set of characters.


 >All that an _RFC_ can say is that language tags identify the IDNA
Tables published by the ccTLD Manager, as the Trustee of his local
internet community (we talk of the language used by network/protocol
related issues). Or by the SLD Managers for their domain. I certainly
favor that Unicode, locales, contexts, etc. converge, but that first
raises many, many more multilingual Internet related issues that
RFC 3066bis does not want to discuss.

RFC 3066 and 3066bis codes may be used for labeling sets of characters
used in the domain name system. But compared with their use for labeling
content, for requesting content, and so on, such a use is extremely
marginal (there are currently maybe a few dozen such tables, but there
are millions and millions of Web pages, for example).

 >I fully understand that most of the ccTLD Managers have not published
language tables and that applications other than DNS call for immediate
support, and also that SLD Managers may need off-the-shelf tables.
However, this support by non-ccTLD Managers can only be temporary and
MUST eventually be consistent with the ccTLD Manager tables such an RFC
should call for. Otherwise we have a real layer and authority
violation, all the more so since this follows not only from RFC 1591
and ICANN ICP-1 but also from the WSIS 2003 Resolutions underlining the
sovereignty of governments over ccTLDs. There is no problem in
documenting the duties of a ccTLD Manager in this area and in
discussing it with ccTLD Managers, as an addition to the ccTLD Manager
BPs.

Again, this is not about 'language tables' for IDN.

 >I would therefore review the ABNF in four areas:
 >- favoring the three-letter codes for the language to make this entry
time independent and consistent (this does not change anything in the
current applications)

No, this would change a lot, because most Web content out there
currently uses two-letter codes. Also, RFC 3066, for good reasons,
prefers two-letter codes where available.
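
To make the two-letter preference concrete, here is a minimal sketch
of the rule as a tagging tool might apply it. The mapping table is a
tiny illustrative subset of the ISO 639 code lists, and the function
name preferred_language_subtag is hypothetical, not taken from any
RFC or library.

```python
# Illustrative subset only: three-letter ISO 639-2 codes that also have
# a two-letter ISO 639-1 equivalent. A real tool would load the full lists.
ISO_639_2_TO_1 = {
    "eng": "en",
    "fra": "fr",
    "deu": "de",
    "jpn": "ja",
}


def preferred_language_subtag(code: str) -> str:
    """Return the two-letter code when one exists, else the code as given.

    Hypothetical helper: it only normalises case and consults the small
    table above; it does not validate that the code is a real language.
    """
    code = code.lower()
    return ISO_639_2_TO_1.get(code, code)


if __name__ == "__main__":
    # "haw" (Hawaiian) has no two-letter code, so it is left unchanged.
    for candidate in ("eng", "fra", "haw", "en"):
        print(candidate, "->", preferred_language_subtag(candidate))
```

Under this rule, content already tagged "en" or "fr" stays as it is,
while a switch to three-letter codes would mean retagging it.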


Regards,    Martin. 







