(ieft) IDN security violation? Please comment
JFC (Jefsey) Morfin
jefsey at jefsey.com
Sun Feb 13 00:09:34 CET 2005
Part of a mail sent to the IETF list and copied to the
ietf-languages at alvestrand.no list, which RFC 3066 defines as the mailing
list where IANA language tags are to be reviewed before Mr. Michael Everson
(designated by the IESG) proposes them to the IANA for registration. Address for
joining the list or consulting its archives
Maybe some analysis can help structure the debate. Lingual digital relations are
supported through three layers: (1) computer interoperability, (2) human
interintelligibility, (3) human interface.
- at layer 2, relations are brain to brain and support interintelligibility
by using written languages. The scripts of these languages are supported
through the Unicode system and are to be tagged for computer recognition.
- at layer 1, relations are end to end and support interoperability by
using protocols with various digital, hexadecimal, 7-bit or 8-bit codings
and parameter systems registered with the IANA.
One of these protocols is the DNS, which uses a "-.0Z" numbering plan within
the 7-bit area. Its human use is simplified by reference to the universally
used Arabic digits 0-9 and the internationally used Roman letters A-Z.
This also permits easy bridging with other plans restricted to 0-9, O-B,
or 0-F and the direct support of numeric telephone names. It has a direct
total or partial mnemonic capacity for people using English, Latin or
Latin-scripted languages.
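As an illustration of the "-.0Z" plan, here is a minimal sketch that checks whether a name stays inside the hyphen, dot, digit and letter repertoire. The function name is_ldh_name is hypothetical, and the 63-character label limit and the rule against leading or trailing hyphens are the usual DNS host name conventions, not something stated in this mail.

```python
import re

# One DNS label: letters, digits and hyphens only, 1 to 63 characters,
# not starting or ending with a hyphen (assumed host name conventions).
_LDH_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_ldh_name(name: str) -> bool:
    """Return True if every dot-separated label fits the "-.0Z" plan."""
    labels = name.rstrip(".").split(".")
    return all(_LDH_LABEL.fullmatch(label) is not None for label in labels)


if __name__ == "__main__":
    for candidate in ("example.com", "800-555.example", "bücher.example"):
        print(candidate, is_ldh_name(candidate))
```

A purely numeric label passes the check, which is what makes the bridging with telephone-style numeric names direct.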
Internationalization, at the end-to-end layer, makes it possible (through
punycode in the DNS case; nothing is defined for the email LHS) to support
multilingualization at the brain-to-brain layer and to provide the same
mnemonic capacity to people using other languages. Vernacularization is the
process that lets human interfaces and application processes take full
advantage of multilingualization, in use cases ranging from language menus
or combo boxes to full IRI support.
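For the DNS case, the punycode step can be sketched with Python's built-in "idna" codec, which implements the IDNA 2003 conversion (RFC 3490, with punycode from RFC 3492). The helper names to_ace and from_ace are hypothetical; this only illustrates the encoding, not any registry policy.

```python
def to_ace(name: str) -> str:
    """Convert a Unicode domain name to its ASCII-compatible form."""
    # The built-in "idna" codec applies nameprep and punycode per label.
    return name.encode("idna").decode("ascii")


def from_ace(name: str) -> str:
    """Convert an ASCII-compatible name back to Unicode for display."""
    return name.encode("ascii").decode("idna")


if __name__ == "__main__":
    for name in ("example.com", "bücher.example"):
        ace = to_ace(name)
        print(name, "->", ace, "->", from_ace(ace))
```

A pure ASCII name comes out unchanged, while a label with a non-ASCII character gets the "xn--" prefix. That is the transparency for English users referred to in this mail.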
A common problem is to overlook the multilingualization layer because it is
transparent in English (an ASCII string is not affected by punycode). This
layer violation creates the security violation under discussion. Here the
layer violation is Verisign's disregard of the ICANN requirements (at the
multilingualization layer) that IDNs be registered using codes
from a single language Table.
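A rough sketch of what a single-Table check could look like follows. It is only an approximation under stated assumptions: a real IDN language Table lists the exact code points allowed for one language, whereas this sketch merely groups letters by the first word of their Unicode character name (LATIN, CYRILLIC, GREEK, ...). The function names are hypothetical.

```python
import unicodedata


def scripts_in_label(label: str) -> set[str]:
    """Collect a coarse script name for every letter in the label."""
    scripts = set()
    for ch in label:
        if ch.isalpha():  # digits and hyphens are shared by all tables
            # Assumption: the first word of the Unicode name stands in
            # for the script, e.g. "CYRILLIC SMALL LETTER A" -> "CYRILLIC".
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def looks_single_table(label: str) -> bool:
    """Return True if all letters appear to come from one script."""
    return len(scripts_in_label(label)) <= 1


if __name__ == "__main__":
    spoof = "p\u0430ypal"  # second letter is CYRILLIC SMALL LETTER A
    print(sorted(scripts_in_label(spoof)), looks_single_table(spoof))
    print(sorted(scripts_in_label("paypal")), looks_single_table("paypal"))
```

The two labels display almost identically, yet only the second draws on a single repertoire. A check of this kind at registration time is what the single-Table requirement implies.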
This common oversight of the multilingualization layer is aggravated by the
proposal of a unique internationalization-layer langtag (independent
from IDN language Tables) used where it does not belong: to describe all the
vernacular views of a language.
IMHO, a correct general approach to multilingualism on the Internet
consists in structurally acknowledging the three layers, so that users can
be told clearly which exact context they are in. This should be
based upon a five-constructor language tag (lang5tag):
- three internationalization layer descriptors. They are used to register
the IDN Tables: the language, the script and the domain of use. RFC 3066
defines the use of ISO 639 codes for the language. RFC 3066bis proposes
to use the codes of ISO 3166 for national domains and ISO 15924 for the
scripts. This is a basically correct proposition; there are more general and
more precise sources if needed.
- a multilingualization layer descriptor: the authoritative reference for
the considered view of the language.
- a vernacularization layer descriptor: the style, that is the environment
of the considered application (protocol, administrative, familial, formal,
commercial, SMS, adult, etc.)
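The five descriptors above can be sketched as a small data structure. Everything here is an assumption made for illustration: the class name Lang5Tag, the field names, the length checks and the example values are hypothetical, and no serialized syntax for the lang5tag is defined in this mail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lang5Tag:
    language: str   # internationalization layer: ISO 639 code
    script: str     # internationalization layer: ISO 15924 code
    domain: str     # internationalization layer: ISO 3166 code
    reference: str  # multilingualization layer: authoritative reference
    style: str      # vernacularization layer: environment of use

    def __post_init__(self):
        # Shape checks only (assumed code lengths), not registry lookups.
        if not (self.language.isalpha() and len(self.language) in (2, 3)):
            raise ValueError("language should look like an ISO 639 code")
        if not (self.script.isalpha() and len(self.script) == 4):
            raise ValueError("script should look like an ISO 15924 code")
        if not (self.domain.isalpha() and len(self.domain) == 2):
            raise ValueError("domain should look like an ISO 3166 code")

    def by_layer(self) -> dict:
        """Group the five constructors by the layer they describe."""
        return {
            "internationalization": (self.language, self.script, self.domain),
            "multilingualization": self.reference,
            "vernacularization": self.style,
        }


if __name__ == "__main__":
    # Hypothetical values: a Chinese message typed on a mobile phone.
    tag = Lang5Tag("zh", "Hans", "CN", "example-authority", "SMS")
    print(tag.by_layer())
```

Keeping the three layers visible in the structure is the point: an application can show the style without touching the descriptors used to register an IDN Table.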
This lang5tag should be part of the IRI description, and supported by an
icon shown in the browser bar. An example: if you send a mail that your
boss's secretary will print and put in his daily folder, you may want him
to know you sent it from a Chinese mobile rather than from your English word
processor. An ISO 7000 conformant glyph system can probably be designed.