IDNA 2008 network neutrality, security, operations and equal treatment

Sun Dec 20 20:26:44 CET 2009

Dear Lisa,

It took france at large a few days to consider all of your attached mails.
We found that they raise three fundamental problems that have not been
discussed yet but that now need to be addressed prior to any publication of
the IDNA2008 documentation set.

These problems are the network neutrality, security, and operational
independence that are to be enhanced, or at least respected, when introducing
IDNA2008. Although IDNA2008 is itself Internet neutral, it can be tested,
introduced, and supported in a non-neutral manner. IDNA2008 is both a
protocol and an architecture. The protocol security aspects are covered in
the current security section, but the architectural security aspects are not
discussed.

Moreover:

- IDNA2008 represents substantial technical progress of Unicode
globalization towards a Multilingual Internet (shown, for example, by its
independence from any specific Unicode version and by reduced mapping at the
protocol level). This means better support for specific namespaces that
currently are operated, can be operated, or are to be built outside the area
of ICANN's governance (as prepared by the ICANN ICP-3 document, which
france at large has applied through a two-year community test). This includes
in particular non-IDNA presentations and classes other than "IN".

- RFC 3935 assigns the IETF responsibility for a much broader
Internet than the ICANN footprint, i.e.  "A large, heterogeneous collection
of interconnected systems that can be used for communication of many
different types between any interested parties connected to it.  The term
includes both the "core Internet" (ISP networks) and "edge Internet"
(corporate and private networks, often connected via firewalls, NAT
boxes, application layer gateways and similar devices).  The Internet is a
truly global network, reaching into just about every country in the world.
The IETF community wants the Internet to succeed because we believe that the
existence of the Internet, and its influence on economics, communication,
and education, will help us to build a better human society."

  We do not have the time, technical capacity, or experimental background
now to tackle this issue at an IETF standardization level. However, the IETF
MUST make it clear that, if the resolution of IDNA operational and
architectural issues is delayed and possibly delegated to ICANN's (*) IDN
Guidelines Committee (cf. the WG Chair's mail of 19 December 2009), or is
taken up by the IUCG open special interest group for an IDNA2010 BCP on the
implementation, transition, and deployment of IDNA2008 (http://idna2010.org),
some principles must be clearly and consistently spelled out and respected by
these entities because they are part of the IETF's rights, duties, and values.
The first of them is that these entities must be open to any interested
individual.

- The WG/IDNABIS Charter states: "It is recognized that some explicit
exceptions may be necessary in any case, but attempts will be made to
minimize these exceptions." At the WG/IDNABIS, we considered the technical
exceptions within the protocol definitions and minimized them. However,
there are other exceptions that may arise in the protocol experimentation,
deployment, and various applications that can lead (as you have yourself
noted) to conflicts on the same user machine, within the same usage, the
same application or the same network governance area. Our mission is to
document how to avoid such technical, security and operational problems or
to list them in the security section with the principles that should be
followed in order to best protect Internet Users.

- The WG/IDNABIS Charter also states: "The constraints of the original IDN
WG still apply to IDNABIS, namely to avoid disturbing the current use and
operation of the domain name system". This is true at every level of the
DNS, including the first level. This means that this WG must ensure that its
document set and its consequences cannot be used to disturb the current use
and operation of the domain name system at the architectural, security, and
operational levels. This is not currently the case. The documents should at
least mention these risks and state either how they should be countered or
that further work on the issue is to be carried out.

- The IETF is not involved with the "current use and operation of the domain
name system", except for one major exception. This exception is the
management of the IANA registry. This is described in RFC 2860 and RFC 5226
(which obsoletes RFC 2434). There are two main cases that engage the IETF's
direct responsibility, to the point where it can exercise its ability to
cancel the IETF/ICANN MoU:

----  in case of experimentation by ICANN and/or IETF
----  in case of a conflict between the standard and the way ICANN applies
an IETF standard.

There are non-ICANN entities which may de facto or legally share or assume
the IANA's role to the legitimate benefit of some IDN internet communities;
they will be called IDNO (IDN Organizations).

Concerns have been expressed, and situations demonstrated, in which ICANN or
another IDNO might disturb the current use and operation of the domain name
system. There are three classes of (possible) situations where this happens:

- the unbalanced coverage of the kinds of potential difficulties to be
tested, analyzed, and addressed, which results from the limits that ICANN
imposes on participation in its experimentation, restricted to non-roman
IDNccTLDs. Most of the new difficulties to be tested can be expected to
arise:

---- at IDNgTLDs, which combine more innovative uses with the lack of a
governmental authority to sustain their naming policy,
---- at bidi TLDs,
---- and at roman TLDs where the orthotypography is not fully supported
(such as Latin language TLDs, and especially French).

  These TLDs MUST either be part of the ICANN experimentation on an equal
footing, or an IETF-sponsored complementary experimentation should be
organized by adding a batch of experimental TLDs to the root file in order to
guarantee an equal chance of access to experimental operations and sales, on
a first come, first served basis (**).

- the commercial discrimination that ICANN's internal rules impose between
candidates for IDNg/ccTLDs on technical grounds related to IDNA and root
capacity (after two years of an official public campaign denying such
limitations and an offer to register gTLD candidates (with 500 candidates
being persistently rumored)). This is in clear contradiction with the WTO
rules and the international agreements concerning Technical Barriers to
Trade. Because of the "First Come First Served" principle for IANA
registration in RFC 5226 (formerly RFC 2434), and because of RFC 2860, the
IETF could find itself involved in a lengthy international trade case that
could affect the credibility of its capacity to assume international
responsibility and further delay IDNs for years to come.

- new Internet architectural readings, such as "Interplus", which was
introduced by the IUCG based upon IDNA2008 findings. This WG should state in
the security section that this potential risk has been identified, that the
IUCG has been requested to document "Interplus-like" architectural
frameworks, and that it will rest with ICANN, the IETF, and the appropriate
IDNO and Internet User representations to address it.

RFC 3935 assigns the IETF the goal of making the Internet work better:
"The mission of the IETF is to produce high quality, relevant technical and
engineering documents that influence the way people design, use, and manage
the Internet in such a way as to make the Internet work better.  These
documents include protocol standards, best current practices, and
informational documents of various kinds".

Standards are well defined in RFC 3935: "As used here, the term describes a
specification of a protocol, system behaviour or procedure that has a unique
identifier, and where the IETF has agreed that "if you want to do this
thing, this is the description of how to do it".  It does not imply any
attempt by the IETF to mandate its use, or any attempt to police its usage -
only that "if you say that you are doing  this according to this standard,
do it this way".  The benefit of a standard to the Internet is in
interoperability - that multiple products implementing a standard are able
to work together in order to deliver valuable functions to the Internet's
users."

As a result, this WG should make it clear that any derivative work that adds
neutrality, security, or operational constraints based upon privately
contracted situations, and that does not fully respect "its description of
how to do it", including its principles for derivative work, might not be
interoperable and, therefore, should not be attempted by ICANN or by any IDNO
having entered into a comparable MoU with the IETF. This restriction is a
consequence of the IETF's RFC 2860 obligation to properly advise ICANN.

RFC 3935 also states: "The Internet isn't value-neutral, and neither is the
IETF.  We want the Internet to be useful for communities that share our
commitment to openness and fairness.  We embrace technical concepts such as
decentralized control, edge-user empowerment and sharing of resources,
because those concepts resonate with the core values of the IETF community.
These concepts have little to do with the technology that's possible, and
much to do with the technology that we choose to create."

What is discussed in this memo shows that IDNA2008 may lead to important,
yet potentially disruptive, innovations that the IETF, ICANN, IDNOs, and
Internet Users should collectively evaluate and channel. IDNA2008 also
permits misuses that they should all oppose in unison.

Jean-Michel de Portzamparc

(*) The WG/IDNABIS Chair wrote on 2009/12/19 that: "It has been suggested
that a better forum in which to deal with IDNA2003 and IDNA2008
incompatibility is the ICANN IDN Guidelines Committee. That may be a better
forum with broader participation than the IDNABIS working group in which the
TR46 proposal or other proposals may be discussed. If we adopt Cary Karp's
offer, your observations, below, would be input into the Guidelines
committee discussions." For years, none of the contributors to this note has
succeeded in participating in the ICANN process, either on an individual
basis or through an ICANN constituency.

(**) france at large has answered the French Government's pending public call
for candidates to manage the ".fr" namespace (this answer has been published
as a book and can be found at
http://www.lulu.com/content/paperback-book/pni/7143574). As part of this
response, it noted its interest in managing the ".tf" TLD and orienting it
towards French literature (Textes Français). It is also one of the
technical sponsors of Project.FRA and a Member of the A-FRA
(http://a-fra.org). The gTLD policy published by ICANN since its Paris
meeting permitted ".FRA" to start proposing French IDNs in parallel with
AFNIC (".fr") even if the French Government continued to procrastinate or
did not retain france at large as the operator of ".fr" or of ".tf". The
recent policy change introduced by ICANN, which is in turn procrastinating
about gTLDs, gives AFNIC (".fr") an unfair advantage in deploying French
IDNs. Moreover, French language orthotypography is not properly supported:
repeated documented requests, and a pending I-D on the way to support it
(http://www.ietf.org/id/draft-iucg-punyplus-03.txt), were systematically
ignored by the Chair and by you.

Project.FRA has a three-phase deployment plan:
A. free test ULD (user level domain) in Interplus class IU (Internet Users)
B. non-profit non-free ULD zone in class IU
C. non-profit commercial IDNgTLD also in class IN and IU.

Phase A is now in pre-operations. Phase B will be initiated as soon as the
IDNA2010 BCP on the implementation, transition, and deployment of IDNA2008
is completed. Phase C is to be initiated as soon as ".fra" is in the ICANN
root file as a regular gTLD. Its purpose is to fund the voluntary and
experimental Phases A and B.

 ----------------------------------------

At 21:33 01/12/2009, Lisa Dusseault wrote:
One example I discussed with Patrik yesterday, was whether locale
might affect mapping. I'd like to get better insight into the general
understanding of that.

1. Could locale determine whether a PVALID character should be mapped
into another PVALID character prior to following the rules to turn
into an ALABEL?  I believe the consensus answer is probably SHOULD NOT
or MUST NOT because that would make domains with that valid character
unreachable by software using those locale rules.

2. Could locale determine whether, or how, a DISALLOWED character is
mapped into a PVALID character prior to getting an ALABEL?  For
example:
 - in locale Laputa, disallowed character (x230) is not used in the
local language, so it's not mapped, an error occurs so a user seeing
that in a Web page can't reach any actual domain
 - in locale Balnibarbi, (x230) is considered to be the same as O,
so it's mapped to 'o', and a Balnibarbi user reaches domains
containing o's
 - in locale Glubbdubdrig, (x230) is considered to be the
capitalized version of (x231) and a Glubbdubdrig user reaches a
different set of domains containing 's

Note that none of these locale rules would necessarily make domains
containing completely unreachable to their users -- a Web page
containing a link with would be looked up without mapping that
character to another (assuming the conclusion of point 1 above).
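
As a minimal sketch of the three locale behaviours described above (not a
statement of what IDNA2008 specifies): the locale names are the fictional ones
from the example, "(x230)" and "(x231)" are assumed here to mean code points
U+0230 and U+0231, and the tables and the function name pre_lookup_map are
invented for illustration.

```python
# Hypothetical illustration only. ASSUMPTION: "(x230)" and "(x231)" in the
# example denote code points U+0230 and U+0231.
DISALLOWED = {"\u0230"}  # treated as DISALLOWED, as in the example

# Per-locale pre-lookup mapping tables (invented for this sketch).
LOCALE_MAPPINGS = {
    "Laputa": {},                          # character unused locally: no mapping
    "Balnibarbi": {"\u0230": "o"},         # considered the same as O
    "Glubbdubdrig": {"\u0230": "\u0231"},  # considered the capital of (x231)
}


def pre_lookup_map(label: str, locale: str) -> str:
    """Apply the locale's mapping, then reject any DISALLOWED character left."""
    table = LOCALE_MAPPINGS[locale]
    mapped = "".join(table.get(ch, ch) for ch in label)
    for ch in mapped:
        if ch in DISALLOWED:
            raise ValueError(f"DISALLOWED character U+{ord(ch):04X} in label")
    return mapped


if __name__ == "__main__":
    typed = "f\u0230\u0230"
    for loc in LOCALE_MAPPINGS:
        try:
            print(loc, "->", pre_lookup_map(typed, loc))
        except ValueError as err:
            print(loc, "-> error:", err)
```

PVALID characters never appear as keys in these tables, which matches the
conclusion of point 1: only the DISALLOWED character is ever remapped.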

If I'm reading between the lines correctly, communication is hampered
between people who are writing under the assumption that this kind of
locale-dependent scenario is going to happen, and people who are
writing under the assumption that this kind of locale-dependent
scenario ought to be forbidden and nobody would try such a crazy
thing.

At 22:04 01/12/2009, Lisa Dusseault wrote:
I don't believe we know what the WG consensus position is around how
strongly pre-lookup mappings are recommended and in what use cases,
and how compatible optional pre-lookup mappings are with IDNA2003
in-protocol mapping.

Agreeing with you to a large extent Vint, I believe there is a strong
consensus in the WG that mappings "aren't part of the protocol",
meaning mapping shouldn't be required in all cases. Certainly,
registries shouldn't be required to automatically map any character
into any other character before registering the mapped domain name:
better to make the domain name holder be explicit about the exact
domain name they are registering.  We have also referred to use cases
involving client software doing lookup of domains that are already
supposed to be valid, in which context an invalid character would
simply be a software error rather than a situation that required
mapping. In this matter, we have consensus to do something different
in IDNA2008 vs IDNA2003.

I believe we also do have consensus in the equivalence of A-label and
U-label forms and transformation in either direction.
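
The two-way transformation can be sketched with Python's built-in "punycode"
codec. This is only an illustration under stated assumptions: the helper names
to_a_label and to_u_label are invented here, and the sketch performs none of
the IDNA2008 validity checks (PVALID/DISALLOWED, contextual rules, bidi) that
a real U-label must pass.

```python
ACE_PREFIX = "xn--"


def to_a_label(u_label: str) -> str:
    """Encode a non-ASCII label as ACE prefix + Punycode (no validation)."""
    if u_label.isascii():
        return u_label  # plain ASCII labels are left untouched in this sketch
    return ACE_PREFIX + u_label.encode("punycode").decode("ascii")


def to_u_label(a_label: str) -> str:
    """Decode an "xn--" label back to Unicode (no validation)."""
    if not a_label.lower().startswith(ACE_PREFIX):
        return a_label
    return a_label[len(ACE_PREFIX):].encode("ascii").decode("punycode")


if __name__ == "__main__":
    a = to_a_label("b\u00fccher")
    print(a)                 # xn--bcher-kva
    print(to_u_label(a) == "b\u00fccher")  # True
```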

After that it gets confusingly nuanced.  We even discussed having a
different term for optional, pre-lookup mapping of user-typed-in
domain names, including links found in HTML typed in by the HTML
author.  I believe we do have consensus in the WG that this type of
helpful mapping is going to be implemented, for example in Web
browsers, as being a user experience that is preferred over a dialog
interruption or error.

So that leaves us with a great deal of progress in IDNA2008, and a lot
that is decided by WG consensus, but still with a gap in our total
consensus.  Some people would prefer that optional, pre-lookup mapping
be wide open.  Others would prefer that, although pre-lookup
mapping should be optional, IF it is done, it MUST be done one way.
There's even room for a middle ground where there might be more than
one recommended standard set of mappings, but they should be done as a
matter of standardization and not as a matter of full implementer
discretion, if at all possible.  There is room for splitting the
mappings territory up per-application: perhaps domains in HTTP URLs
and browser address bars should be mapped one way, but other types of
URLs could follow "no mappings allowed" rules.

Finally, we have the options around transition strategies (such as
bundling recommendations) that might allow us to come to consensus
regarding these optional pre-lookup mapping questions.

I appreciate all the discussion that is leading us to a better
understanding of the issues around those last issues, and developing
consensus regarding those last options.

At 16:06 02/12/2009, Lisa Dusseault wrote:
I'd like to try to unpack some of the different use cases we're
talking about a little more.

ISTM that use cases where the person following the link is the person
who is typing it in are the use cases where locale-dependent mapping might
be most useful.  If I'm in a locale where (x230) is considered to be
the capitalized version of o (ASCII o),  it might very well be most
helpful to make that mapping.  Use cases where the user who types in the
domain names is the one who then looks them up include:
 - typing links in the address bar
 - typing mail address in the To field of an email
 - Writing a Web page, blog post or email, wherein I check that the
links work before posting/sending my document

In contrast, when the person looking up the domain
F .example is not the person who typed it in, in most cases we
no longer know the intent or locale of the person who typed in the
domain.  It may be the same locale as the person who is looking up the
domain but it may not be.  The person who typed in the domain may have
intended f .example or foo.example, and may have tested that before
sending/posting the link, but we no longer have that information.  Use
cases include:
 - Following an HTTP link in any Web page, document, blog post, email, etc.
 - Using a mailto link (explicit or implicit), e.g. when one person
sends me another person's email address

We probably would all agree that people follow links while Web
browsing far more often than they type them in, and even when typing
in, auto-complete probably drastically reduces the new cases of
from-scratch mapping and lookup.

However, we probably have quite different assumptions about how much
Internet activity takes place among users of a consistent locale.  Can
we assume that Patrik wants interpreted as because he communicates
mostly in Swedish with Swedish users and mostly reads Swedish Web
pages?  Or must we assume that Patrik also gets email from German and
Swiss senders, and also reads Web pages (perhaps in English!) written
by German users who expected different mappings?  The answer depends
heavily on our model of a user, and whether we're using ourselves as
hypothetical examples or not.

One slightly more solid question for browsers is, would it be entirely
crazy to have different mapping algorithms for typed-in domain names
than for links followed?  There might be a locale-dependent mapping as
well as a global mapping.  (I assume that having every established
locale mapping installed would be complete craziness.)
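
One way to picture the question is the sketch below, which is only a thought
experiment under stated assumptions: prepare_label and global_map are invented
names, the "global mapping" is reduced to lower-casing as a stand-in, and the
locale table is whatever single table the user's own browser carries.

```python
from typing import Dict, Optional


def global_map(label: str) -> str:
    """Stand-in for a single global mapping: lower-casing only (assumption)."""
    return label.lower()


def prepare_label(label: str, typed_by_user: bool,
                  locale_map: Optional[Dict[str, str]] = None) -> str:
    """Apply the locale mapping only to typed-in labels, then the global one."""
    if typed_by_user and locale_map:
        # The typist's intent and locale are known, so local rules may apply.
        label = "".join(locale_map.get(ch, ch) for ch in label)
    # Followed links skip the locale step: the author's locale is unknown.
    return global_map(label)


if __name__ == "__main__":
    local_rules = {"\u0230": "o"}  # hypothetical locale table
    print(prepare_label("F\u0230\u0230", True, local_rules))   # typed in
    print(prepare_label("Example", False, local_rules))        # followed link
```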

Another question is: when posted links are followed, how often do we
know the locale where the link was authored?  Not that the browser
following the link would necessarily be able to apply the mappings of
the locale in which it was authored, but would it be slightly better
to apply a global mapping than a mapping from a different locale?

Do any authoring software clients fix up links as the user types?
When I type a link in a document, the authoring software often makes
that link active.  Is there any software that automatically lower-cases?
If so, would such software also be likely to map to PVALID characters
before the doc is finished?

At 20:35 02/12/2009, Lisa Dusseault wrote:
I don't see where that principle begins and ends.  If it were entirely true,
zone operators would be entirely free to register xn--garbage as a domain,
entirely free to register DISALLOWED characters.

Since it is in the purview of IDNA designers to make a character valid or
disallowed, wouldn't it also be in the purview of IDNA designers to say
something like "This character becomes valid on date 01/01/2020"?  So
why wouldn't it be in the purview of IDNA designers to also say something
like "This character becomes fully valid on date 01/01/2020, but until then,
may be registered if it is bundled with another sequence"?

Doncha love slippery slopes

At 00:15 03/12/2009, Lisa Dusseault wrote:
>> Would someone who understands IETF process better than I do please explain
>> why the discussion of that character needs to proceed any further?

Even if this is a rhetorical question, I'll bite.  It's because the IETF
makes decisions by rough consensus and running code.  Rough consensus is
among informed participants as well as experts and people in certain
positions of authority or responsibility.  Running code certainly brings in
browser/client implementation history and current client implementation
concerns.  It is not only operators of the countries where those languages
are most spoken, that have collateral effects from the status of the
characters of those languages in IDNA.

At 01:46 03/12/2009, Lisa Dusseault wrote:
Your argument is a fair argument to try to influence the consensus, but it
is not a fair argument to overrule a consensus (not that I'm saying you are
doing that).   We do rely on most IETF participants being constructive most
of the time.  We just have no other process besides appealing to people who
are forming the consensus to look at running code, reality, security
concerns and so on.

At 05:01 03/12/2009, Lisa Dusseault wrote:
I did claim that running code was an important influence on a WG consensus,
by tradition, habit and official encouragement.
 - First, that doesn't mean it's the only influence, or the most important
influence in some considerations.
 - Second, the term is used both strictly, as in to demonstrate
interoperability directly, and also loosely. By loose interpretations of
"running code" you could claim that how a country uses a character in
printed books was a form of deployed code that can be used as factual
evidence.

At 23:41 04/12/2009, Lisa Dusseault wrote:
I've been having trouble understanding how TRANSITIONAL characters that
could not be mapped or looked up by clients would help us out here.  (You're
not the only one Erik, I was just thinking of this offline)

If I understand correctly, Web browsers in particular visit a corpus of Web
pages with links that they would like to not only continue to support, but
also continue to resolve to the same host (absent other changes).  So on the
day that somebody upgrades to the new version of a browser with IDNA2008,
their link destinations for TRANSITIONAL characters do not change.

"Do not change" also means that the user doesn't suddenly start getting an
error trying to follow a link with a TRANSITIONAL character in the domain.

I agree that in some models, an error is better than going to an
indeterminate destination.  But only in some models.  To the user, upgrading
their browser and suddenly having links with in domains fail where it
succeeded the day before, does not seem like a real upgrade.