Unicode position on local mapping

JFC Morfin jefsey at jefsey.com
Wed Feb 18 15:56:05 CET 2009


Dear John,

Sorry, but I am travelling and have a poor view of my screen.
I hope that this mail will make sense as I can hardly edit my text.

The point is that you assume that characters fold bijectively. This is not
the case: "é", "è", "ê", "ë" and "e" all fold the same way, to the same
Unicode "E".
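
To make the point concrete, here is a minimal sketch in Python. It assumes
that "folding" means upper-casing and then dropping the accent, which models
the practice of unaccented capitals; it is not Unicode's default case mapping,
under which "é" upper-cases to "É". The helper name accentless_upper is
hypothetical.

```python
import unicodedata

def accentless_upper(ch: str) -> str:
    """Upper-case a character, then drop any combining accents.

    Assumption: this models unaccented capitals, not Unicode's
    default case mapping (where "é".upper() is "É").
    """
    decomposed = unicodedata.normalize("NFD", ch.upper())
    return "".join(c for c in decomposed if not unicodedata.combining(c))

lowercase_forms = ["é", "è", "ê", "ë", "e"]
folded = {ch: accentless_upper(ch) for ch in lowercase_forms}
distinct_targets = set(folded.values())

# Five distinct inputs collapse onto one output, so the fold has no inverse.
print(folded)
print(len(lowercase_forms), "inputs ->", len(distinct_targets), "output(s)")
```

Because several lower-case forms share one target, nothing can recover the
original character from the folded form alone.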

2009/2/18 John C Klensin <klensin at jck.com>:
> --On Wednesday, February 18, 2009 03:30 +0100 JFC Morfin
> <jefsey at jefsey.com> wrote:
>> At 01:10 18/02/2009, John C Klensin wrote:
>>
>> Dear John,
>> We are not talking here of mapping on the registration side,
>> but in the naming policy; to be voted by the ".fr" ccTLD BoD -
>> or the French Parliament, since ".fr" is legally delegated as
>> a public service.
>>
>> The actual way this will be implemented will then be a
>> different question.
>>
>> The position can either be for the Gov to publish an RFP
>> calling for an alternative solution to IDNA, or for users to
>> develop it and use it with broad support. I think the
>> consensus will be pragmatic: "confusion starts from the very
>> mistyping".
>
> I was writing from the standpoint of the implementation
> pragmatics, rather than from that of the process by which the
> decision would be made, so we had a slight miscommunication.
> However, I am also assuming that France would be unlikely to cut
> itself off from the global Internet by adopting a protocol
> "alternative to IDNA" that would require specialized
> implementations of browsers and other application software and
> perhaps specialized implementations of the DNS --
> implementations that would not interoperate in predictable ways
> with the software used elsewhere in the world.
>
> But your explanation makes part of the point I was trying to
> make to Mark and Andrew:  if something we do constrains an
> implementation past the point that the implementers (or relevant
> policy-makers) consider acceptable, we will see "solutions" that
> cause far more "massive" interoperability problems than a mere
> few mismatched characters, even if those lead to false
> positives.  If we treat a pair of characters that might be
> considered the same as distinct,

They are distinct in lower case, and the same symbol (not the same character)
in upper case.
The whole internationalization doctrine has this problem of not
differentiating between the symbol and the character.

> then the registry (and policy
> makers) have all three possible options: using matching
> registrations to treat the pair the same way, using variant
> blocking to be sure that only one is registered, or treating
> labels containing the two forms as distinct.   It is rational
> to hope that set of choices gives the decision-makers enough
> flexibility to avoid their deciding that there is no reasonable
> choice other than, e.g., creating a DNS variation that would
> do server-side matching in a way precisely adapted to French
> needs.  By contrast, if we did the mapping and made a choice
> that was unacceptable, the odds of the sort of RFP you describe
> are obviously increased.
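
The three registry options listed in the quoted paragraph can be sketched as
follows. This is only an illustration under my own assumptions: the Policy
names, base_form and can_register are hypothetical, "matching" is read here
as "only the holder of the existing variant may take the other form", and no
actual registry is claimed to work this way.

```python
import unicodedata
from enum import Enum
from typing import Dict

class Policy(Enum):
    MATCHING = "matching"  # variants go to the same holder (assumed reading)
    BLOCKING = "blocking"  # once one variant is registered, the rest are blocked
    DISTINCT = "distinct"  # variants are unrelated labels

def base_form(label: str) -> str:
    """Lower-case the label and strip combining accents (hypothetical variant rule)."""
    decomposed = unicodedata.normalize("NFD", label.lower())
    return "".join(c for c in decomposed if not unicodedata.combining(c))

def can_register(candidate: str, applicant: str,
                 registered: Dict[str, str], policy: Policy) -> bool:
    """Return True if `applicant` may register `candidate`.

    `registered` maps already-registered labels to their holders.
    """
    if candidate in registered:
        return False  # exact label already taken
    if policy is Policy.DISTINCT:
        return True
    variant_holders = {holder for label, holder in registered.items()
                       if base_form(label) == base_form(candidate)}
    if not variant_holders:
        return True  # no variant registered yet
    if policy is Policy.BLOCKING:
        return False
    # MATCHING: only the holder of the existing variant may take this one
    return variant_holders == {applicant}

registered = {"ecole": "holder-A"}
for policy in Policy:
    print(policy.name, can_register("école", "holder-B", registered, policy))
```

With "ecole" already held, a different applicant obtains "école" only under
the distinct policy; the choice among the three stays a policy decision, not
a protocol one.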
>
>>> > and another possible meaning, which is "expanding to match
>>> > other characters, and registering those _too_?  For instance
>>> > the example that Jefsey provided is just école.fr <http://xn--cole-9oa.fr> and
>>> > ecole.fr, which could easily be resolved by registering
>>> ecole.fr also whenever xn--cole-9oa.fr is registered.
>>
>> Unfortunately, this proposed procedural patch
>>
>> 1. does not make sense since ecole.fr is already registered
>> (actually in this case reserved) and a decision MUST be taken
>> (as for probably a million other accented domain names).
>
> Without expressing a position as to whether it would be wise or
> not, practical or not, this is consistent with the reasons why
> many domains have created sunrise or similar mechanisms to aid
> with the introduction of new facilities such as IDNs.  And
> decisions about the relationship between IDNs that contain
> decorated characters and the base label string (with only
> undecorated ones) will have to be taken in every domain that
> uses Latin characters and introduces IDNs.

This decision is simple: Unicode or not Unicode, the language or the
keyboard, internationalisation or multilingualisation. The problem is not
related to IDNA but to decades of an erroneous internationalisation
strategy.

However, the solution is not simple. Yet forcing people against their rights
and will is not a good solution, because it is not stable.

>> 2. is not legally acceptable:
>>
>> --- "école" means school;
>> --- "ecole" can be a TM: ex. http://www.defl.ca/fr/ecole.html.
>> For other terms the accented and the unaccented forms
>> are different words or TMs.
>
> Here is where you confuse me, or perhaps we confuse each other.
> Because of its exact-match properties, the DNS (with or without
> IDNs) is notably unsuited to a "do what I mean" function.  I am
> probably misunderstanding you, but it appears from the above
> that you are expecting a system that will cause the same pair of
> labels to be treated as matching under some circumstances and as
> not-matching in others.
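
The exact-match property described in the quoted paragraph can be illustrated
with a small Python sketch. It relies on Python's built-in "idna" codec, which
implements the older IDNA 2003 rules; the helper dns_names_match is
hypothetical and only mimics the ASCII case-insensitive comparison the DNS
applies.

```python
# Encode both names to the ASCII form that is actually looked up in the DNS.
accented = "école.fr".encode("idna")
plain = "ecole.fr".encode("idna")

print(accented)  # b'xn--cole-9oa.fr'
print(plain)     # b'ecole.fr'

def dns_names_match(a: str, b: str) -> bool:
    """Hypothetical helper: compare two names the way an exact-match,
    ASCII case-insensitive lookup would."""
    return a.encode("idna").lower() == b.encode("idna").lower()

print(dns_names_match("école.fr", "ecole.fr"))  # False: two unrelated names
print(dns_names_match("ECOLE.fr", "ecole.fr"))  # True: ASCII case is ignored
```

ASCII case differences match, while the accented and unaccented labels never
match unless something outside the lookup (registration policy or a mapping
step) ties them together.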

In your terms, yes. In French terms, no.

The problem is that "your" approach is based on internationalization (i.e.
internationalising English ASCII as a reference), not on multilingualization
(each language/script being its own reference).

I do not claim that an internationalization solution is impossible, only that
none has been found so far. And I fear that, when looking for a solution,
French engineers, lead users and users will look for a multilingualization
solution rather than an internationalization one. This would technically
divide the Internet, while giving the French solution the ability to support
multilateralization as a general feature. We both know the outcome: that
French internet would have a presentation layer.

> I don't know how to do that in the DNS
> or in any conceivable IDN protocol that rests on top of it.

This is the question to be answered. I do not have a global solution either,
and this is why I suppose that no-case folding might be the most acceptable
patch. But the decision of accepting that patch is not mine: it belongs to
the market and to the French national communities.

>> --- "Ecole" will usually mean a national school.
>> --- "schule", "scola", etc. are valid terms from French
>> languages that would not suffer the constraint. This would
>> create an anti-competitive commercial image imbalance.
>> --- There is "ecole.gouv.fr", which will require a documented
>> legal terminology decision.
>
>> 3. would violate the French language and people's equality. I
>> do not see it being legally and technically accepted just
>> because the IETF did not find a solution.
>
> I don't understand how one can have a reasonable expectation of
> both "A" and "not-A" being true at the same time, which your
> comment appears to require unless the situation is to be treated
> as unfair and anticompetitive.

The problem lies with case-folding being accepted in the DNS technology
while the impact of case-folding has not been considered enough.

> That would make simple
> first-order logic unfair.  Perhaps it is, but I don't know quite
> what to do about it.  Even without IDNs, if I obtain a label in
> a particular domain that you think you would like because it
> bears some relation to your business, you may reasonably believe
> that my having it is anticompetitive.

We misunderstand each other. What is anti-competitive is that people who were
not born with accented names, or whose names are not challenged by accented
names, do not have a problem that others have.

> But, if you were able to
> take it away from me, I would probably consider that act
> anticompetitive for the same reason.  There is no way to win in
> such a competition.

Yes, there is at least one: not to create an artificial problem by forcing a
language to follow the rules of another one. Please remember that this
problem is not related to IDNA but to a wrong doctrine
(internationalization) and to the resulting rough AZERTY standard (one could
imagine solutions, but that would call for a wide debate within the
francophonie).

> Instead, societies invent rules that permit
> one entity to use the name and the other to not do so or perhaps
> for neither to use the name.   This is really no different, or
> so it appears to me.

Yes. But that rule has to be proposed, debated, voted, and accepted. Before
doing just that, people will want to know the reasons why, to know about a
plan B, and to try to find a solution.

I do not think that "because of Unicode, the AZERTY keyboard, and the IETF"
will be accepted as a good enough reason. I may be wrong, but I would be
surprised if this did not lead to a de facto francophone Internet
replication, all the more so as the DNS is at the border between the Internet
and the Intersem (semantic network). I hardly see French speakers removing
accented terms from French (and the same goes for many other languages) just
to please ASCII computers.

Best
jfc

PS. You seem to imply that I would consider challenging IDNA. I remind you
that my whole effort is precisely the opposite: to build a better
architecture to include and enforce IDNA (and to extend it when
needed/possible). This is why I urge this WG to proceed, because we need the
IDNA solution to finally settle before we can build around/on top of it and
stay fully interoperable.

This results from this WG having plainly answered that it did not intend to
develop the solution we need. This is why I located within the IETF the
lead-user architecture effort that could consider that extension, based upon
premises somewhat different from those of this WG, after the Chair of this WG
said it would be in plain IETF spirit.

I remind you that I wrote a Draft on this (which the system currently does
not accept) to investigate how this user approach of mine would be the best
way to deploy IETF propositions such as IDNA and IPv6.
http://iucg.org/drafts/draft-iucg-innov-dep-strat-00.txt


More information about the Idna-update mailing list