Stop me if I've misunderstood...

Marie-France Berny mfberny at gmail.com
Thu Jul 9 00:16:46 CEST 2009


Gervase,
Mark gives a very fair account of the Unicode sub-problem he faces, just as
you gave a very fair account of the UI sub-problem you are concerned about.
However, before addressing sub-problems we need to address the main problem,
which is that:

- (1) none of the existing tool sets (browsers, protocols, codes,
procedures, applications, IDNA2003, etc.) is prepared to meet the demands
of linguistic diversity.
- (2) users (China, Russia, Francophonie, Arabic communities, etc.) now have
the capacity and the will (after a 10-year delay) to support their own
aspects of this linguistic diversity on their own. They also distrust or
oppose the ICANN IDNA strategy.

So the issue is simple. We need to find the exact level of flexibility in
mapping that avoids the end of the Internet, whether because it becomes
chaos (as you point out) or because users build their own Internet (like
China, etc.). There is a possible consensus in this WG that this depends on
the location where the _mappings_ apply (plural, depending on the 30,000
linguistic orthotypographies eventually involved). This WG is not chartered
to go any further than _permitting_ this flexibility, and it seems to have
done some good work with the mapping document. That document should be the
basis of reasoning for further orthotypography support, language by
language. Orthotypography is now mandatory because of the semantic value
given to the DNS, pending semantic addressing.

The IUCG, under the france at large leadership, motivated by the inability of
the current proposal to correctly support French orthotypography and by the
need to support the Semantic/Semiotic Internet (Intersem), has prepared an
architectural extension (Interplus) that enacts (rather than creates) the
Internet presentation layer in a possibly reasonable manner (everyone can
use it the way they want). This permits the user, the browser, and the
applications to work within a "presentation" without bothering about all the
currently discussed details. The presentation they use is then related to
the rest of the world through a "pseudo-network" layer which makes them
believe that they relate to a default Internet using the _whole_ Unicode
set and any additional semiotic coding they may want to use.

Depending on the presentation (i.e. character set, language orthotypography,
local environment, knowledge metastructure, etc.) there will be application
and parameterization at several layers, including protection against
phishing, spamming, etc. However, all these "under the hood" mechanics
should be transparent to the user and certified by an authority (ICANN,
IANA, his Government, ISOC, his user association, you name it) that he/she
trusts and that others will watch and criticize in order to protect
interoperability and interintelligibility.

The advantage of this approach is that it requires only the design and
development of something very comparable to a firewall, which ISPs could
even deploy on their side of the Internet connection, and absolutely no
change to the existing Internet. This is something many ISPs around the
world are already used to, for example to support Chinese names.

I hope this helps.
Marie-France Berny




2009/7/8 Mark Davis ⌛ <mark at macchiato.com>

> That is, in fact, the current outcome, because of two issues:
>
>    1. There are 4 characters that are valid in both IDNA2003 and IDNA2008,
>    but will direct to different IP addresses. So if you send a friend a URL,
>    he could end up going to a different site, if you have different browsers or
>    different browser versions.
>    2. There is a proposal to add a mapping to IDNAbis that would be "UI
>    only", and optional. This is to handle user-expected variant differences:
>    case, width,... That would also end up with problems with "bus-ability" in
>    that whether a URL gets mapped is left up to the user-agent's choice, and
>    what it thinks qualifies as "UI", and even whether the mapping is changed
>    (the mapping is a SHOULD). And there is no current requirement that the
>    mapping be compatible with IDNA2003, so we get the same problem as #1.
>
> My opinion:
>
> This is a rather bad situation -- for interoperability and security, let
> alone the user experience -- but that people in this group just don't
> realize it yet because they haven't gotten enough feedback from people who
> are concerned with interoperability and security. In particular, I believe
> that:
>
>    1. We should have differences from the current state (IDNA2003) that
>    cause a URL to go to a different site *only* if there is overwhelming
>    justification and little negative impact.
>       1. There is convincing evidence that this divergence is necessary
>       for two characters: ZWJ, and ZWNJ. Fortunately these are extremely low
>       frequency characters in current URLs within web pages, so the negative
>       impact is quite limited.
>       2. There is not overwhelming justification for the two others:
>       es-zett (sharp S) and final sigma. As a matter of fact, the German NIC has
>       come out against the former. We do not have enough involvement from the
>       Greek community to have any real case for the latter. And these are
>       extremely frequent characters in the respective language communities.
>    2. We should have a mandatory mapping applied to all lookup (whether
>    "UI" or not "UI"), whereby for any cases where IDNA2003 also maps, they must
>    have the same result. Failing that, we should not provide any mapping in
>    IDNA.
>
> Mark
>
>
>
> On Wed, Jul 8, 2009 at 13:03, Gervase Markham <gerv at mozilla.org> wrote:
>
>> I must confess that I've not had time recently to follow carefully the
>> discussions about mappings. If that means that some people consign this
>> message to the bit bucket, so be it.
>>
>> At the moment, standard domain names have what I'll call "bus-ability" -
>> that is, if you see them in an advert on the side of a bus, you can
>> write them down, type them into any web browser or other domain
>> name-using client later, and you'll end up at the place intended by the
>> creator of the advertisement. IDN domain names under the current version
>> of IDN have, as far as I understand it, pretty much the same
>> "bus-ability" property. In the IDN case, what the user types has to be
>> first normalized, and then converted to punycode. The user in no way
>> needs to know or care about this extra technical complexity. It just
>> works.
>>
>> I would assert that this property is pretty key to keeping the web
>> working in a sane and, importantly, secure manner. People convert domain
>> names from print/voice/memory to computer and back all the time.
>>
>> If the standards were to change in such a way that it becomes quite
>> legal and conforming that typing a set of characters into browser A
>> takes you to website Q, but typing the same set into browser B takes you
>> to website R, I would politely suggest that those who wrote the new
>> standards had taken leave of their senses. This is a recipe for chaos.
>> And phishing.
>>
>> This incredible outcome is not a serious possibility, is it?
>>
>> Gerv
>> _______________________________________________
>> Idna-update mailing list
>> Idna-update at alvestrand.no
>> http://www.alvestrand.no/mailman/listinfo/idna-update
>>
>
>
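As an illustration of the "bus-ability" property and of the divergence
discussed in the quoted messages, here is a minimal Python sketch. It relies
on Python's built-in "idna" codec, which implements IDNA2003 (nameprep
mapping followed by Punycode). The function names and the "example" domain
are hypothetical, chosen only for this illustration. The second function is
NOT an IDNA2008 implementation; it is an assumed, simplified stand-in for a
client that applies no mapping beyond NFC and lowercasing, just enough to
show how sharp s can lead to a different name being looked up.

```python
import unicodedata


def idna2003_lookup_name(typed: str) -> bytes:
    """ASCII name looked up under IDNA2003 (nameprep, then Punycode)."""
    return typed.encode("idna")


def unmapped_lookup_name(typed: str) -> bytes:
    """Hypothetical client that only applies NFC and lowercasing.

    Simplified assumption for illustration: no validity, bidi, or
    contextual checks are performed.
    """
    labels = []
    for label in unicodedata.normalize("NFC", typed).lower().split("."):
        if label.isascii():
            labels.append(label.encode("ascii"))
        else:
            labels.append(b"xn--" + label.encode("punycode"))
    return b".".join(labels)


# Case differences are mapped away, so both spellings reach the same name.
print(idna2003_lookup_name("Bücher.example"))   # b'xn--bcher-kva.example'
print(idna2003_lookup_name("BÜCHER.example"))   # b'xn--bcher-kva.example'

# Sharp s: IDNA2003 maps it to "ss"; a client that keeps it looks up
# a different name.
print(idna2003_lookup_name("straße.example"))   # b'strasse.example'
print(unmapped_lookup_name("straße.example"))   # b'xn--strae-oqa.example'
```

The last two lines show the concern in miniature: the same typed string
yields two different DNS names depending on which rules the client applies.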