Stop me if I've misunderstood...

Gervase Markham gerv at mozilla.org
Thu Jul 9 12:54:57 CEST 2009


On 08/07/09 22:09, Mark Davis ⌛ wrote:
>    1. There are 4 characters that are valid in both IDNA2003 and
>       IDNA2008, but will direct to different IP addresses. So if you
>       send a friend a URL, he could end up going to a different site, if
>       you have different browsers or different browser versions.

If it is just four characters, and there are only two possibilities 
for each, that could probably be coped with by registry bundling or 
similar techniques. So I don't see this as a show-stopper. It's just one 
of those things you get when you fix something which is broken. Stuff 
gets messy around the edges.
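
As a rough illustration of the divergence being discussed, here is a
minimal Python sketch. It relies on Python's built-in "idna" codec, which
implements IDNA2003, and uses plain Punycode encoding as a stand-in for a
lookup that keeps the character as-is. That stand-in is an assumption for
illustration only: it does none of the validity or contextual checks a
real IDNA2008 implementation would perform. The function names and sample
labels are hypothetical.

```python
# The four code points the thread is concerned with.
ESZETT, FINAL_SIGMA, ZWNJ, ZWJ = "\u00df", "\u03c2", "\u200c", "\u200d"


def lookup_2003(label: str) -> bytes:
    """IDNA2003 ToASCII via Python's built-in codec (nameprep mapping applied)."""
    return label.encode("idna")


def lookup_unmapped(label: str) -> bytes:
    """Assumption: stand-in for a lookup that keeps the character.

    Punycode-encodes the label with no mapping step and no validity checks.
    """
    return b"xn--" + label.encode("punycode")


# Hypothetical sample labels, one per affected character.
samples = {
    "eszett": "stra" + ESZETT + "e",
    "final sigma": "\u03bf\u03b4\u03bf" + FINAL_SIGMA,
    "zwnj": "a" + ZWNJ + "b",
    "zwj": "a" + ZWJ + "b",
}

for name, label in samples.items():
    # Two different ASCII labels means two different DNS names.
    print(name, lookup_2003(label), lookup_unmapped(label))
```

For each sample the two functions produce different ASCII labels, which
is the "same URL, different site" case: the eszett label becomes plain
"strasse" under IDNA2003 but an "xn--" label when the character is kept.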

>    2. There is a proposal to add a mapping to IDNAbis that would be "UI
>       only", and optional. This is to handle user-expected variant
>       differences: case, width,... That would also end up with problems
>       with "bus-ability" in that whether a URL gets mapped is left up to
>       the user-agent's choice, and what it thinks qualifies as "UI", and
>       even whether the mapping is changed (the mapping is a SHOULD). And
>       there is no current requirement that the mapping be compatible
>       with IDNA2003, so we get the same problem as #1.

Can someone summarize the problems there are with retaining exactly the 
same mapping algorithm which is used in IDNA2003? Is it not flexible 
enough to deal with new characters in Unicode 5.1 or something?
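
For anyone unsure what the IDNA2003 mapping does with case and width in
practice, here is a small Python sketch using the built-in "idna" codec
(an IDNA2003 implementation). The helper name and the sample domains are
made up for illustration, and this shows only the existing behaviour, not
any proposed IDNAbis mapping.

```python
def map_2003(domain: str) -> str:
    """Apply IDNA2003 ToASCII to each label using Python's built-in codec."""
    return domain.encode("idna").decode("ascii")


# Hypothetical inputs: the same name typed with different case and width.
variants = [
    "b\u00fccher.example",   # lowercase with u-umlaut
    "B\u00dcCHER.example",   # uppercase with U-umlaut
]

# Fullwidth capitals, as an East Asian input method might produce them.
fullwidth = "\uff25\uff38\uff21\uff2d\uff30\uff2c\uff25.com"

for name in variants:
    print(repr(name), "->", map_2003(name))
print(repr(fullwidth), "->", map_2003(fullwidth))
```

Both spellings in the first list end up as the same ASCII name, and the
fullwidth input collapses to ordinary ASCII letters. That is the
user-expected behaviour which would become user-agent-dependent if the
mapping were optional.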

> This is a rather bad situation -- for interoperability and security, let
> alone the user experience -- but people in this group just don't
> realize it yet because they haven't gotten enough feedback from people
> who are concerned with interoperability and security.

I suspected that might be the case, hence my intervention :-)

If nothing else, the idea that the Mozilla project would need to develop 
such a mapping layer without any written guidance from a group which is 
supposed to contain a collection of expertise on language and the DNS 
seems to me to be perverse.

The browser manufacturers would, I can fairly confidently state, be very 
keen to make this interoperable. But one browser maker might bow to 
pressure from one group for a change and another might not; divergence 
and misery are inevitable.

>    1. We should have differences from the current state (IDNA2003) that
>       cause a URL to go to a different site *only* if there is
>       overwhelming justification and little negative impact.
>          1. There is convincing evidence that this divergence is
>             necessary for two characters: ZWJ, and ZWNJ. Fortunately
>             these are extremely low frequency characters in current URLs
>             within web pages, so the negative impact is quite limited.
>          2. There is not overwhelming justification for the two others:
>             es-zett (sharp S) and final sigma. As a matter of fact, the
>             German NIC has come out against the former. We do not have
>             enough involvement from the Greek community to have any real
>             case for the latter. And these are extremely frequent
>             characters in the respective language communities.

I'll take your word for it on #1 - I have no superior knowledge. On #2, 
the obvious course to me seems to be to present the dilemma to the 
various language communities and let them choose. If the Germans don't 
want a change for eszett, then we're done. I also find it incredible 
that, after all these arguments over many months about final sigma, you 
feel led to say that "we do not have enough involvement from the Greek 
community" to know what they think. Has no-one thought to ask them? Are 
they ignoring us?

>    2. We should have a mandatory mapping applied to all lookup (whether
>       "UI" or not "UI"), whereby for any cases where IDNA2003 also maps,
>       they must have the same result. Failing that, we should not
>       provide any mapping in IDNA.

This leads me to ask the same question as above: what problems are there 
with this approach that I'm not seeing?

Gerv