Unicode & IETF
vint at google.com
Tue Aug 12 20:29:06 CEST 2014
and that is likely the crux of our disagreement because some of us,
including me, think that we need precision in IDNs to avoid multiple
encodings that lead users into thinking they have typed what they meant but
they end up in the wrong place because the DNS discovers a different
registered string than the one the user might have intended.
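
To make the "multiple encodings" concern concrete, here is a minimal Python sketch. It is only an illustration: the label "café" and the helper name same_label are made up for this example, and NFC is used simply as one possible canonical form, not as a statement of what any registry or resolver actually does.

```python
import unicodedata

# Two encodings of what renders as the same string: one uses a
# precomposed code point, the other a base letter plus a combining accent.
precomposed = "caf\u00e9"    # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "cafe\u0301"    # U+0065 followed by U+0301 COMBINING ACUTE ACCENT


def same_label(a: str, b: str) -> bool:
    """Hypothetical helper: compare two labels after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# A raw code point comparison treats the two strings as different ...
raw_equal = precomposed == decomposed

# ... while comparing canonical forms treats them as the same label.
normalized_equal = same_label(precomposed, decomposed)

print(raw_equal, normalized_equal)
```

Without a single canonical form, the raw comparison is the one that decides which registered string a lookup finds.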
On Tue, Aug 12, 2014 at 1:31 PM, Shawn Steele <Shawn.Steele at microsoft.com> wrote:
> (It's strange how the subject migrated to the changed one).
> > the problem is poverty of vocabulary then. I said nothing about "meaning"
> > only about encoding and the side effects of having two ways to represent
> > the same <character? glyph? thing?>. Unless canonicalization produces only
> > one representation, comparisons can fail and create unintended results.
>
> That's very 'mathematical'. Either A) the system has to ignore certain
> linguistic considerations in favor of mathematical precision, or B) the
> canonicalization has to allow linguistic variation at the expense of
> mathematical certainty. With IDN we sort of have both.
>
> For the first, since IDNA2003 we've sacrificed some linguistic variation
> for precision. The Turkish I is an obvious example. I don't want to argue
> right/wrong, I'm just pointing out that the result is probably what neither
> a Turkish user nor a non-Turkish user would expect for those 4 characters.
>
> DNS has never really had 'certainty'. For German users before IDN, it was
> unclear without testing whether to use a or ae when a word had an
> a-umlaut. Now there's an additional form that users can try.
>
> So, I hypothesize that canonicalization is important in that it provides
> consistent output for the same inputs. We know that it's going to fail
> linguistically, so we need to ensure that it remains consistent. If it
> remains consistent, then ambiguities can be resolved by bundling or
> blocking at the registrar level, or anti-phishing/blacklisting/etc tactics
> at the client level.
>
> When moving the needle through the gray area between linguistic
> permissiveness and mathematical precision, I would prefer to err on the
> side of allowing people to type the things they think they need to type.
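
The Turkish I case quoted above can be sketched in Python. This is only an illustration, under stated assumptions: it uses Python's built-in, locale-independent case mapping rather than the IDNA2003 mapping tables themselves, and turkish_lower is a hypothetical helper written for this example, not a library function.

```python
# The four "I" characters involved: I, i, and the two Turkish-specific forms.
DOTTED_UPPER = "\u0130"    # LATIN CAPITAL LETTER I WITH DOT ABOVE
DOTLESS_LOWER = "\u0131"   # LATIN SMALL LETTER DOTLESS I

# Locale-independent lowercasing, as a language-neutral system would apply it.
default_lower = {c: c.lower() for c in ("I", DOTTED_UPPER, "i", DOTLESS_LOWER)}


def turkish_lower(s: str) -> str:
    """Hypothetical helper: lowercase using Turkish pairing (I -> dotless i,
    dotted capital I -> i) before applying the default mapping."""
    return s.replace(DOTTED_UPPER, "i").replace("I", DOTLESS_LOWER).lower()


# Default rules pair I with i; Turkish rules pair I with dotless i.
default_result = "I".lower()
turkish_result = turkish_lower("I")

for char, lowered in default_lower.items():
    print(repr(char), "->", repr(lowered))
print(repr(default_result), repr(turkish_result))
```

Any one fixed mapping gives consistent output for the same input, but it cannot match both sets of expectations at once, which is the trade-off described above.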