Stop me if I've misunderstood...
Mark Andrews
marka at isc.org
Sat Jul 11 02:53:10 CEST 2009
In message <66CF7A3448D1F681D6779939 at PST.JCK.COM>, John C Klensin writes:
> --On Thursday, July 09, 2009 21:13 +0000 Shawn Steele
> <Shawn.Steele at microsoft.com> wrote:
>
> >
> >> I don't think IDNA2008, with or without the most recent
> >> proposals, changes that property. The main thing IDNA2008
> >> does that is different from IDNA2003 is to strongly
> >> discourage any string that requires mapping from those
> >> adverts.
> >
> > That's not gonna happen. Burger King isn't going to write
> > "haveityourway.com" on the side of the bus, it's gonna be
> > "HaveItYourWay.com". Sure, mapping in ASCII is free, but
> > there's a need for mapping in non-ASCII contexts as well.
> > Specifying or recommending something we know is going to be
> > ignored is bad. A) it encourages people to interpret the
> > standard how they see fit, and B) developers can't count on
> > the language because they know it'll be ignored.
>
> And you are reasoning from analogies that may not hold up.
> Before I try to explain that (following up part of Elizabeth's
> note), I want to stress that...
>
> First of all, our role here is to make things work well and
> predictably, with catering to the inclinations of various
> marketing and branding departments (Burger King or otherwise) a
> secondary goal at best. For whatever it is worth, we get more
> predictability when we have fewer variations in what is
> possible. Probably we all believe the latter, the question is
> how it should properly interact with the user experience. That
> is not an easy question and I don't think that hyperbole (from
> either side) or games about who has to prove what moves us
> forward.
>
> Second, my experience with marketing people is that, while they
> would like a perfect world in which every campaign was
> successful, competitors were stupid and ineffective, and will
> make all sorts of demands in the hope of realizing one or both,
> they are ultimately very pragmatic. If one is faced with a
> choice between "haveityourway.com" or "have-it-your-way.com",
> either of which works 100% of the time, and "HaveItYourWay.com"
> that works only fairly often, I know that they --or at least the
> subset who expect to survive in the business-- will pick one of
> the first two. I note that, as far as the DNS is concerned
> "Have It Your Way.com" is a perfectly valid domain name.
But not a valid host name.
> While ICANN rules prohibit names with embedded blanks at the
> second level, just as they prohibit raw, non-ASCII, UTF-8, I
> assume that I'm not the only one here who has had to listen to
> some marketing type complain that "Our Favorite Slogan" could
> not be used as a domain name and that "we" had to be smart
> enough to make it happen and just weren't trying hard enough.
> The response is to explain that
>
> Our Favorite Slogan.MyCompany.com
Our\032Favorite\032Slogan.MyCompany.com
is a domain name as is
Our-Favorite-Slogan.MyCompany.com
The latter is also a valid hostname whereas the former is not.
> is actually a valid domain name that they are welcome to use if
> they like, they just wouldn't find that it was very useful in
> practice. Each time I've had that conversation --and there
> have been several times-- there has been much complaining but,
> eventually, there has been no insistence on domain names with
> embedded spaces.
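To make the domain name versus hostname distinction above concrete, here
is a minimal Python sketch. It assumes the letter-digit-hyphen rule of
RFC 952 as relaxed by RFC 1123, decodes master-file \DDD escapes, and
ignores the overall 255-octet limit on a name. The names split_labels
and is_hostname are illustrative only, not part of any library.

```python
import re

# One label under the assumed LDH rule: letters, digits and hyphens,
# no leading or trailing hyphen, 1 to 63 characters.
_LDH_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")
# Master-file escape of the form \DDD, where DDD is a decimal octet value.
_DDD = re.compile(r"\\([0-9]{3})")


def split_labels(name):
    """Split a presentation-format name into labels, decoding escapes."""
    labels, current, i = [], [], 0
    while i < len(name):
        ch = name[i]
        if ch == "\\":
            m = _DDD.match(name, i)
            if m:
                # \032 is decimal 32, i.e. a space.
                current.append(chr(int(m.group(1))))
                i = m.end()
            else:
                # \X escape: take the next character literally.
                current.append(name[i + 1:i + 2])
                i += 2
        elif ch == ".":
            labels.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    if current:
        labels.append("".join(current))
    return labels


def is_hostname(name):
    """True if every label of the name satisfies the assumed LDH rule."""
    labels = split_labels(name)
    return bool(labels) and all(_LDH_LABEL.fullmatch(l) for l in labels)


print(is_hostname(r"Our\032Favorite\032Slogan.MyCompany.com"))  # False
print(is_hostname("Our-Favorite-Slogan.MyCompany.com"))         # True
```

Both strings are acceptable to the DNS as domain names; only the second
passes the hostname check.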
>
> Now, coming back to your example, we have to realize how
> culturally- and historically-sensitive this is. A decision was
> made in the early 1970s that names of hosts and networks were
> going to be treated case-insensitively. At the time, that
> decision had very little to do with user experiences: we had
> hosts that really couldn't handle lower case, hosts that could
> but treated the two cases as globally equivalent, and hosts that
> were case-sensitive but on which upper case was considered a
> little strange. Case-insensitive identifiers seemed to be the
> way to go. A decade later, that decision was carried forward
> into the DNS world without, if I recall, a lot of thought or
> discussion, largely because, by then, it had been embedded into
> a number of application protocols. Had the original decision
> been made differently -- either to treat identifiers as
> case-sensitive or to prohibit one case or the other-- we
> probably would be having a different discussion today (not
> necessarily an easier one, but different).
>
> Second, the way in which one gets the equivalent of
> "HaveItYourWay" in German is traditionally to make a new word,
> "haveityourway", with no capital letters in the middle. If one
> wants to maintain distinct word-components, one uses spaces or
> maybe hyphens. There are, in principle, two ways to do it in
> Arabic -- the use of initial-form and final-form characters to
> denote boundaries or the use of ZWNJ. But we've been told by
> Unicode experts that initial, final, isolated, and medial forms
> should all match and the Arabic language community has been
> reasonably clear that they do not want or need ZWNJ for writing
> the Arabic language.
>
> So I wouldn't generalize much from "Have It Your Way" (with or
> without spaces).
>
> > I'm not saying that the U-label form shouldn't be encouraged
> > in the bowels of the system, that'd clearly be good. I am
> > saying that anything potentially user facing shouldn't have
> > this recommendation. Especially if "marketing" is going to
> > have a voice ;-)
>
> I think maybe we agree, but I'm not sure which "this
> recommendation" you are referring to, partially because, as you
> and others have pointed out, "user facing" is not itself
> unambiguous.
>
> Because of the greater distinguishability of lower case
> characters and because it lets reverse-mapping work out, I would
> tend to recommend that those who are more worried about
> precision and avoidance of attacks based on recognition of
> characters stick with U-labels and hence with lower case. I
> would not require that in UIs, but I would probably recommend it
> to both advertisers and users.
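As an illustration of the reverse-mapping point, the following sketch
uses Python's built-in "idna" codec. Note the assumption: that codec
implements IDNA2003 (it applies nameprep mapping to non-ASCII labels),
not IDNA2008, so it only shows the mapping behavior under discussion.
The name "Bücher.example" is a made-up example.

```python
# Assumption: Python's "idna" codec follows IDNA2003 (RFC 3490), which
# case-folds non-ASCII labels before Punycode encoding.
mixed = "B\u00fccher.example"       # "Bücher.example", as an advert might show it

wire = mixed.encode("idna")         # A-label form sent to the DNS
back = wire.decode("idna")          # what reverse-mapping gives back

print(wire)           # b'xn--bcher-kva.example'
print(back)           # bücher.example
print(back == mixed)  # False: the upper-case "B" is not recoverable
```

A string that is already in lower-case U-label form round-trips
unchanged, which is the property being recommended.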
>
> Where the design questions get controversial, and despite many
> concerns, I'd encourage people who are designing highly
> localized UIs to consider forgoing case mapping (and to present
> lower case) where the community involved was extra-vulnerable to
> confusion in scripts with which they were not familiar enough to
> easily do the case conversions without looking (e.g., "Q" and
> "q" may look alike to you or me, but, to someone with very low
> familiarity with Latin scripts and fonts, "Q" might look a lot
> more like "o" than it does like "q"). That is a tradeoff with
> the principle that anyone who types a given string should get
> the same interpretation as anyone else who types that string,
> but it may be worth pointing out that, if familiarity with Latin
> characters is low enough for my suggestion to apply, there
> probably are no Latin characters on the keyboard, so the same
> string is _not_ being typed as would be typed by someone with a
> Latin-based keyboard. I don't think we should be trying to
> make the decisions involved in this, or in forcing one
> particular UI behavior, in the protocol -- partially because I'm
> convinced that, after a few bad experiences, we will find UI
> software ignoring any rules we write in favor of protecting
> users (either by reducing the amount of mapping that is done or
> by insisting on user entry of A-labels for labels in unfamiliar
> scripts).
>
> To turn that same comment around, I'd think that the designers
> of any localized UI that is expected to be used in locales with
> Latin-based scripts, or scripts that have variant-width
> characters in Unicode, would be nuts not to make the obvious
> mappings. Clearly the spec permits that.
>
> If you can suggest a better way to make this clear, I'm
> listening and I assume that Pete and Paul are too.
>
> john
>
> _______________________________________________
> Idna-update mailing list
> Idna-update at alvestrand.no
> http://www.alvestrand.no/mailman/listinfo/idna-update
--
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742 INTERNET: marka at isc.org