Stop me if I've misunderstood...

Kenneth Whistler kenw at sybase.com
Fri Jul 10 01:35:35 CEST 2009


> On Thu, Jul 09, 2009 at 03:26:43PM -0700, Paul Hoffman wrote:
> > At 5:27 PM -0400 7/9/09, Andrew Sullivan wrote:
> > >I'm not sure I agree with the above characterization.  In DNS, case
> > >differences are preserved but not significant for matching.
> > 
> > So, take it to an example where not all characters are ASCII:
> > Éxample.com and éxample.com seems like a good example.
> 
> Well, that case also seems to me to be something the zone operator
> might do reasonably.  There's no risk of the "xample" part being
> changed, so there are just two variants to register -- not such a big
> deal.

Now scale it up from an artificial éxample with one non-ASCII
letter, and consider Vietnamese, where nearly every syllable
has an accented, non-ASCII vowel, and where there are also
common non-ASCII consonants. Suddenly you've imposed a
combinatorial explosion of variants on the Vietnamese zone operator.
The problem is similarly difficult for many languages
written with the Latin script -- most except English, in fact.

Or consider that the case-mapping issue impacts *every* character
in a Cyrillic or Greek IDN.
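
To put rough numbers on that, here is a minimal sketch in Python. It
assumes, as in the quoted reasoning above, that DNS already matches ASCII
letters case-insensitively, so only non-ASCII cased characters multiply
the number of labels a zone operator would have to register. The function
name and the sample labels are purely illustrative.

```python
import unicodedata
from itertools import product

def variants_to_register(label):
    """Enumerate the case variants a zone operator would need to register
    if no case mapping were applied to non-ASCII characters.

    Assumption: ASCII case is already ignored by DNS matching, so ASCII
    characters contribute one choice each; every non-ASCII character that
    has distinct upper and lower forms contributes two.
    """
    label = unicodedata.normalize("NFC", label)
    choices = []
    for ch in label:
        if ord(ch) < 128:
            choices.append([ch.lower()])
        else:
            # A set collapses characters whose upper and lower forms match.
            choices.append(sorted({ch.lower(), ch.upper()}))
    return {"".join(combo) for combo in product(*choices)}

# One non-ASCII letter: just the two variants mentioned above.
print(len(variants_to_register("\u00e9xample")))
# A ten-letter Greek label: every letter is non-ASCII and cased.
print(len(variants_to_register("παράδειγμα")))
```

The count doubles with each additional cased non-ASCII letter, which is
the explosion in question.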

> But note that this gets us into the other sticky issue: what about
> accent-free versions of "the same name" (in this case,
> "example.com").   Should that also be bundled together.

That would be a bona fide example of a potential local
bundling issue for a zone operator.

But casemapping is not at all the same kind of thing. It
is an obligatory *mapping* that will have to be applied
somewhere in this process, or IDNs are utter chaos.

> I think the right answer to that question is, "It depends, and zone
> operators need to come up with policies around this."  But not
> everyone is, plainly, happy with that answer.

You bet. I think it is a total nonstarter to think that
casemapping (or more correctly, casefolding) is an issue
to be left up to zone operator policy.
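
For illustration only, here is a sketch of what a single obligatory
casefolding step buys you. It uses Python's built-in `str.casefold()`,
which applies Unicode full case folding; that is a stand-in I am assuming
for the example, not the exact mapping table of any IDNA document. The
helper names are hypothetical.

```python
import unicodedata

def fold_label(label):
    """Map a label to one canonical form for comparison.

    Assumption: Unicode full case folding (str.casefold) plus NFC stands
    in here for whatever mapping the protocol would actually specify.
    """
    folded = unicodedata.normalize("NFC", label).casefold()
    return unicodedata.normalize("NFC", folded)

def same_name(a, b):
    """True if two labels are the same after the mapping is applied."""
    return fold_label(a) == fold_label(b)

print(same_name("\u00c9xample", "\u00e9xample"))
print(same_name("ΠΑΡΆΔΕΙΓΜΑ", "παράδειγμα"))
```

With the mapping applied in one well-defined place, every case variant
collapses to a single label, and nobody has to enumerate or register the
variants at all.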

I thought we were coming to consensus a couple months
ago that casemapping and width mapping (for fullwidth
and halfwidth forms in East Asian character sets) should
properly be considered a part of the protocol, but that
other mappings for backwards compatibility with IDNA 2003
were more marginal and could reasonably be considered as
part of a preprocessing mapping recommendation document.
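
Width mapping is the same kind of mechanical step. As a rough sketch
(again in Python, with a hypothetical helper name), the following maps
only those characters whose Unicode decomposition is tagged `<wide>` or
`<narrow>`, and leaves every other compatibility character alone. It is
meant to show how narrow the operation is, not to reproduce the text of
any draft.

```python
import unicodedata

def width_map(text):
    """Replace fullwidth and halfwidth forms with their ordinary
    counterparts, leaving all other characters untouched."""
    out = []
    for ch in text:
        decomp = unicodedata.decomposition(ch)
        if decomp.startswith("<wide>") or decomp.startswith("<narrow>"):
            # The decomposition looks like "<wide> 0065": a tag followed
            # by the hexadecimal code points of the replacement.
            code_points = decomp.split()[1:]
            out.append("".join(chr(int(cp, 16)) for cp in code_points))
        else:
            out.append(ch)
    return "".join(out)

# Fullwidth Latin letters, as typed from an East Asian input method.
print(width_map("\uff45\uff58\uff41\uff4d\uff50\uff4c\uff45"))
# Halfwidth katakana KA maps to the ordinary katakana KA.
print(width_map("\uff76") == "\u30ab")
```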

But it appears that the group has now made a hard right
turn somewhere, and is trying to throw out the baby
with the bathwater, citing a line in the charter as
a justification for a technical decision about best
protocol design, and settling on a position that
mapping should not be in the protocol at all.

Instead, these mappings are abstracted out into mappings-01.txt,
and then, instead of any obligatory and well-defined
requirement for interoperability, the whole thing is
further qualified with oughts and maybes and "applications
may do their own local things."

I consider this a major mess at this point. Sorry.

--Ken

> 
> A



More information about the Idna-update mailing list