"Martin J. Dürst"
duerst at it.aoyama.ac.jp
Tue Dec 22 11:41:45 CET 2009
On 2009/12/22 11:28, Kenneth Whistler wrote:
>> So perhaps the sensible thing is to engage on the question
>> of mappings that seem to make sense on an ongoing basis
>> and those that really ought to end at some point (examples of
>> these would be sharp-s and final sigma)?
> We've already agreed we wouldn't be mapping those special
> cases, so as not to otherwise upset the consensus here.
> Those are part of the "deviation" class in the IDNAMapping.txt
> table, and the suggestion Michel made would be to *NOT* map
> those. So they are off the table, anyway.
> The problem is the other 3807 mappings where the behavior
> of IDNA 2003 is inconsistent with the generic suggestions
> in the current mapping document. And no, I don't think it
> is sensible to start trying to talk about all of them
> on a case-by-case basis, to determine whether they
> should or should not be mapped.
Doing this one character at a time is definitely not an option,
but that's not what the current mapping document is doing.
> Either we make no recommendation about them at all.
> Or we recommend a maximally IDNA-2003 compatible mapping.
> I don't see any value in spending more years trying to parse
> out some middle ground here, arguing about what is really a
> big pile of compatibility crap in the standard for interoperating
> with largely defunct old charsets anyway.
I agree in principle with your characterization of "compatibility crap".
But taking that characterization seriously would mean limiting the
mapping strictly to NFC and lowercasing. Is that what you mean above by
"no recommendation at all"?
Anyway, while I agree in principle, practice is unfortunately not as
easy. Living in Japan, I have first-hand experience of input methods
generating full-width Latin characters most of the time. And these
full-width characters were the main if not only real reason for going
with NFKC for IDNA 2003. So in my eyes, the actual mapping proposed in
the mapping draft is a very simple cut of the NFKC mappings into two
halves: those that (very unfortunately) seem needed and those that are
not. If we have a solution, why not go for it?
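The full-width point is easy to demonstrate. A small sketch (the sample
string is illustrative, not from the draft) using Python's `unicodedata`
module: NFC leaves full-width Latin letters untouched, while NFKC folds
them to their ASCII counterparts.

```python
import unicodedata

# Full-width Latin letters (U+FF41..U+FF5A), as a Japanese input
# method might produce them; the sample string is illustrative only.
fullwidth = "ｅｘａｍｐｌｅ"

# NFC does not apply compatibility mappings, so the full-width form
# survives; NFC + lowercasing alone would not help here:
print(unicodedata.normalize("NFC", fullwidth) == fullwidth)  # True

# NFKC applies the compatibility mappings and yields plain ASCII:
print(unicodedata.normalize("NFKC", fullwidth))  # example
```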
>> could that be a part of the effort proposed by Cary Karp?
> I would like to see a transition effort focussed on the
> important characters that people have a stake in -- notably
> the sharp-s and the final sigma, rather than deep-ending
> on a case-by-case examination of all the rest of this
> stuff. I suspect that guidelines for registries are going
> to be much more focussed on what to do with all the
> rest of the useful PVALID characters in the first place.
I agree. Indeed, the main job for registries is to figure out which of
the PVALID (and contextual) characters, and in what combinations, they
permit. Unless they also happen to be in the business of DNS resolving
software (browsers and all kinds of other clients), they have no
influence whatsoever on mapping.
Also, with respect to transition issues, ß and ς are much more important
because if done wrong, they may lead to the same domain name being
resolved to different IP addresses. Removing some compatibility
mappings, on the other hand, only leads to a domain no longer being
resolved, which can always happen on the network. On top of that, the
registries can (and should) actually do something with respect to ß and
ς (sunrise/bundling/... as discussed previously), whereas they cannot do
anything with respect to mappings that go away.
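The special status of ß and ς can be shown concretely. A sketch,
assuming Python's str.casefold(), which implements Unicode full case
folding as used by IDNA 2003: folding changes both characters, while
NFKC alone leaves them as they are, so dropping the mapping makes e.g.
"straße" and "strasse" distinct labels.

```python
import unicodedata

# IDNA 2003 used Unicode case folding, which Python exposes as
# str.casefold(); it maps ß to "ss" and final sigma to medial sigma:
print("straße".casefold())  # strasse
print("ς".casefold())       # σ

# NFKC by itself leaves both characters untouched, so without the
# folding step the two spellings no longer resolve to the same label:
print(unicodedata.normalize("NFKC", "ß"))  # ß
print(unicodedata.normalize("NFKC", "ς"))  # ς
```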
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp mailto:duerst at it.aoyama.ac.jp