Mappings - some examples
patrik at frobbit.se
Mon Nov 30 15:27:22 CET 2009
On 30 nov 2009, at 15.13, Alexander Mayrhofer wrote:
>> Can you give evidence for this space being "heavily
>> contaminated" by for example sending links to images on the
>> web where the ß is in use in domain names?
> You know very well that we don't know the level of contamination, because it's all hidden by the client right now.
Yup, but that is also why I wanted to know whether you had new data, since you wrote "heavily contaminated" and not just "contaminated".
> I agree, it's probably not as bad as swapping the well-known port assignments of http and smtp ;), but the scary thing about this change to a deployed namespace is that we simply don't know.
Agreed. But for me that is a different thing (sorry, should have made that clear).
> The only numbers I could gather were Erik van der Poel's statistics from Google's inventory of crawled host names - he came up with 0.00001% of the domain names containing an "ß" (compared to 0.00122% for "ü").
> That could mean two things:
> - Either there's really just not that much "web-contamination" (but how much is there elsewhere?)
My guess (he he he, we continue to guess here) is that the "web contamination" is higher than for example "email contamination".
I now (compared to a year ago) see web URLs in live use in Sweden (in TV ads, for example) that contain the Swedish characters, but so far not one single email address with Swedish characters in the domain part.
> - Or nobody is interested in using it anyway (because even though it works right now, nobody is doing it...)
I have a third alternative:
- People do understand ß is mapped to "ss" so they use "ss" in the published URLs.
This last one is my "hope": making "ß" PVALID would make it possible for those parties to use the character in their domain names.
The real hidden question is how many people have "ss" in their URLs while they type ß on their keyboards. No search in Google or elsewhere can say how common that is. And this is, if I understand things correctly, both what Mark is worried about and what Harald and I see as a good thing: if ß is PVALID, then the domain name holder can decide (modulo the registry policy) whether a typed ß should result in a successful lookup of "ss".
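To make the two behaviours concrete, here is a rough sketch in Python. It relies on the fact that Python's built-in "idna" codec implements the IDNA2003 rules, so it applies the ß -> "ss" mapping. The name "straße.example" and the helper to_a_label_unmapped are made up for illustration, and the helper only punycodes a label as typed; it is not a full IDNA2008 implementation (no validity checks, no registry policy).

```python
# Hypothetical name as a user would type it.
typed = "straße.example"

# IDNA2003 behaviour: Python's built-in "idna" codec runs Nameprep,
# which maps ß to "ss" before the lookup name is produced.
idna2003 = typed.encode("idna")  # b"strasse.example"


def to_a_label_unmapped(label: str) -> str:
    """Punycode a label exactly as typed, with no mapping step.

    Illustrative helper (an assumption, not a real IDNA2008 library):
    roughly what a lookup sees if ß is PVALID and is not mapped away.
    """
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")


pvalid = ".".join(to_a_label_unmapped(part) for part in typed.split("."))

print(idna2003.decode("ascii"))  # the "ss" name
print(pvalid)                    # a distinct xn-- name
```

The point is only that the typed ß ends up as two different names on the wire: with the mapping it is the "ss" registration, without it a separate xn-- label that the holder may or may not have registered as well.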
>> So if 'ä' would have been mapped in IDNA2003, and I now would
>> have been asked if I thought 'ä' should be introduced, I
>> would say "go for it, but speed up!!!".
> Again, I think it's a *very* significant difference whether you open up a new part of a namespace, or you re-define the properties of an existing namespace - re-definition is risky. Particularly if such a re-definition is combined with the effects of potentially incompatible mappings...
Agreed, but I think where we disagree here is "how much" this is "opening up a namespace" and how much it is a "redefinition". I see it as opening the namespace (same as when we started to allow 'ä' in .SE, while previously people had registered domain names with just 'a') while you see it as a redefinition.
I would though be more "on your side" if the number of domain names in published documents that contained ß were, say, 100 times higher than today. Because then people would be TOLD to type in something (ß) that mapped to something else (ss) that was registered. That, I claim, is not the case. At least not "heavily".