Touchstones for "Mapping"

Mark Davis mark at
Thu Apr 2 17:36:51 CEST 2009


On Thu, Apr 2, 2009 at 07:45, Vint Cerf <vint at> wrote:
> mark, my point was that under the rules of IDN, you can convert from
> A-label to U-label with assurance of precision. So you can cater to human
> readability. Storing in U-label form doesn't work for non-IDN aware
> applications and might even confuse them. This is a presentation issue
> surely?

(I could write this in hex codes.) Using those hex codes is just a
presentation issue as well; but what is best for storage depends on the

We all agree that A-Labels and U-Labels are equivalent. And clearly you have
to convert to A-Labels before doing a DNS lookup. But up until that point,
it is an open issue as to what makes more sense for particular
implementations. For data that might be fed directly to an IDNA-unaware
application, A-Label is clearly the best. For the database example, on the
other hand, it makes no sense to store an A-Label -- it just makes every
level of interaction more complicated than it need be; a U-Label is best for
storage. It just needs to be converted to an A-Label at some time before a
DNS lookup.
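
A minimal sketch of that "store the U-Label, convert just before lookup"
pattern, assuming Python and its built-in "idna" codec. That codec implements
the older IDNA2003 rules, not the revision under discussion here, and the
function names are only illustrative:

```python
def to_a_label_host(host: str) -> str:
    """Convert a stored Unicode host name to its ASCII (A-Label) form.

    Uses Python's built-in "idna" codec (IDNA2003), label by label.
    """
    return host.encode("idna").decode("ascii")


def to_u_label_host(host: str) -> str:
    """Convert an ASCII (A-Label) host name back to Unicode for display."""
    return host.encode("ascii").decode("idna")


# The database keeps the readable form ...
stored = "\u03b5\u03cd\u03b2\u03bf\u03b9\u03b1.el"  # εύβοια.el

# ... and the conversion happens only at the DNS boundary.
wire = to_a_label_host(stored)
print(wire)                   # xn--mxabir3a6f.el
print(to_u_label_host(wire))  # εύβοια.el
```

The round trip is lossless for a label that is already in its canonical
form, which is why either form can serve as the stored one.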

So I think the most we can say in the document is something like:

Implementations SHOULD store A-Labels or U-Labels, and SHOULD NOT store
M-Labels. Where the stored labels are to be channeled unaltered to
IDNA-unaware implementations, the storage SHOULD be A-Labels.

While that needs some wordsmithing, does the direction work for you?

> v
> Vint Cerf
> Google
> 1818 Library Street, Suite 400
> Reston, VA 20190
> 202-370-5637
> vint at
> On Apr 2, 2009, at 10:33 AM, Mark Davis wrote:
>> I think the main storage benefits are human readability. It is much
>> easier to read:
>> href="http://εύβοια.el"
>> rather than
>> href="http://xn--mxabir3a6f.el"
>> or in some XML:
>> <url>http://εύβοια.el</url>
>> rather than
>> <url>http://xn--mxabir3a6f.el</url>
>> But there are other issues: URLs are stored all over the place. If I
>> have one in an SQL database, I want to be able to do a SELECT Data
>> WHERE Url LIKE 'http://εύβοια%' and not '
>> And there are formal problems, because substrings in Unicode space
>> don't match substrings in Punycode space. Note that if my URL were
>> "http://εύβοια-ξενοδοχείο.el" (a made-up example), then its A-Label
>> form is "http://xn----vlbedmcdb5a7bjigbc9jyd.el". The SELECT of
>> 'http://xn--mxabir3a6f%' would fail. Moreover, Url LIKE
>> 'xn--mxabir3a6f%' can even return false results, strings whose U-Label
>> doesn't start with 'http://εύβοια%'
>> Mark
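
The prefix mismatch described above can be reproduced with a short sketch,
assuming Python and its built-in "idna" codec (IDNA2003 rules). The helper
name is illustrative, and plain string prefix tests stand in for SQL LIKE:

```python
def to_a_label(label: str) -> str:
    """Encode a single Unicode label as an A-Label (IDNA2003 codec)."""
    return label.encode("idna").decode("ascii")


u_short = "εύβοια"
u_long = "εύβοια-ξενοδοχείο"  # the made-up example from the message

a_short = to_a_label(u_short)
a_long = to_a_label(u_long)

# In Unicode space the shorter label is a prefix of the longer one.
print(u_long.startswith(u_short))  # True

# In Punycode space it is not: the ASCII hyphen is emitted first,
# so the encoded form of the longer label begins "xn----".
print(a_long.startswith(a_short))  # False
```

A prefix query written against A-Labels would therefore miss the longer
name, while the same query against U-Labels finds it.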
>> On Thu, Apr 2, 2009 at 05:50, Vint Cerf <vint at> wrote:
>>> Martin,
>>> I continue to be somewhat confused by logic that suggests that storage
>>> benefits from being in the U-label form.   A-labels are almost de facto
>>> normative since they work with IDN-aware and IDN-unaware applications.
>>> IDN-aware applications should be able to generate the corresponding
>>> U-label for presentation. IDN-unaware applications won't even recognize
>>> a U-label domain name as valid, I would think. Consequently, storage in
>>> A-label form seems the rational choice. If you disagree, it must be
>>> because you see a flaw in the reasoning above. Can you clarify? V
>>> ----- Original Message -----
>>> From: idna-update-bounces at
>>> <idna-update-bounces at>
>>> To: Harald Alvestrand <harald at>
>>> Cc: Andrew Sullivan <ajs at>; idna-update at
>>> <idna-update at>
>>> Sent: Thu Apr 02 03:37:30 2009
>>> Subject: Re: Touchstones for "Mapping"
>>> There are two sides here, the protocol correctness and
>>> the content correctness. By content correctness, I mean
>>> whether the link e.g. goes to the intended page.
>>> Completely impossible to check with punycode, of course.
>>> Regards,   Martin.
>>> On 2009/04/02 16:56, Harald Alvestrand wrote:
>>>> Martin J. Dürst wrote:
>>>>> I very much agree with Harald. We are working on IDNs because we want
>>>>> humans to be able to easily read domain names in their script. Storing
>>>>> them as A-Labels when there is a reasonable chance that humans will
>>>>> have a look at them (e.g. in HTML or XML source, email source,...)
>>>>> is against the very intent of IDNs. Authors are humans, too, even
>>>>> if they work on plain text :-!
>>>> I can argue the other side of the argument for HTML and XML,
>>>> the main thing being that humans who *enter* IDNs in Unicode form
>>>> without the benefit of conformance-enforcing software interfaces will
>>>> just about always get them wrong (due to bizarrities of case,
>>>> compatibility characters and other weirdnesses).
>>>> If they enter A-labels by hand, it's pretty certain they've
>>>> cut-and-pasted them.
>>>>              Harald
>>>> _______________________________________________
>>>> Idna-update mailing list
>>>> Idna-update at
>>> --
>>> #-# Martin J.Dürst, Professor, Aoyama Gakuin University
>>> #-#   mailto:duerst at