mappings-01 and the general procedure

Wil Tan wil at cloudregistry.net
Mon Jul 27 12:21:39 CEST 2009


On Mon, Jul 27, 2009 at 5:52 PM, John C Klensin <klensin at jck.com> wrote:

> On Sun, Jul 26, 2009 at 06:37, Yoshiro YONEYA <yone at jprs.co.jp>
> wrote:
>
> > I'm wondering if the general procedure is applied to FULL STOP
> > characters.
> > For example, Unicode string (FW- stands for Full Width)
>
> --On Sunday, July 26, 2009 07:52 -0700 Mark Davis ⌛
> <mark at macchiato.com> wrote:
>
> > I agree that that should be done, but John Klensin was against
> > it. Something about not being able to recognize separate
> > labels (although current technology does it just fine).
>
> Mark, it really is not that "I'm against it".  I've just
> summarized, several times, some rather aggressive feedback we've
> gotten about the issue.  To repeat that summary (in even briefer
> form than before), there is a fairly fundamental DNS requirement
> that one be able to convert labels from and to the external
> (label.label...) form to the internal (length-label,
> length-label,...) one without knowing anything about the labels
> themselves, even if those labels are "just octets" rather than
> anything that is specifically ASCII, LDH, UTF-8, etc.  In other
> words, that conversion has to be able to be performed, not only
> by IDNA-aware applications, but applications that don't make any
> check at all about the contents of the labels.
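
To make the requirement above concrete, here is a minimal sketch of a
content-blind conversion between the two forms. It is only an illustration:
the function names to_wire and from_wire are made up for this example, and
it assumes no RFC 1035 backslash escapes and no message compression. The only
octet it treats specially is the ASCII full stop (0x2E); everything else is
opaque.

```python
def to_wire(name: bytes) -> bytes:
    """External label.label form -> length-prefixed internal form.

    Splits only on the ASCII full stop octet (0x2E). Label contents are
    opaque octets. Backslash escapes are NOT handled (assumption).
    """
    if name.endswith(b"."):
        name = name[:-1]  # drop the trailing dot of an absolute name
    out = bytearray()
    if name:
        for label in name.split(b"."):
            if not 0 < len(label) <= 63:
                raise ValueError("each label must be 1..63 octets")
            out.append(len(label))  # one length octet, then the label
            out += label
    out.append(0)  # zero-length root label terminates the name
    return bytes(out)


def from_wire(wire: bytes) -> bytes:
    """Length-prefixed internal form -> external form (no compression)."""
    labels = []
    i = 0
    while wire[i] != 0:
        n = wire[i]
        labels.append(wire[i + 1 : i + 1 + n])
        i += 1 + n
    return b".".join(labels)


# to_wire(b"www.example.com") == b"\x03www\x07example\x03com\x00"
```

A converter like this has no way to know that U+3002 is meant as a separator:
given the UTF-8 octets of "例。jp" it produces a single 8-octet label, not two
labels.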


I have not seen your earlier messages about this issue, and cannot claim to
be following this list religiously, so I apologize if I'm repeating any
beaten-to-death arguments.

I can definitely sympathize about the programming side of getting things
into and out of internal representation. Some questions/observations:

1. I wonder how many stored domain names (as part of URLs in bookmarks, or
hyperlinks in a document) contain an ideographic full stop in the real world.
My feeling is that it is only a concern at keyboard / input method time.

2. The protocol document mostly deals with labels, and leaves the combining
and splitting of labels to RFC 1034/1035. As such, the input to the mapping
steps at lookup time is, of course, a label. Allowing it to be further split
into multiple labels is asking for trouble. However, I wonder whether we could
change the language, or add a section to the -protocol doc, to say that
an application (at time of user input) MAY apply some transformation to the
whole "domain string"?
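
As a rough sketch of what I mean in point 2: the application maps the
dot-like characters to U+002E once, on the whole string the user typed, and
only then splits into labels, so the per-label mapping never sees a separator.
The set of characters below is the one IDNA2003 (RFC 3490) recognized as dots;
whether the same set is right here is an assumption, and the function name
split_user_input is made up for illustration.

```python
# Dot-like characters that RFC 3490 (IDNA2003) treated as label separators:
# IDEOGRAPHIC FULL STOP, FULLWIDTH FULL STOP, HALFWIDTH IDEOGRAPHIC FULL STOP.
DOT_LIKE = "\u3002\uff0e\uff61"
_TO_FULL_STOP = str.maketrans({c: "." for c in DOT_LIKE})


def split_user_input(domain: str) -> list:
    """Map dot-like characters to U+002E on the whole string, then split.

    Intended to run once at user-input time, before any per-label
    processing. Per-label mapping and validation are out of scope here.
    """
    return domain.translate(_TO_FULL_STOP).split(".")
```

Anything stored or passed on after this step contains only U+002E as a
separator, so non-IDNA-aware code can still split it.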


> As several people have pointed out, if one could somehow find a
> character that is not actually used in the particular script of
> interest and immediately map it at keyboard-> OS time into  a
> period, there would be no problem.  But that character isn't
> going to be on the user's keyboard, so it probably isn't a
> solution to any interesting problem.   The difficulty with these
> other dot-look-alikes is that they are important for other uses
> in the relevant languages/scripts so that the notion of always
> mapping them into something else doesn't work either.
>

I don't understand this one at all, but maybe I'm just missing context from
your earlier messages.

=wil