Comments on draft-ietf-idnabis-defs-10

John C Klensin klensin at jck.com
Thu Sep 3 10:53:33 CEST 2009



--On Thursday, September 03, 2009 04:15 +0200 Elisabeth
Blanconil <eblanconil at gmail.com> wrote:

> John,
> 
> this depends on what this WG calls punycode. I tend to consider
> punycode as the interfacing function between U-label and
> A-label. After two years that function is like a groggy boxer
> stripped with patches everywhere including on the eyes, so one
> does not really know what it may see.

Your metaphors notwithstanding, the specs rather precisely
define and use the term "Punycode" to refer to the algorithm
(and, when needed, specifically to its encoding or decoding
operations) that is, in turn, defined in RFC 3492.  See Section
2.3.4 of Definitions, which says that in so many words.

Unless something has inadvertently crept in --in which case I'd
hope it would be found and fixed-- the term is used only in that
way: there is no such thing as a "punycode string" or "punycode
form".  And as such a reference, the name of the algorithm
appears only as a proper noun (and with a leading capital letter
as a result).
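
As an illustration of that distinction, here is a minimal sketch
using Python's built-in "punycode" codec, which implements the
RFC 3492 algorithm.  The helper names and the sample label are
assumptions chosen for the example, not anything taken from the
specs.  It shows Punycode as an operation applied to a string;
the A-label is the result of that operation with the ACE prefix
"xn--" in front.

```python
# Sketch only: helper names and the sample label are illustrative assumptions.
ACE_PREFIX = "xn--"  # the prefix that marks an A-label


def punycode_encode(text: str) -> str:
    """Apply the Punycode (RFC 3492) encoding operation to a string."""
    return text.encode("punycode").decode("ascii")


def punycode_decode(encoded: str) -> str:
    """Apply the Punycode (RFC 3492) decoding operation to a string."""
    return encoded.encode("ascii").decode("punycode")


u_label = "b\u00fccher"              # "bücher": contains one non-ASCII character
encoded = punycode_encode(u_label)   # output of the algorithm, without any prefix
a_label = ACE_PREFIX + encoded       # prefix plus encoded output

print(encoded)
print(a_label)
print(punycode_decode(encoded) == u_label)
```

The codec performs only the encoding and decoding operations; it
does none of the validity checks that Protocol requires of a
U-label or A-label.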

I've just checked the current versions of Protocol and Defs; the
only instance I found that was the slightest bit vague on the
subject has been fixed.

> I really like your idea of giving us an IETF/LC break with new
> people and fresh points of view. I am very worried by the fact
> that what Wil brought up was obvious to us from the very
> beginning, what explains some of our misunderstandings with the
> rest of the WG. What if there are other jokers like that? The
> IETF/LC will help spotting them.
 
> For example, I am very suspicious about the U-Label
> restriction to lowercases once the intermediary ASCII string
> has been lowercased. My maths are not good but I know that an
> n character string has only one way to be lowercased and
> something power n (or the other way around) ways to be
> partially uppercased. Since most of the TMs, fun and tricks
> play on cases, as does semantic, I do not see that restriction
> to stand for a long. (There are at least two ways to get
> Uppercased U-label from lowercase ASCII strings, probably
> people and crooks will find more).

There is no such thing as an "Uppercased U-label" under the
specs.  The procedure and definitions for U-labels lead to
nothing but lower case.  Moreover, there is no way to get from
an ASCII string (uppercase, lowercase, or mixed) to a U-label at
all: U-labels are defined (after some discussion in the WG) as
containing at least one non-ASCII character.
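
The two properties just mentioned can be sketched as a quick
pre-check.  This is only an approximation under stated
assumptions: the function names are hypothetical, Python's
str.lower() stands in for the case handling the specs actually
define, and passing the check does not make a string a valid
U-label, since none of the other requirements in Protocol are
tested here.

```python
# Sketch only: necessary conditions, not a U-label validity test.
def has_non_ascii(label: str) -> bool:
    """True if at least one character lies outside the ASCII range."""
    return any(ord(ch) > 0x7F for ch in label)


def could_be_u_label(label: str) -> bool:
    """Check the two properties discussed above.

    1. The label contains at least one non-ASCII character.
    2. Lowercasing leaves the label unchanged (approximated with str.lower()).
    """
    return has_non_ascii(label) and label == label.lower()


for candidate in ["b\u00fccher", "B\u00fccher", "example", "EXAMPLE"]:
    print(repr(candidate), could_be_u_label(candidate))
```

An all-ASCII string fails the first condition whatever its case,
and a string with an uppercase character fails the second.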

If you believe otherwise, please identify specific text or
explain your theory as to how an "Uppercased U-label" can exist.

These documents, with the arguable exception of Rationale,
really are protocol specifications and must be read carefully.
If you are unable or unwilling to perform that type of reading,
I suggest that you wait for the tutorial.

> Also, we actually did nothing about phishing.

"nothing" is a significant exaggeration.  The precise
definitions of U-label (and, to some extent, A-label) are
themselves a help.  The move toward requirements for canonical
forms is a help.  Some of us would argue that clear elimination
of symbols and punctuation from domain labels is helpful
(although there is controversy about how much).   Of course, we
haven't "solved" phishing, but we never expected to do so.
There is, as far as I know, only one solution to phishing, or at
least to the vast majority of it.  It lies far outside the IETF's
scope, and the communities involved have decided, however
inadvertently in some cases, that they prefer other choices to
eliminating (or even vastly reducing) phishing.

     john