Symbols and Line-drawing (was: Re: Proposed Charter for the IDNAbis Working Group)

Martin Duerst duerst at
Fri Mar 28 06:44:13 CET 2008

At 01:41 08/03/28, John C Klensin wrote:

>Let me plead, again, for getting the charter under control and
>then, as appropriate, dig into these kinds of arguments on a
>case-by-case basis.

No problem with that. My mail was in response to Gerv, who said
that line-drawing characters and the like are important in a way
that affects the charter; I replied that they are actually not so
important. I hoped that would help move the charter forward.

In this light, and given that your explanations below don't
propose any change to the charter, I don't think they are relevant
at this time. So just as a suggestion from my side, it may be
helpful if you followed up your pleas with some deeds, leading
by example and posting less charter-irrelevant material
rather than more.
[If I'm wrong, and the text below is relevant for the
wording of the charter, please correct me.]

Regards,    Martin.

>To review what has been said before, line- (or box-) drawing
>characters are not an ideal example, for the reasons you
>identify, but they are still relevant.  We could also debate,
>probably endlessly, whether the key issue is "easy to type",
>"easy to recognize accurately", "easy to deal with in a file",
>or "easy to describe in a database record".  Each of  those is
>important, perhaps most important, to a different community and
>the difference between the first two is key to a collection of
>confusability issues (including the phishing ones).
>But the reason for banning those characters is a matter of
>protocol design that, for the DNS, goes back to the first "host
>name" rules.  It goes back much further in programming language
>contexts and is reflected in Unicode's "identifier" rules. Even
>though each one of those systems may end up with a slightly
>different character list, the general principle is that there
>are characters that one uses to form identifiers or names and
>characters that are reserved for use as delimiters or as other
>pieces of syntax.  There have been historical exceptions to the
>need to make that distinction, but they have not been successful
>without additional assistance (e.g., Publication Algol doesn't
>work directly as a programming language unless one can figure
>out how to type parts of programs in boldface and even it
>doesn't permit, e.g., spaces in identifiers).
>Banning them also helps prevent the future incompatibilities
>that would occur if some group of people improvised forms for a
>few characters with symbols and symbol or other combining marks
>and the characters themselves were included in a future version
>of Unicode.
>That leaves a choice.  One can pick out a list of characters and
>say "these are reserved for special bits of syntax" and permit
>everything else in names/identifiers.  That may be plausible for
>a closed 128 or 256 character repertoire, but it gets a little
>complicated for Unicode.   Doing that often causes one to end up
>with the "sometimes these are delimiters and sometimes they are
>not" situation that helps make it impossible to build a general
>URI parser that always works.   At the other extreme, one can
>say "we permit letters and digits in identifiers, perhaps with a
>few exceptions, and reserve everything else for other purposes".
>That is, ultimately, the programming language approach, the host
>name approach, the Unicode identifier approach, and so on (for a
>rather long list).  
>We are proposing taking that approach with IDNA200X as well.
>The fact that it helps with confusability problems, with
>phishing, with database indexing and label description problems
>(partially a collation issue and partially one that is often
>known around the DNS as the "whois" issue), and with other
>issues reinforces the impression that it is a good idea.
>   john
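
[Illustration, not part of the original exchange: the two choices
described in the quoted text above can be sketched roughly as follows.
The RESERVED set, the function names, and the hyphen exception are
assumptions made up for this example. The category test is a deliberate
simplification and is not the actual IDNA200X rule set, which would
need to treat combining marks and other cases more carefully.]

```python
import unicodedata

# Hypothetical delimiter list, chosen only for illustration.
RESERVED = set(".:/?#[]@")


def allowed_by_exclusion(label):
    """Choice 1: reserve a fixed list of characters, permit everything else."""
    return len(label) > 0 and all(ch not in RESERVED for ch in label)


def allowed_by_inclusion(label):
    """Choice 2: permit letters and digits (plus hyphen as an assumed
    exception) and reserve everything else."""
    def is_permitted(ch):
        category = unicodedata.category(ch)
        # "L*" covers all letter categories, "Nd" is decimal digits.
        return ch == "-" or category.startswith("L") or category == "Nd"

    return len(label) > 0 and all(is_permitted(ch) for ch in label)


if __name__ == "__main__":
    samples = ["example", "b\u00fccher", "a\u2500b", "smile\u263a", "a.b"]
    for label in samples:
        print(repr(label),
              allowed_by_exclusion(label),
              allowed_by_inclusion(label))
```

[Under the first choice a box-drawing character such as U+2500 passes
simply because nobody thought to list it; under the second it is
rejected because its general category is a symbol rather than a letter
or digit.]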

#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#       mailto:duerst at     
