What rules have been used for the current list of codepoints?
Michael Everson
everson at evertype.com
Thu Dec 14 11:35:55 CET 2006
At 05:18 -0500 2006-12-14, Vint Cerf wrote:
>I have been following along, perhaps with less understanding than many, but
>I continue to have concerns that we are not always distinguishing that which
>is needed for expressive natural language, and that which is safe, stable,
>and secure for Internet domain names.
Hm. Vint, you've mentioned "expressive natural language" before, so
let's discuss it. For my part, I assume that the letters used, for
instance, to write a person's name should be included in the
permitted list of letters. I would not insist on smart quotes, for
instance, as they are decorative. (They are essential in typesetting,
of course.)
There are a few script-specific non-letter characters that need
inclusion. The Geresh is needed to indicate certain types of
abbreviation in the Hebrew language, and is also used to mark certain
consonants in Ladino. It is therefore not really "optional". The
Ethiopic wordspace is used in Ethiopic writing in the same way we use
a hyphen to separate words with a visible mark (as in vint-cerf.com).
Ethiopic users do not use the hyphen, and it is an alien thing to
impose on them. On the
other hand, it can ONLY occur between two Ethiopic letters, so it
should not be problematic.
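To make that contextual restriction concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not a proposed rule text: the function names are hypothetical, and "Ethiopic letter" is approximated as any character in a Unicode letter category whose character name begins with ETHIOPIC.

```python
import unicodedata

ETHIOPIC_WORDSPACE = "\u1361"  # U+1361 ETHIOPIC WORDSPACE


def is_ethiopic_letter(ch):
    # Assumption: a letter (general category L*) whose Unicode name
    # starts with "ETHIOPIC" counts as an Ethiopic letter.
    return (unicodedata.category(ch).startswith("L")
            and unicodedata.name(ch, "").startswith("ETHIOPIC"))


def wordspace_ok(label):
    # Hypothetical check: every wordspace must sit between two
    # Ethiopic letters, so it can never start or end a label.
    for i, ch in enumerate(label):
        if ch != ETHIOPIC_WORDSPACE:
            continue
        if i == 0 or i == len(label) - 1:
            return False
        if not (is_ethiopic_letter(label[i - 1])
                and is_ethiopic_letter(label[i + 1])):
            return False
    return True
```

A label such as "\u1200\u1361\u1208" (two Ethiopic syllables joined by the wordspace) passes this check, while a wordspace placed between Latin letters or at the edge of a label does not.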
Does this help?
--
Michael Everson * http://www.evertype.com