What rules have been used for the current list of codepoints?

Michael Everson everson at evertype.com
Thu Dec 14 11:35:55 CET 2006


At 05:18 -0500 2006-12-14, Vint Cerf wrote:
>I have been following along, perhaps with less understanding than many, but
>I continue to have concerns that we are not always distinguishing that which
>is needed for expressive natural language, and that which is safe, stable,
>and secure for Internet domain names.

Hm. Vint, you've mentioned "expressive natural language" before, so 
let's discuss it. For my part, I assume that the letters used, for 
instance, to write a person's name should be included in the 
permitted list of letters. I would not insist on smart quotes, for 
instance, as they are decorative. (They are essential in typesetting, 
of course.)

There are a few script-specific non-letter characters that need 
inclusion. The Geresh is needed to indicate certain types of 
abbreviation in the Hebrew language, and is also used to mark certain 
consonants in Ladino, so it is not really "optional". Ethiopic 
writers use the Ethiopic wordspace in the same way we use a hyphen, to 
separate words with a visible mark (as in vint-cerf.com). They do not 
use the hyphen, and it would be an alien thing to impose on them. On 
the other hand, the wordspace can ONLY occur between two Ethiopic 
letters, so it should not be problematic.
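
To make the "only between two Ethiopic letters" condition concrete, here 
is a minimal sketch of how such a context check might look. It is an 
illustration only, not a rule taken from any specification: the function 
names are hypothetical, and treating "Ethiopic letter" as "a character of 
category Lo whose Unicode name begins with ETHIOPIC SYLLABLE" is an 
assumption made for this example.

```python
import unicodedata

# U+1361 ETHIOPIC WORDSPACE
ETHIOPIC_WORDSPACE = "\u1361"


def is_ethiopic_letter(ch: str) -> bool:
    """Assumed definition: category Lo and a name starting 'ETHIOPIC SYLLABLE'."""
    return (
        unicodedata.category(ch) == "Lo"
        and unicodedata.name(ch, "").startswith("ETHIOPIC SYLLABLE")
    )


def wordspace_in_context(label: str) -> bool:
    """Return True if every Ethiopic wordspace sits between two Ethiopic letters.

    Labels that contain no wordspace at all pass trivially.
    """
    for i, ch in enumerate(label):
        if ch != ETHIOPIC_WORDSPACE:
            continue
        # A wordspace at either end of the label has no letter on one side.
        if i == 0 or i == len(label) - 1:
            return False
        if not (is_ethiopic_letter(label[i - 1]) and is_ethiopic_letter(label[i + 1])):
            return False
    return True
```

With this sketch, a label such as "\u1200\u1361\u1208" (two Ethiopic 
syllables joined by a wordspace) passes, while a wordspace placed between 
Latin letters, or at the start or end of a label, is rejected.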

Does this help?
-- 
Michael Everson * http://www.evertype.com

