Moving Right Along on the Inclusions Table...

Kenneth Whistler kenw at sybase.com
Thu Dec 21 02:15:59 CET 2006


> Ethiopic word space, please. It is used as we use hyphens, and the 
> use of hyphen for that purpose is unknown to them.

I disagree on that one. It is basically a word separator,
derived from the even earlier inscription rule "|" separator,
and isn't a hyphen.

And it is largely being replaced in modern Ethiopian printed
materials simply with a SPACE. There are Ethiopian input
methods that let users cycle between a regular Latin-font-based
SPACE, a double-wide Ethiopic font space, and the U+1361 ETHIOPIC
WORDSPACE character (which, for those on this list not familiar
with Ethiopic, looks like a square-dotted ":").

See, e.g.:

http://www.abyssiniacybergateway.net/mule/punct.html

http://www.ethiopians.com/daniel.html

So I think it is functionally much closer to a SPACE than a
hyphen, and I don't see a compelling argument for making it
an exception for Ethiopic punctuation in the inclusion
list. Unlike the geresh and gershayim for Hebrew, it isn't
an essential component needed to build words in recognizable
forms.
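
For anyone who wants to look at the characters under discussion, here
is a minimal Python sketch using the standard unicodedata module. The
helper name describe() is mine, purely for illustration. It only
prints names and general categories; it says nothing about how a
character is actually used in running text.

```python
import unicodedata

def describe(ch: str) -> tuple[str, str]:
    """Return (Unicode name, general category) for a single character."""
    return unicodedata.name(ch), unicodedata.category(ch)

# Ethiopic wordspace, SPACE, hyphen-minus, Hebrew geresh and gershayim
for ch in ("\u1361", "\u0020", "\u002D", "\u05F3", "\u05F4"):
    name, category = describe(ch)
    print(f"U+{ord(ch):04X}  {category}  {name}")
```

The general category only shows that U+1361 is encoded as
punctuation (Po) rather than as a dash (Pd) or a space (Zs). Its
word-separator function is a matter of usage, not something these
properties encode.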

Now I know Daniel Yacob has asked that Ethiopic ":" be
treated as a connector for identifiers, as it would be
a more naturally readable way for Ethiopians to string
together words for multiword identifiers, a la
the underscore in C: multiple_word_identifier_example.

But I think *that* discussion belongs in the realm of specialized
syntax extensions for programming languages, much the way
"_" is handled, for example. The fact that the underscore
"_" is commonly used in programming language identifiers
to string together words into single identifiers,
particularly for (formal) languages which don't distinguish
case (and hence make InterCaps impractical), doesn't
automatically mean that "_" gets carried forward into
constructing internet identifiers for Latin (or other
scripts). I think the argument is identical for Ethiopic ":",
and stronger, if anything, because that particular bit of
punctuation is confusable with an important syntax element
in URLs.

"-" is the only exceptional bit of punctuation that gets
carried forward, I think, and it has to be, simply because
of its prior use in ASCII-based domain names.
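
To make the "prior use" point concrete, here is a rough sketch of the
traditional ASCII letter-digit-hyphen label shape. The function name
is_ldh_label() and the regular expression are my own illustration,
assuming the usual 1 to 63 character limit and no leading or trailing
hyphen. It is not a full IDNA or hostname validator.

```python
import re

# Assumption: classic ASCII "letter-digit-hyphen" (LDH) label shape,
# 1 to 63 characters, not starting or ending with a hyphen.
_LDH_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")

def is_ldh_label(label: str) -> bool:
    """Return True if label fits the ASCII LDH pattern sketched above."""
    return _LDH_LABEL.fullmatch(label) is not None

print(is_ldh_label("multiple-word-label"))   # hyphen is allowed
print(is_ldh_label("multiple_word_label"))   # underscore is not
print(is_ldh_label("a\u1361b"))              # Ethiopic wordspace is not
```

Under this pattern the hyphen is the only punctuation mark accepted.
The underscore and U+1361 are rejected alike.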

--Ken


