Moving Right Along on the Inclusions Table...

Kenneth Whistler kenw at sybase.com
Thu Dec 21 20:27:36 CET 2006


> >So I think it is functionally much closer to a SPACE than a
> >hyphen, and I don't see a compelling argument for making it
> >an exception for Ethiopic punctuation in the inclusion
> >list.
> 
> See above, you misunderstood me.

No, I don't think I did at all.

> 
> >Unlike the geresh and gershayim for Hebrew, it isn't an essential 
> >component needed to build words in recognizable forms.
> 
> They have something that naturally does the job

And isn't needed to build words in recognizable forms.

Internet identifiers don't *require* word separators, and
if anything the predominant use of the existing "-" has
been to cause trouble and spoofing, rather than to
"do the job".

> >Now I know Daniel Yacob has asked that Ethiopian ":" be
> >treated as a connector for identifiers, as it would be
> >a more naturally readable way for Ethiopians to string
> >together words for multiword identifiers, a la
> >the underscore in C: multiple_word_identifier_example.
> 
> If you think the underscore is a natural thing.

Utterly beside the point. There is nothing "natural" about
it -- it is an artificial convention used by formal
programming languages to define identifiers, in contexts
where SPACE is a formal delimiter in the syntax.

> The point is that the 
> ETHIOPIC WORDSPACE (which is not the COLON character) is already 
> there in Ethiopic script. Why ask them to use "-" or indeed "_"? They 
> have something already that does the job.

Now it is you who misunderstand.

> 
> >But I think *that* discussion belongs in the realm of specialized
> >syntax extensions for programming languages, much the way
> >"_" is handled, for example.
> 
> This is wayyyyy out in left field, and has nothing to do with 
> ETHIOPIC WORDSPACE.

Not at all out in left field. It is precisely and appropriately
focussed on the area where it could make a difference. In
Perl or some other programming language context, if "_" doesn't
seem appropriate for stringing together long Ethiopic multiword
identifiers, an addition to the syntax which treats Ethiopic
":" as a parallel connector could work fine.

It isn't needed for *internet identifiers*, however.

> >I think the argument is identical for Ethiopic ":", and stronger, if 
> >anything, because that particular bit of punctuation is confusable 
> >with an important syntax element in URLs.
> 
> I can't see how this could cause any actual difficulty. The colon 
> syntax element occurs in only one position, as in http:// or ftp:// 
> and if someone accidentally put an ETHIOPIC WORDSPACE in there the 
> only thing that would happen is that the browser wouldn't go anywhere.

You are wrong about this. See, for example:

http://www.adobe.com/cfusion/knowledgebase/index.cfm?id=tn_16715

which points out user confusions that result from the fact
that Mac OS X uses ":" as the directory separator.

Allowing a ":" lookalike into the inclusion set for StringPrep
(which would not *only* be used for NamePrep and domain names,
by the way), is just asking for bad guys to come looking for
ways to exploit its visual similarity to ":", especially since
both usages would be related to syntactic separation, and
users would not have any clear way to distinguish the subtleties
here.
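
To make the confusability concrete, here is a minimal Python sketch. It
is an illustration only, not part of StringPrep or NamePrep: the helper
names (describe, punctuation_in) and the sample label are hypothetical.
It relies only on the standard unicodedata module to show that
U+003A COLON and U+1361 ETHIOPIC WORDSPACE are distinct code points
that carry the same general category, so software can tell them apart
even where a reader looking at rendered text may not.

```python
import unicodedata

COLON = "\u003A"               # ASCII colon, a syntax element in URLs
ETHIOPIC_WORDSPACE = "\u1361"  # Ethiopic word separator, visually similar to ":"


def describe(ch):
    """Return (code point, Unicode name, general category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))


def punctuation_in(label):
    """Hypothetical helper: list every punctuation character found in a label."""
    return [describe(ch) for ch in label
            if unicodedata.category(ch).startswith("P")]


if __name__ == "__main__":
    for ch in (COLON, ETHIOPIC_WORDSPACE):
        print(describe(ch))
    # A made-up label containing the Ethiopic wordspace between two parts.
    print(punctuation_in("ab" + ETHIOPIC_WORDSPACE + "cd"))
```

Both characters report general category Po (other punctuation), so a
check of this kind has to compare code points rather than categories.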

> >"-" is the only exceptional bit of punctuation that gets carried 
> >forward, I think, and it has to be simply because of prior use in 
> >ASCII-based domain names.
> 
> And the question of whether it is right to force that on Ethiopic 
> which has its own delimiter is one which I think it is legitimate to 
> ask.

And is it right to force confusion on IDNA for a common
syntax element to allow emulation of a word separation
convention in Ethiopic which is being dropped even in
languages using the Ethiopic script?

--Ken


