Moving Right Along on the Inclusions Table...
Kenneth Whistler
kenw at sybase.com
Thu Dec 21 20:27:36 CET 2006
> >So I think it is functionally much closer to a SPACE than a
> >hyphen, and I don't see a compelling argument for making it
> >an exception for Ethiopic punctuation in the inclusion
> >list.
>
> See above, you misunderstood me.
No, I don't think I did at all.
>
> >Unlike the geresh and gershayim for Hebrew, it isn't an essential
> >component needed to build words in recognizable forms.
>
> They have something that naturally does the job
And isn't needed to build words in recognizable forms.
Internet identifiers don't *require* word separators, and
if anything the predominant use of the existing "-" has
been to cause trouble and spoofing, rather than to
"do the job".
> >Now I know Daniel Yacob has asked that Ethiopian ":" be
> >treated as a connector for identifiers, as it would be
> >a more naturally readable way for Ethiopians to string
> >together words for multiword identifiers, a la
> >the underscore in C: multiple_word_identifier_example.
>
> If you think the underscore is a natural thing.
Utterly beside the point. There is nothing "natural" about
it -- it is an artificial convention used by formal
programming languages to define identifiers, in contexts
where SPACE is a formal delimiter in the syntax.
> The point is that the
> ETHIOPIC WORDSPACE (which is not the COLON character) is already
> there in Ethiopic script. Why ask them to use "-" or indeed "_"? They
> have something already that does the job.
Now it is you who misunderstand.
>
> >But I think *that* discussion belongs in the realm of specialized
> >syntax extensions for programming languages, much the way
> >"_" is handled, for example.
>
> This is wayyyyy out in left field, and has nothing to do with
> ETHIOPIC WORDSPACE.
Not at all out in left field. It is precisely and appropriately
focussed on the area where it could make a difference. In
Perl or some other programming language context, if "_" doesn't
seem appropriate for stringing together long Ethiopic multiword
identifiers, an addition to the syntax which treats Ethiopic
":" as a parallel connector could work fine.
It isn't needed for *internet identifiers*, however.
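
As a rough illustration of the kind of language-level syntax extension described above, here is a minimal Python sketch of an identifier pattern that accepts ETHIOPIC WORDSPACE (U+1361) as a connector between words, alongside the usual "_". The names IDENT and is_identifier are hypothetical, and the pattern is an assumption made for illustration; it is not taken from Perl or from any existing language specification.

```python
import re

# Hypothetical identifier syntax: a word that does not start with a digit,
# optionally followed by more words joined by ETHIOPIC WORDSPACE (U+1361).
# "_" is already covered by \w, so it keeps working as a connector too.
IDENT = re.compile(r"[^\W\d]\w*(?:\u1361\w+)*")

def is_identifier(text):
    """Return True if text is an identifier under this sketch's rules."""
    return IDENT.fullmatch(text) is not None

# Two Ethiopic words joined by U+1361, written with escapes for clarity.
ETHIOPIC_EXAMPLE = "\u1230\u120b\u121d\u1361\u12d3\u1208\u121d"

if __name__ == "__main__":
    print(is_identifier("multiple_word_identifier_example"))
    print(is_identifier(ETHIOPIC_EXAMPLE))
    print(is_identifier("a:b"))  # ASCII colon is not a connector here
```

The connector is only accepted between words, so a leading or trailing U+1361 is rejected, and the ASCII colon is never treated as equivalent to it.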
> >I think the argument is identical for Ethiopic ":", and stronger, if
> >anything, because that particular bit of punctuation is confusable
> >with an important syntax element in URLs.
>
> I can't see how this could cause any actual difficulty. The colon
> syntax element occurs in only one position, as in http:// or ftp://
> and if someone accidentally put an ETHIOPIC WORDSPACE in there the
> only thing that would happen is that the browser wouldn't go anywhere.
You are wrong about this. See, for example:
http://www.adobe.com/cfusion/knowledgebase/index.cfm?id=tn_16715
which points out user confusions that result from the fact
that Mac OS X uses ":" as the directory separator.
Allowing a ":" lookalike into the inclusion set for StringPrep
(which would not *only* be used for NamePrep and domain names,
by the way) is just asking for bad guys to come looking for
ways to exploit its visual similarity to ":", especially since
both usages would be related to syntactic separation, and
users would not have any clear way to distinguish the subtleties
here.
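
To make the lookalike concern concrete, here is a small Python sketch using the standard unicodedata module. It shows that ETHIOPIC WORDSPACE (U+1361) and COLON (U+003A) are distinct code points in the same general category, and that NFKC normalization does not fold one into the other. The helper names describe and contains_colon_lookalike are hypothetical, and the check covers only this one character; it is not a general confusables detector.

```python
import unicodedata

ETHIOPIC_WORDSPACE = "\u1361"  # the character under discussion
ASCII_COLON = ":"              # U+003A, the syntax element it resembles

def describe(ch):
    """Return (code point, Unicode name, general category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))

def contains_colon_lookalike(label):
    """Hypothetical check: flag labels containing ETHIOPIC WORDSPACE."""
    return ETHIOPIC_WORDSPACE in label

def survives_nfkc(ch):
    """True if NFKC normalization leaves the character unchanged."""
    return unicodedata.normalize("NFKC", ch) == ch

if __name__ == "__main__":
    print(describe(ETHIOPIC_WORDSPACE))
    print(describe(ASCII_COLON))
    print(survives_nfkc(ETHIOPIC_WORDSPACE))
    print(contains_colon_lookalike("http" + ETHIOPIC_WORDSPACE + "//example"))
```

Since normalization alone does not remove the character, keeping it out of identifiers would have to come from the inclusion list itself.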
> >"-" is the only exceptional bit of punctuation that gets carried
> >forward, I think, and it has to be simply because of prior use in
> >ASCII-based domain names.
>
> And the question of whether it is right to force that on Ethiopic
> which has its own delimiter is one which I think it is legitimate to
> ask.
And is it right to force confusion on IDNA for a common
syntax element to allow emulation of a word separation
convention in Ethiopic which is being dropped even in
languages using the Ethiopic script?
--Ken