Moving Right Along on the Inclusions Table...

Michael Everson everson at evertype.com
Thu Dec 21 21:07:48 CET 2006


At 11:27 -0800 2006-12-21, Kenneth Whistler wrote:

>>  >I think the argument is identical for Ethiopic ":", and stronger, if
>>  >anything, because that particular bit of punctuation is confusable
>>  >with an important syntax element in URLs.
>>
>>  I can't see how this could cause any actual difficulty. The colon
>>  syntax element occurs in only one position, as in http:// or ftp://
>>  and if someone accidentally put an ETHIOPIC WORDSPACE in there the
>>  only thing that would happen is that the browser wouldn't go anywhere.
>
>You are wrong about this. See, for example:
>
>http://www.adobe.com/cfusion/knowledgebase/index.cfm?id=tn_16715
>
>which points out user confusions that result from the fact
>that the Mac OS X uses ":" as the directory separator.

What has that to do with URL syntax?

>Allowing a ":" lookalike into the inclusion set for StringPrep 
>(which would not *only* be used for NamePrep and domain names, by 
>the way), is just asking for bad guys to come looking for ways to 
>exploit its visual similarity to ":", especially since both usages 
>would be related to syntactic separation, and
>users would not have any clear way to distinguish the subtleties here.

Can you explain a scenario where it could be exploited in "http://" 
or similar strings? How is this any different from the many sets of 
digits 0123456789 we have in different scripts?

>>  >"-" is the only exceptional bit of punctuation that gets carried
>>  >forward, I think, and it has to be simply because of prior use in
>>  >ASCII-based domain names.
>>
>>  And the question of whether it is right to force that on Ethiopic
>>  which has its own delimiter is one which I think it is legitimate to
>>  ask.
>
>And is it right to force confusion on IDNA for a common syntax 
>element to allow emulation of a word separation convention in 
>Ethiopic which is being dropped even in languages using the Ethiopic 
>script?

That's not an answer to my question, and if there is no script 
mixing, you won't be ABLE to use the Ethiopic character except 
between two OTHER Ethiopic characters. Surely it is possible for 
software to ensure this in a valid identifier.
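
A minimal sketch of the kind of check meant here, in Python. It is an
illustration only, not anything taken from StringPrep or NamePrep: the
function names are hypothetical, and "Ethiopic character" is approximated
as any code point whose Unicode name begins with "ETHIOPIC SYLLABLE".

```python
import unicodedata

# U+1361 ETHIOPIC WORDSPACE: resembles ":" but is a distinct code point.
ETHIOPIC_WORDSPACE = "\u1361"


def is_ethiopic_syllable(ch: str) -> bool:
    # Assumption: the Unicode character name is used as a stand-in for a
    # real script property, which the standard library does not expose.
    return unicodedata.name(ch, "").startswith("ETHIOPIC SYLLABLE")


def wordspace_context_ok(label: str) -> bool:
    """Hypothetical rule: each U+1361 must sit between two Ethiopic syllables."""
    for i, ch in enumerate(label):
        if ch != ETHIOPIC_WORDSPACE:
            continue
        # A wordspace at either end has no neighbour on one side.
        if i == 0 or i == len(label) - 1:
            return False
        if not (is_ethiopic_syllable(label[i - 1])
                and is_ethiopic_syllable(label[i + 1])):
            return False
    return True
```

Under this rule a string such as "http" + U+1361 + "//" is rejected,
because the neighbours of the wordspace are not Ethiopic, while a label
made of Ethiopic syllables with a wordspace between them is accepted.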
-- 
Michael Everson * http://www.evertype.com

