Casefolding Sigma (was: Re: IDNAbis Preprocessing Draft)

John C Klensin klensin at jck.com
Tue Jan 22 19:42:22 CET 2008



--On Tuesday, 22 January, 2008 13:12 -0500 Andrew Sullivan
<ajs at crankycanuck.ca> wrote:

> On Tue, Jan 22, 2008 at 06:09:43PM +0100, JFCM wrote:
>> beyond the IETF scope. I am afraid Stephane only says what
>> all the  TLD Managers want.
> 
> For the record, I am not a TLD manager, and I happen to agree
> with Stephane.

Does that imply that you would prefer that Crankycanuck.ca not
match crankycanuck.ca and that CrankyCanuck.ca should be banned
entirely?  That the first two would not match is a direct
consequence of the statement Stephane made, and banning the
third is a corollary of some of the comments that have been made
about the orthographic and typographic importance of final sigma.

Note that Stephane's note appears to indicate that "the risk of
confusion between google.com and GOOGLE.com" was a "semantic
issue[s]" that ought to be beyond the scope of IETF and any
protocol work.  Is that really what you are agreeing with?   Of
course, if he really intended zeros in the 2nd and 3rd character
positions of the second name, that would be a different
comment... but also a red herring, since no one has proposed
getting the IETF or an IDN protocol into making those
distinctions.  

Of course, the first reading would also imply that
crankycanuck.CA should be resolved in a different zone than
crankycanuck.ca.  
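
For illustration only, the matching behavior at stake can be
sketched in a few lines of Python.  DNS comparison of ASCII
labels is already case-insensitive, which is why all of the
spellings above match today.  The helper names ascii_fold_label
and names_match are hypothetical, and the sketch folds only
ASCII A-Z; it is not a description of any resolver's code.

```python
def ascii_fold_label(label: str) -> str:
    # Fold only ASCII A-Z to a-z; every other character is left as-is.
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in label)


def names_match(a: str, b: str) -> bool:
    # Compare two dot-separated names label by label after ASCII folding.
    fold = lambda name: [ascii_fold_label(part) for part in name.split(".")]
    return fold(a) == fold(b)


print(names_match("Crankycanuck.ca", "crankycanuck.ca"))
print(names_match("crankycanuck.CA", "crankycanuck.ca"))
# Digits are not letters, so zeros in place of "o" are a different name.
print(names_match("g00gle.com", "google.com"))
```

Treating case as significant would amount to replacing
ascii_fold_label with the identity function, at which point the
first two comparisons would fail.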

By contrast, final form sigma is not about character confusion
in any way.  It is about:

	(i) Whether final form characters are fundamentally
	different characters than the base forms of the same
	characters.
	
	(ii) Whether it is necessary and desirable to encode
	typographic variations in the DNS for IDNs.   Note that
	a "yes" answer to this question puts one on a slippery
	slope toward needing to encode glyphs and fonts, rather
	than characters.
	
	(iii) What the general rules should be for presentation
	variations of characters that are normally
	position-sensitive and whether Greek final sigma is a
	sufficiently special case that it should be treated
	differently from all other final forms or
	context-sensitive presentation forms more generally.

Because the answers to those questions affect the way strings
are encoded into the DNS and/or how they are matched, I fail to
see how they can be handled at other than a protocol level.
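
As a rough illustration of why the answer changes matching, the
Python sketch below compares a Greek string written with final
sigma (U+03C2), with non-final sigma (U+03C3), and in capitals
(U+03A3).  It assumes that matching is done on the result of
Unicode case folding as exposed by Python's str.casefold(); the
function name match_key is hypothetical and this is not the
mapping defined by any IDNA draft.

```python
def match_key(label: str) -> str:
    # Unicode full case folding; maps both U+03A3 and U+03C2 to U+03C3.
    return label.casefold()


word_final = "\u03bf\u03b4\u03bf\u03c2"   # lowercase, ends in final sigma
word_medial = "\u03bf\u03b4\u03bf\u03c3"  # same letters, non-final sigma form
word_upper = "\u039f\u0394\u039f\u03a3"   # capitals; no final form exists

# With folding applied, all three spellings produce the same key.
print(match_key(word_final) == match_key(word_medial))
print(match_key(word_upper) == match_key(word_final))

# Without folding, the two lowercase spellings are distinct code point
# sequences and would be stored and matched as different strings.
print(word_final == word_medial)
```

Under this assumption the three spellings collapse to a single
key, so the final form cannot survive into what is stored or
matched; treating final sigma as a distinct character instead
means the two lowercase spellings name different things.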

    john



