Esszett, Final Sigma, ZWJ and ZWNJ

Vaggelis Segredakis segred at ics.forth.gr
Tue Feb 24 11:41:15 CET 2009


Dear Vint,

 

I would love to say that we, as the .gr Registry, are enthusiastic about the
proposed solution (PVALID Final Sigma), but in reality we are quite
skeptical. I can clearly see the advantages of a distinct final sigma. The
reality, however, is that the change is significant and the registry will
have to take measures to reduce its impact.

 

It will be necessary for us (and, I believe, for anyone who uses Esszett as
well) to "map" the two versions of the domain names ourselves, because
browsers and other software do not change overnight and IDNA2003 and
IDNA2008 are incompatible.

 

In Greek, a word that ends with a final sigma in lowercase takes a normal
capital sigma in that position when typed in capital letters. Although you
have prohibited capital letters in IDNA2008, any browser programmer will try
to convert a URL typed in capitals letter by letter. Most probably he will
then convert a capital Sigma to sigma and not final sigma, regardless of its
position in the word. Why would a programmer try to learn Greek grammar?
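
The concern can be illustrated with a minimal Python sketch. It assumes a
hypothetical helper, naive_lower, standing in for software that lowercases a
typed name one character at a time; it is not taken from any actual browser.
For comparison it also calls Python's built-in str.lower(), which in CPython
3.3 and later applies the Unicode Final_Sigma context rule to the whole
string.

```python
def naive_lower(text):
    """Lowercase one character at a time, with no knowledge of word position.

    Hypothetical helper: it models a letter-by-letter case conversion.
    """
    return "".join(ch.lower() for ch in text)


word = "ΟΔΟΣ"  # "road" in capitals; the last letter is capital sigma (U+03A3)

naive = naive_lower(word)  # every capital sigma becomes ordinary sigma (U+03C3)
aware = word.lower()       # whole-string lowercasing can see the word ending

print(naive, [hex(ord(ch)) for ch in naive])
print(aware, [hex(ord(ch)) for ch in aware])
```

The two results differ only in the last code point, which is exactly the
pair of names a registry would have to treat as the same.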

 

For each final sigma in a domain name, the registrant will have to register
a variant with an ordinary (non-final) sigma in that position as well, plus
every combination that arises when the domain name contains more than one
final sigma. For 2 final sigmas you will have 4 variants. If you add to this
the tonos (accent mark) issue (it is not used in capital letters, which gives
two variants for each accented word), you end up with sixteen variants for a
single domain name with two final sigmas (two words)!
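
The count can be checked with a small Python sketch. The helper names
alternatives and variants, and the sample label, are assumptions made for
illustration and do not describe any registry's actual bundling policy. Each
final sigma and each vowel carrying a tonos is treated as an independent
two-way choice.

```python
import unicodedata
from itertools import product

FINAL_SIGMA = "\u03c2"  # ς
SIGMA = "\u03c3"        # σ
TONOS = "\u0301"        # combining acute accent; NFD represents the tonos this way


def alternatives(ch):
    """Return the spellings of one character that should be bundled together."""
    if ch == FINAL_SIGMA:
        return [FINAL_SIGMA, SIGMA]
    decomposed = unicodedata.normalize("NFD", ch)
    if TONOS in decomposed:
        # Drop the tonos and recompose whatever is left.
        bare = unicodedata.normalize("NFC", decomposed.replace(TONOS, ""))
        return [ch, bare]
    return [ch]


def variants(label):
    """Enumerate every combination of the per-character alternatives."""
    label = unicodedata.normalize("NFC", label)
    return ["".join(combo) for combo in product(*(alternatives(ch) for ch in label))]


# Hypothetical two-word label: two final sigmas and two accented vowels.
sample = "δρόμοςκήπος"
bundle = variants(sample)
print(len(bundle))
for name in bundle:
    print(name)
```

With n independent choice points the bundle has 2**n members, so the sample
label above, with four such points, yields 2**4 names.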

 

We already bundle domain names, and we will probably continue to do so in
the future, especially if this proposed solution moves forward. If you have
any other alternatives that could shed some new light on these issues,
though, this might be a good time to start discussing them. Even if this
means a best practice document or IDNAv2_2009, anything should be open to
discussion.

 

Best Regards,

 

Vaggelis Segredakis
Administrator of the .GR Top Level Domain
Institute of Computer Science
Foundation for Research and Technology - Hellas
Tel. +30-281-0391450
Fax +30-281-0391451
Email segred at ics.forth.gr

Message: 3
Date: Mon, 23 Feb 2009 20:14:04 -0500
From: Vint Cerf <vint at google.com>
Subject: Re: Esszett, Final Sigma, ZWJ and ZWNJ
To: Mark Davis <mark at macchiato.com>
Cc: Paul Hoffman <phoffman at imc.org>, Andrew Sullivan
            <ajs at shinkuro.com>, idna-update at alvestrand.no, John C Klensin
            <klensin at jck.com>
Message-ID: <2C4BC1C5-3B45-46FA-AA6D-9A60D3C72B35 at google.com>
Content-Type: text/plain; charset="utf-8"

 

Mark,

 

thanks - I think what left me in an ambiguous state was the term "bits on
the wire". In your example, under the IDNA2003 mapping process, the final
sigma is mapped into ordinary sigma and THEN the resulting string is looked
up (after conversion to xn-- format using the punycode algorithm). The two
forms become identical prior to lookup.

Under the proposed IDNA2008 rules, the two strings remain distinct in both
the U-label and A-label formats and thus look "different" on the wire.
Unless other measures are taken (bundling, restricted registration, etc.),
it is possible for the two domains to yield distinct results on lookup.
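
A rough Python sketch of the difference follows. It relies on Python's
built-in "idna" codec, which implements the IDNA2003 rules (RFC 3490 with
Nameprep), and on the built-in "punycode" codec. The function
unmapped_alabel is a hypothetical stand-in for the proposed IDNA2008
behaviour: it only skips the mapping step and performs none of the validity
checks a real IDNA2008 implementation would.

```python
def idna2003_alabel(label):
    """Encode one label with Python's built-in IDNA2003 codec (maps, then encodes)."""
    return label.encode("idna").decode("ascii")


def unmapped_alabel(label):
    """Punycode the label exactly as given and add the ACE prefix.

    Hypothetical stand-in for a scheme in which final sigma is kept as is.
    """
    return "xn--" + label.encode("punycode").decode("ascii")


with_final = "οδο\u03c2"  # ends with final sigma
with_plain = "οδο\u03c3"  # ends with ordinary sigma

# With the IDNA2003 mapping, both spellings collapse to one string on the wire.
print(idna2003_alabel(with_final), idna2003_alabel(with_plain))

# Without the mapping, the two spellings stay distinct on the wire.
print(unmapped_alabel(with_final), unmapped_alabel(with_plain))
```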

 

Paul - is that the picture you wanted to paint?

 

sorry to be slow to see which bits you were comparing.

 

v

 

 

Vint Cerf
Google
1818 Library Street, Suite 400
Reston, VA 20190
202-370-5637
vint at google.com

 
