Final Sigma (was: RE: Esszett, Final Sigma, ZWJ and ZWNJ)

Mark Davis mark at macchiato.com
Wed Feb 25 18:29:40 CET 2009


For the final sigma, I'd like to get a bit clearer on what your position is.
I see the following alternatives.

   1. Map ς and Σ to σ *before* converting *to* punycode (= IDNA2003)
   2. Option 1, plus map final σ to ς *after* converting *from* punycode.
   3. Map Σ to σ before converting to punycode; leave ς alone (= current
   IDNA2008 draft).

Which of these do you think would be best, and why? (Or do you have another
suggested alternative?)
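
To make the comparison concrete, here is a rough Python sketch of the three alternatives. The function names are invented for illustration, str.casefold() is only a stand-in for the full IDNA2003 nameprep tables, and the rule for a "trailing" sigma (a σ after a letter, at the end of a label or before a hyphen) is an assumption about how option 2 might be specified, not settled text.

```python
import re
import unicodedata

ACE_PREFIX = "xn--"

def to_a_label(mapped_label: str) -> str:
    """Punycode-encode an already mapped, non-ASCII label and add the xn-- prefix."""
    return ACE_PREFIX + mapped_label.encode("punycode").decode("ascii")

def from_a_label(a_label: str) -> str:
    """Strip the xn-- prefix and decode the punycode back to Unicode."""
    return a_label[len(ACE_PREFIX):].encode("ascii").decode("punycode")

def map_option1(label: str) -> str:
    """Alternative 1: fold both capital sigma and final sigma to σ before encoding.
    casefold() is a simplification of the IDNA2003 mapping step."""
    return unicodedata.normalize("NFKC", label.casefold())

def display_option2(label: str) -> str:
    """Alternative 2 (display side): after decoding, show a trailing σ as ς.
    Assumed rule: σ preceded by a letter and followed by a hyphen or end of label."""
    return re.sub(r"(?<=[^\W\d_])σ(?=-|$)", "ς", label)

def map_option3(label: str) -> str:
    """Alternative 3: lowercase one character at a time, so a capital sigma
    becomes σ and an existing ς is left alone."""
    return unicodedata.normalize("NFC", "".join(ch.lower() for ch in label))

for typed in ("χρήσης", "ΧΡΉΣΗΣ"):
    print(typed, to_a_label(map_option1(typed)), to_a_label(map_option3(typed)))

print(display_option2(from_a_label("xn--jxas2ajbt")))
```

Under option 1 both spellings encode to xn--jxas2ajbt; under option 3 the two inputs produce different A-labels, which is the case that needs bundling. The display step of option 2 only restores a final sigma at the end of a label or before a hyphen, so a run-together name such as ευρείασχρήσης would still show a medial σ.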

For clarity, I think we need to separate out the accent issue, so I'll
change the subject for that part.

Mark


On Wed, Feb 25, 2009 at 03:02, Vaggelis Segredakis <segred at ics.forth.gr> wrote:

>  Dear Mark and Tina,
>
>
>
> The original IDNA2003 mapping has made life easier for us on the final
> sigma -> sigma issue, but the example Mark presented brings up another
> very big problem we have faced with that version: in Greek you never put an
> accent mark (tonos) on a word consisting only of capital letters. The correct
> uppercase for χρήσης.gr (xn--jxas2ajbt.gr) is ΧΡΗΣΗΣ.gr (xn--sxaa2ajbt.gr)
> and not ΧΡΉΣΗΣ.gr, which IDNA2003 accepted as the only equivalent.
>
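
The accent problem described here can be reproduced with Python's built-in "idna" codec, which implements IDNA2003 (RFC 3490). This is only a sketch: greek_upper is a hypothetical helper that drops the tonos by removing the combining acute accent, and it ignores other details of Greek uppercasing such as the dialytika.

```python
import unicodedata

COMBINING_ACUTE = "\u0301"  # the tonos decomposes to this combining mark under NFD

def greek_upper(word: str) -> str:
    """Hypothetical helper: uppercase a Greek word and drop the tonos."""
    decomposed = unicodedata.normalize("NFD", word)
    without_tonos = decomposed.replace(COMBINING_ACUTE, "")
    return unicodedata.normalize("NFC", without_tonos).upper()

def idna2003(domain: str) -> str:
    """ToASCII via Python's built-in "idna" codec (IDNA2003)."""
    return domain.encode("idna").decode("ascii")

print(idna2003("χρήσης.gr"))                        # xn--jxas2ajbt.gr
print(idna2003("ΧΡΉΣΗΣ.gr"))                        # xn--jxas2ajbt.gr
print(idna2003(greek_upper("χρήσης") + ".gr"))      # xn--sxaa2ajbt.gr
```

The accented capitals fold to the same A-label as the lowercase name, while the unaccented capitals that Greek readers would actually write end up as a different label.
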
>
>
> We started then and there to use bundling options to bundle DNS tags so
> that they work the way our language is normally used, when it should have
> been the other way round: IDNA tags should be able to represent languages as
> they are used, as already happens for Latin-character languages.
>
>
>
> I would welcome a solution that takes this second issue into account as
> well and further simplifies life for Greek users, who get a poor experience
> of IDNs. We have already had a meeting with our Telecommunications regulator,
> our Government and the .CY registry, and we tried to reach a common position
> on this new solution of representing the final sigma as a separate
> character. The results of this meeting are pending, but from my understanding
> a more global solution to the issues that haunt Greek IDNs would be
> more welcome than patches on a problematic protocol.
>
>
>
> My belief is that if a broader solution would be welcomed by this working
> group, our LIC would be interested in participating in a broad public
> discussion to reach consensus on how we wish our IDNs to operate. The question
> is whether this WG is ready to bend some rules and change some earlier
> decisions, because it looks as though xn-- might be a thing of the past soon.
>
>
>
> Vaggelis
>
>
>  ------------------------------
>
> *From:* mark.edward.davis at gmail.com [mailto:mark.edward.davis at gmail.com] *On
> Behalf Of *Mark Davis
> *Sent:* Wednesday, February 25, 2009 1:18 AM
> *To:* Tina Dam
> *Cc:* Vaggelis Segredakis; idna-update at alvestrand.no; Vint Cerf; Sotiris
> Panaretou; Panagiotis Papaspiliopoulos; Euripides Zervanos
> *Subject:* Re: Final Sigma (was: RE: Esszett, Final Sigma, ZWJ and ZWNJ)
>
>
>
> The original IDNA2003 mapping was chosen for a purpose: it allows
> χρήσης.gr <http://xn--jxas2ajbt.gr> and ΧΡΉΣΗΣ.gr to both go to the same
> page, without requiring bundling. (Note the two different kinds of lowercase
> sigmas.)
>
> I still think a better approach would be to retain the mapping for
> compatibility, but specify that when converting back from punycode, trailing
> sigmas be transformed into final sigmas. For example, in the address bar you
> could type ΧΡΉΣΗΣ.gr, and when you went to the page you'd see
> χρήσης.gr <http://xn--jxas2ajbt.gr> in the address bar.
>
> The only downside I can see is that it would encourage Greek domain names
> to use interior hyphens where necessary to get the sigma right. So you would
> want to register
>
> ευρείας-χρήσης.gr <http://xn----tlbbisas8eesdbp8a.gr>
>   instead of
> ευρείασχρήσης.gr <http://xn--jxas2ajbt.gr>
>
> But that's not a big downside compared with the alternatives.
>
> Mark
>
>  On Tue, Feb 24, 2009 at 14:34, Tina Dam <tina.dam at icann.org> wrote:
>
> Vaggelis,
>
> I totally understand the frustration and concern that you are expressing. I
> am wondering, though, whether it is not better to get this corrected now, so
> that the Greek script/language functions correctly on the Internet and in
> domain names, than to keep this half solution, which only gets worse as the
> volume of registered domain names grows. That applies
> both under .GR and under other TLDs that might introduce Greek
> characters (.CY is the most natural existing TLD that comes to mind in
> addition to .GR, but of course also gTLDs, and even more importantly as we
> move to IDN TLDs).
>
>
>
> As far as I see things, this is not a matter of mappings or no mappings;
> in the case of the final sigma it is a matter of a wrong decision made in
> 2003, which makes
>
>     U+03A3 GREEK CAPITAL LETTER SIGMA
>
> always map to
>
>     U+03C3 GREEK SMALL LETTER SIGMA
>
> when in fact (as you and your colleagues are well aware, and as you express
> below) it often should be mapped to
>
>     U+03C2 GREEK SMALL LETTER FINAL SIGMA
>
> In other words, the mapping of the capital sigma is neither one-to-one nor a
> global solution in the way that, for example, the mapping of capital “A” to
> lower-case “a” is, and hence this sigma mapping should never have been
> introduced in the protocol in the first place.
>
>
>
> About solutions... I am wondering if you are going to be at the Mexico
> meeting this coming week and, if so, whether we can find a good time to
> chat further about it? (That would be with my IDN hat on and ICANN hat off,
> since ICANN of course has nothing to do with your policies.)
>
>
>
> Tina
>
>
>
>
>
>
>
> *From:* idna-update-bounces at alvestrand.no [mailto:
> idna-update-bounces at alvestrand.no] *On Behalf Of *Vaggelis Segredakis
> *Sent:* Tuesday, February 24, 2009 2:41 AM
> *To:* idna-update at alvestrand.no; 'Vint Cerf'
> *Cc:* 'Euripides Zervanos'; 'Panagiotis Papaspiliopoulos'; 'Sotiris
> Panaretou'
> *Subject:* Re: Esszett, Final Sigma, ZWJ and ZWNJ
>
>
>
> Dear Vint,
>
>
>
> I would love to say that we as the .gr Registry are enthusiastic about the
> proposed solution (PVALID Final Sigma) but in reality we are quite
> skeptical. I can clearly see the advantages of the use of a distinct final
> sigma. The reality however is that the change is significant and the
> registry will have to take measures to reduce the impact.
>
>
>
> It will be necessary for us (and I believe anyone who uses Esszett as well)
> to “map” the two versions of the domain names ourselves to overcome the fact
> that browsers and software do not change overnight and IDNA2003 and IDNA2008
> are incompatible.
>
>
>
> In Greek, a lowercase word that ends with a final sigma gets a normal
> capital sigma in place of that final sigma when typed in capital letters.
> Although you have prohibited capital letters in IDNA2008, any browser
> programmer will try to translate a URL typed in capitals letter by letter.
> Most likely he will then translate a capital sigma to sigma and
> not final sigma, regardless of its position in the word. Why would a
> programmer try to learn Greek grammar?
>
>
>
> For each final sigma in a domain name, the registrant will have to register
> a variant with a regular sigma in that position as well, plus every variant
> that arises when a domain name contains more than one final sigma. For 2 final
> sigmas you will have 4 variants. If you add to this the tonos (accent mark)
> issue (it is not used in capital letters, which gives us two
> variants for each word), you end up with sixteen variants for a
> single domain name with two final sigmas (two words)!
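
The count of sixteen can be checked by enumeration. A minimal Python sketch, assuming each word can independently appear with or without its tonos and with either kind of sigma in final position; the helper names are illustrative and not part of any registry tooling.

```python
from itertools import product
import unicodedata

FINAL_SIGMA = "\u03c2"      # ς
SIGMA = "\u03c3"            # σ
COMBINING_ACUTE = "\u0301"  # carries the tonos after NFD decomposition

def strip_tonos(word):
    """Remove the tonos by dropping the combining acute accent."""
    decomposed = unicodedata.normalize("NFD", word)
    return unicodedata.normalize("NFC", decomposed.replace(COMBINING_ACUTE, ""))

def word_variants(word):
    """All spellings of one word: with/without tonos, final/regular sigma."""
    forms = {word, strip_tonos(word)}
    if word.endswith(FINAL_SIGMA):
        forms |= {form[:-1] + SIGMA for form in forms}
    return sorted(forms)

def name_variants(name, separator="-"):
    """Combine the per-word variants of a hyphen-separated label."""
    per_word = [word_variants(word) for word in name.split(separator)]
    return [separator.join(combo) for combo in product(*per_word)]

print(len(name_variants("χρήσης")))           # 4
print(len(name_variants("ευρείας-χρήσης")))   # 16
```

Each accented word ending in a final sigma contributes four spellings, so the number of labels to bundle grows as 4 to the power of the number of such words.
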
>
>
>
> We already do bundling of the domain names. We will probably do it in the
> future, especially if this proposed solution moves forward. If you have any
> other alternatives though that could shed some new light on these issues,
> this might be a good time to start discussing them. Even if this means a
> best practice document or IDNAv2_2009, anything should be open to
> discussion.
>
>
>
> Best Regards,
>
>
>
> Vaggelis Segredakis
>
> Administrator of the .GR Top Level Domain
>
> Institute of Computer Science
>
> Foundation for Research and Technology - Hellas
>
> Tel. +30-281-0391450
>
> Fax +30-281-0391451
>
> Email segred at ics.forth.gr
>
>
>
>
>
>
>
>
>
>
>
> Message: 3
> Date: Mon, 23 Feb 2009 20:14:04 -0500
> From: Vint Cerf <vint at google.com>
> Subject: Re: Esszett, Final Sigma, ZWJ and ZWNJ
> To: Mark Davis <mark at macchiato.com>
> Cc: Paul Hoffman <phoffman at imc.org>, Andrew Sullivan <ajs at shinkuro.com>,
>     idna-update at alvestrand.no, John C Klensin <klensin at jck.com>
> Message-ID: <2C4BC1C5-3B45-46FA-AA6D-9A60D3C72B35 at google.com>
> Content-Type: text/plain; charset="utf-8"
>
>
>
> Mark,
>
>
>
> thanks - I think what left me in an ambiguous state was the term "bits on
> the wire".  In your example, under the IDNA2003 mapping process, the final
> sigma is mapped into ordinary sigma and THEN the resulting string is looked
> up (after conversion to xn-- format using the punycode algorithm). The two
> forms become identical prior to lookup.
>
> Under the proposed IDNA2008 rules, the two strings remain distinct in both
> the U-label and A-label formats and thus look "different" on the wire;
> unless other measures are taken (bundling, restricted registration, etc.), it
> is possible for the two domains to yield distinct results on lookup.
>
>
>
> Paul - is that the picture you wanted to paint?
>
>
>
> sorry to be slow to see which bits you were comparing.
>
>
>
> v
>
>
>
>
>
> Vint Cerf
>
> Google
>
> 1818 Library Street, Suite 400
>
> Reston, VA 20190
>
> 202-370-5637
>
> vint at google.com
>
>
>
>
> _______________________________________________
> Idna-update mailing list
> Idna-update at alvestrand.no
> http://www.alvestrand.no/mailman/listinfo/idna-update