It is not quite as simple as you say, because of multiple words. The rules for when to change a sigma C into final sigma are in <a href="http://www.unicode.org/versions/Unicode5.0.0/ch03.pdf#G33992" target="_blank">3.13 Default Case Algorithms
</a>, Table 3-14, p. 124.<br><br><div style="margin-left: 40px;">C is preceded by a sequence consisting of a cased letter and a case-ignorable sequence, and C is not followed by a sequence consisting of a case ignorable sequence and then a cased letter.<br></div><br>Because the IDN is in NFC, the above formulation can be simplified by dropping the 'case ignorable sequence' if we are restricted to normal modern Greek.<br><br>
<div style="margin-left: 40px;">
C is preceded by a cased letter, and C is not followed by a cased letter.<br></div><br>That page also gives the exact definitions of 'cased letter' and 'case-ignorable sequence', which are derived from standard Unicode properties.
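<br><br>As a rough illustration, the simplified rule could be sketched in Python as follows. This is only a sketch under stated assumptions: 'cased letter' is approximated by the general categories Lu, Ll and Lt, which is narrower than the definition in 3.13, and the names is_cased and display_final_sigma are made up for the example.

```python
import unicodedata

SIGMA = "\u03c3"        # GREEK SMALL LETTER SIGMA
FINAL_SIGMA = "\u03c2"  # GREEK SMALL LETTER FINAL SIGMA


def is_cased(ch):
    # Assumption: approximate "cased letter" by general category only.
    # The definition in 3.13 also covers Other_Lowercase/Other_Uppercase.
    return unicodedata.category(ch) in ("Lu", "Ll", "Lt")


def display_final_sigma(text):
    """Apply the simplified rule: sigma becomes final sigma when it is
    preceded by a cased letter and not followed by a cased letter."""
    out = []
    for i, ch in enumerate(text):
        if ch == SIGMA:
            preceded = i > 0 and is_cased(text[i - 1])
            followed = i + 1 < len(text) and is_cased(text[i + 1])
            if preceded and not followed:
                ch = FINAL_SIGMA
        out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    # Hyphenated name: both word-final sigmas are converted.
    print(display_final_sigma("χαρακτήρεσ-αντιστοιχώντασ.com"))
    # Run-together name: only the last sigma of the label is converted.
    print(display_final_sigma("χαρακτήρεσαντιστοιχώντασ.com"))
```

In this sketch a hyphen or dot counts as "not a cased letter", so a sigma directly before either one is converted, while a sigma between two letters is left alone.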
<br><br>If you used those rules, the sigma would come out correct in the vast majority of normal Greek text. So the following would display correctly:<br><br><a href="http://%CF%87%CE%B1%CF%81%CE%B1%CE%BA%CF%84%CE%AE%CF%81%CE%B5%CF%82-%CE%B1%CE%BD%CF%84%CE%B9%CF%83%CF%84%CE%BF%CE%B9%CF%87%CF%8E%CE%BD%CF%84%CE%B1%CF%82.com" target="_blank">
χαρακτήρες-αντιστοιχώντας.com
</a> // display 1<br><br>As I already noted, IDNs are often not "normal" text, because words are run together. The rule fails in that case. For example, the following wouldn't work:<br><br><a href="http://%CF%87%CE%B1%CF%81%CE%B1%CE%BA%CF%84%CE%AE%CF%81%CE%B5%CF%83%CE%B1%CE%BD%CF%84%CE%B9%CF%83%CF%84%CE%BF%CE%B9%CF%87%CF%8E%CE%BD%CF%84%CE%B1%CF%82.com" target="_blank">
χαρακτήρεσαντιστοιχώντας.com</a> // display 2<br><a href="http://%CF%87%CE%B1%CF%81%CE%B1%CE%BA%CF%84%CE%AE%CF%81%CE%B5%CF%82%CE%B1%CE%BD%CF%84%CE%B9%CF%83%CF%84%CE%BF%CE%B9%CF%87%CF%8E%CE%BD%CF%84%CE%B1%CF%82.com" target="_blank">
χαρακτήρεςαντιστοιχώντας.com</a> // desired 3<br><br>For IDNA2003, because people do have the choice of an optional hyphen for disambiguation when registering names, this is probably reasonable as a display step. That is, it will help in many cases and shouldn't hurt in any. It can be implemented right now in browsers or other user agents without any problem, since input of the resulting display forms will continue to work (because StringPrep maps final sigma back to sigma).
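<br><br>The round trip can be checked with Python's built-in idna codec, which implements IDNA2003 including Nameprep. This is only a sketch; the helper name same_wire_form is made up for illustration.

```python
def same_wire_form(a, b):
    """Return True if two display forms encode to the same ACE (wire) form
    under Python's built-in IDNA2003 codec."""
    return a.encode("idna") == b.encode("idna")


if __name__ == "__main__":
    # Display form with final sigma versus the form with a plain sigma.
    print(same_wire_form("χαρακτήρες.com", "χαρακτήρεσ.com"))
```

If the two forms compare equal, a user who types or pastes the displayed name with a final sigma reaches the same registered name.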
<br><br>For IDNAbis, changing the wire form would introduce compatibility problems, already mentioned.<br><br>Mark<br>
<br><div class="gmail_quote">On Jan 24, 2008 5:33 AM, Gervase Markham &lt;<a href="mailto:gerv@mozilla.org" target="_blank">gerv@mozilla.org</a>&gt; wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div>Michael Everson wrote:<br>> I was cheered by what Gervase said about displaying final sigma. I hope<br>> that the Mozilla and Safari and IE get together and agree on a method to<br>> do the right thing for the Greeks.
<br><br></div>I can't bind any of those organisations, even my own, but if it turns<br>out that the final-sigma problem is intractable at the protocol level<br>(and perhaps esszet as well, I don't know), and it's clear that we're
<br>not opening an enormous can of worms which will lead to proliferating<br>special-case code, then if it helps us get this done quicker, the<br>IDN200x standard can punt on the issues and we can attempt to reach a<br>consensus higher up.
<br><br>It seems to me that final sigma's a fairly easy case, as the rules are<br>simple. On lookup, s/&lt;final sigma&gt;/&lt;normal sigma&gt;/. On display,<br>s/&lt;normal sigma at end of label&gt;/&lt;final sigma&gt;/. Other troublesome edge
<br>cases may not be so easy.<br><br>Gerv<br></blockquote></div><br><br clear="all"><br>-- <br>Mark