Tonus (was: Re: Casefolding Sigma (was: Re:
IDNAbis Preprocessing Draft))
Vaggelis Segredakis
segred at ics.forth.gr
Fri Feb 1 18:34:54 CET 2008
Patrik, Vint,
Thank you both for your attempt to clarify this issue.
Let me present you with some questions to help me clarify it further:
Zone file first (example: the requested name βαγγέλης.gr is registered as βαγγέλησ.gr, which is equivalent to xn--ixahcfaz1a9d.gr):
* IDNA2003: we have xn--ixahcfaz1a9d.gr IN NS....
* IDNA200X: do we still put xn--ixahcfaz1a9d.gr IN NS...?
Browser line:
IDNA2003:
* We start by typing xn--ixahcfaz1a9d.gr - it gets translated to βαγγέλησ.gr
* We start by typing βαγγέλης.gr (we are allowed to do that) - it gets translated to xn--ixahcfaz1a9d.gr
IDNA200X:
* We start by typing xn--ixahcfaz1a9d.gr - it gets translated to βαγγέλησ.gr
* We start by typing βαγγέλης.gr. Are we allowed to do that? Does it correspond to any Punycode translation? Which domain name will be sent to the resolver?
My main concern is the user experience. Can the user type βαγγέλης.gr in the *browser* line or *email client* and still get xn--ixahcfaz1a9d.gr? If so, it should be OK. Will the browser/email client still allow upper-case characters in IDNs as well?
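
To make the IDNA2003 half of the question concrete, here is a minimal sketch using Python's built-in "idna" codec, which implements the IDNA2003 ToASCII operation (RFC 3490). Treating the codec as a stand-in for what a browser or email client does is an assumption, and it says nothing about IDNA200X. If the example above is right, every line should show the same xn-- name.

```python
# Sketch only: Python's "idna" codec follows IDNA2003 (RFC 3490 / nameprep).
# Using it as a model of browser behaviour is an assumption.
typed_forms = [
    "βαγγέλης.gr",   # final sigma, the correct spelling
    "βαγγέλησ.gr",   # medial sigma, the form that is registered
    "ΒΑΓΓΈΛΗΣ.gr",   # upper case
]

# ToASCII for each form the user might type.
ace_forms = {name: name.encode("idna") for name in typed_forms}

for name, ace in ace_forms.items():
    print(f"{name} -> {ace.decode('ascii')}")

# True if every typed form ends up as the same name for the resolver.
all_same = len(set(ace_forms.values())) == 1
print("all inputs give the same ACE name:", all_same)
```

The point of the sketch is only that, under IDNA2003, the mapping happens before the lookup, so all three spellings reach the resolver as one name.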
Kind Regards,
Vaggelis
-----Original Message-----
From: Patrik Fältström [mailto:patrik at frobbit.se]
Sent: Thursday, January 31, 2008 1:27 PM
To: Vaggelis Segredakis
Cc: 'Harald Alvestrand'; 'John C Klensin'; idna-update at alvestrand.no
Subject: Re: Tonus (was: Re: Casefolding Sigma (was: Re: IDNAbis Preprocessing Draft))
On 31 jan 2008, at 10.55, Vaggelis Segredakis wrote:
> Should I start spelling my name as Βαγγέλησ instead of Βαγγέλης, which
> is the correct spelling, because some people had problems designing a
> multilingual protocol for computers?
The Unicode Consortium has decided that, at the time of matching, there
is an equivalence between the codepoints U+03C2 (GREEK SMALL LETTER
FINAL SIGMA) and U+03C3 (GREEK SMALL LETTER SIGMA). This implies only
one of these codepoints can be stored in the DNS. Further, the
casefolding algorithm provided together with normalization states that
U+03C3 is the stable codepoint of the two, and because of that U+03C2
cannot be stored in the core of a database used for matching.
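
The folding described above can be checked with a short sketch. It assumes Python's str.casefold(), which applies Unicode full case folding; the variable names are only illustrative.

```python
import unicodedata

FINAL_SIGMA = "\u03c2"  # ς
SIGMA = "\u03c3"        # σ

# Show what each codepoint folds to.
for ch in (FINAL_SIGMA, SIGMA):
    folded = ch.casefold()
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} -> U+{ord(folded):04X}")

# U+03C2 folds to U+03C3, while U+03C3 folds to itself (the stable one).
folds_to_sigma = FINAL_SIGMA.casefold() == SIGMA
sigma_is_stable = SIGMA.casefold() == SIGMA

# Both spellings of the name therefore compare equal after folding.
names_match = "βαγγέλης".casefold() == "βαγγέλησ".casefold()
print(folds_to_sigma, sigma_is_stable, names_match)
```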
The way IDNA2003 is designed, both U+03C2 and U+03C3 are mapped to
U+03C3, which implies either of the two codepoints will match the
U+03C3 that is stored in the DNS. Because of this, you can, according
to IDNA2003, include U+03C2 in a domain name that you ask people to use
(although U+03C3 is what is stored in the DNS).
IDNA200x, for exactly the reason this discussion exists (confusion),
only talks about what can be stored in the DNS, and that is U+03C3 in
both IDNA2003 and IDNA200x, all according to the design of the Unicode
Character Set.
So your issues have nothing to do with IDN or the implementation of
IDN, but with the design of the Unicode Character Set, and I therefore
ask you to direct them to the Unicode Consortium.
Patrik
-----Original Message 2-----
From: Vint Cerf [mailto:vint at google.com]
Sent: Thursday, January 31, 2008 2:38 PM
To: Vaggelis Segredakis
Cc: 'Harald Alvestrand'; 'John C Klensin'; patrik at frobbit.se; idna-update at alvestrand.no
Subject: Re: Tonus (was: Re: Casefolding Sigma (was: Re: IDNAbis Preprocessing Draft))
Inputting them in the browser is fine. They get casefolded and
normalized for DNS lookup. Your problem is that the IDN design, for
valid reasons related to Unicode normalization, has not been able to
preserve the original strings, only the normalized ones.
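
A small sketch of that loss, again assuming Python's built-in "idna" codec (IDNA2003) as a stand-in for the lookup path: the name is encoded for the DNS and then decoded for display, and the final sigma does not come back.

```python
# Sketch only: round trip through IDNA2003 ToASCII / ToUnicode.
requested = "βαγγέλης.gr"          # what the user typed (final sigma)

ace = requested.encode("idna")     # form used for the DNS lookup
displayed = ace.decode("idna")     # what can be recovered from that form

# The recovered string is the normalized one, not the one typed.
original_preserved = displayed == requested
print(requested, "->", ace.decode("ascii"), "->", displayed)
print("original spelling preserved:", original_preserved)
```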
vint