<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=utf-8">
<meta name=Generator content="Microsoft Word 11 (filtered medium)">
<!--[if !mso]>
<style>
v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style>
<![endif]-->
<style>
<!--
/* Font Definitions */
@font-face
        {font-family:Tahoma;
        panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0cm;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Times New Roman";}
a:link, span.MsoHyperlink
        {color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {color:blue;
        text-decoration:underline;}
span.EmailStyle17
        {mso-style-type:personal-reply;
        font-family:Arial;
        color:navy;}
@page Section1
        {size:612.0pt 792.0pt;
        margin:72.0pt 90.0pt 72.0pt 90.0pt;}
div.Section1
        {page:Section1;}
/* List Definitions */
@list l0
        {mso-list-id:597101261;
        mso-list-template-ids:-33502570;}
ol
        {margin-bottom:0cm;}
ul
        {margin-bottom:0cm;}
-->
</style>
</head>
<body lang=EN-US link=blue vlink=blue>
<div class=Section1>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>I think adding a TRANSITIONAL status may
go some way towards alleviating this problem, although, as Georg pointed out,
this would mean that the WG would need to reconvene. However, if you add a
TRANSITION DATE field, everyone knows where they stand (and when), and there is
no need to reconvene. You could also add a “Transitional relationship” field, which
would include ss for </span></font><font size=2 color=navy face="Courier New"><span
style='font-size:10.0pt;font-family:"Courier New";color:navy'>ß </span></font><font
size=2 color=navy face=Arial><span style='font-size:10.0pt;font-family:Arial;
color:navy'>and add text to the document stating that registries should bundle transitional
characters until the TRANSITION DATE when </span></font><font size=2
color=navy face="Courier New"><span style='font-size:10.0pt;font-family:"Courier New";
color:navy'>ß et al would become PVALID</span></font><font size=2 color=navy
face=Arial><span style='font-size:10.0pt;font-family:Arial;color:navy'>. <o:p></o:p></span></font></p>
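<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>As a rough illustration only, the TRANSITION DATE idea might be sketched as below. The names TableEntry, effective_status and bundle are made up for this example, and the 2016 date is a placeholder rather than a value anyone has agreed on.</span></font></p>
<pre>
```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class TableEntry:
    """One row of a hypothetical character table (not the real [tables] format)."""
    char: str
    status: str                                   # e.g. "PVALID" or "TRANSITIONAL"
    transition_date: Optional[date] = None        # assumed TRANSITION DATE field
    transitional_relationship: Optional[str] = None  # assumed relationship field


# Illustrative entry for sharp s; the date is an assumption for the example.
ESZETT = TableEntry("\u00df", "TRANSITIONAL", date(2016, 1, 1), "ss")


def effective_status(entry, today):
    """A TRANSITIONAL entry is treated as PVALID from its transition date on."""
    if entry.status == "TRANSITIONAL" and entry.transition_date is not None:
        if today >= entry.transition_date:
            return "PVALID"
    return entry.status


def bundle(label, entries, today):
    """Return the set of labels a registry would bundle with `label` today."""
    variant = label
    for entry in entries:
        still_transitional = effective_status(entry, today) == "TRANSITIONAL"
        if still_transitional and entry.transitional_relationship:
            variant = variant.replace(entry.char, entry.transitional_relationship)
    return {label, variant}
```
</pre>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>With this sketch, a label containing the sharp s is bundled with its ss form before the date, and stands alone once the date has passed, without the WG having to reconvene.</span></font></p>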
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>Mark wrote:<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>>> </span></font>That will cause
currently valid URLs to fail, but that is far better than having them have
ambiguous targets. This way we get to the long-term goal of having these characters
be PVALID, without having the disruption during the interim.<o:p></o:p></p>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=3 color=navy face="Times New Roman"><span
style='font-size:12.0pt;color:navy'>I don’t like the idea of currently valid
URLs failing. This would be addressed (I think) by bundling until 2016?<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=3 color=navy face="Times New Roman"><span
style='font-size:12.0pt;color:navy'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=3 color=navy face="Times New Roman"><span
style='font-size:12.0pt;color:navy'>Best regards<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=3 color=navy face="Times New Roman"><span
style='font-size:12.0pt;color:navy'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=3 color=navy face="Times New Roman"><span
style='font-size:12.0pt;color:navy'>Debbie</span></font><font size=2
color=navy face=Arial><span style='font-size:10.0pt;font-family:Arial;
color:navy'><o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'><o:p> </o:p></span></font></p>
<div>
<div class=MsoNormal align=center style='text-align:center'><font size=3
face="Times New Roman"><span style='font-size:12.0pt'>
<hr size=2 width="100%" align=center tabindex=-1>
</span></font></div>
<p class=MsoNormal><b><font size=2 face=Tahoma><span style='font-size:10.0pt;
font-family:Tahoma;font-weight:bold'>From:</span></font></b><font size=2
face=Tahoma><span style='font-size:10.0pt;font-family:Tahoma'> idna-update-bounces@alvestrand.no
[mailto:idna-update-bounces@alvestrand.no] <b><span style='font-weight:bold'>On
Behalf Of </span></b>Mark Davis<br>
<b><span style='font-weight:bold'>Sent:</span></b> 01 December 2009 17:49<br>
<b><span style='font-weight:bold'>To:</span></b> Alexander Mayrhofer<br>
<b><span style='font-weight:bold'>Cc:</span></b> Shawn Steele; Patrik
Fältström; Harald Alvestrand; idna-update@alvestrand.no; lisa Dusseault;
"Martin J. Dürst"; Vint Cerf<br>
<b><span style='font-weight:bold'>Subject:</span></b> Re: The real issue:
interopability, and a proposal (Was: Consensus Call on Latin Sharp S and Greek
Final Sigma)</span></font><o:p></o:p></p>
</div>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'>I don't think that anyone at this point would really stand in the way
of these characters being PVALID, if it weren't for compatibility problems. To
that end, I think the key issue is the transition strategy: how to deal with
the 5 or so years where the browser implementations are transitioning to
IDNA2008. If we had an adequate strategy, I don't think anyone would really
stand in the way of having the 4 problem characters be valid.<br>
<br>
These 4 characters are unlike symbols in two ways: (a) with symbols you don't
go to two different places with two different browsers, and (b) symbols are far
less frequent than these characters. So even though the prohibition on symbols
was based on no particular evidence, the prohibition doesn't cause a severe
compatibility issue.<br>
<br>
When reading some of the transition proposals, one approach occurred to me.
What if we have a new status for the 4 characters: TRANSITIONAL?<br>
<br>
We set it up this way: in IDNA2008, TRANSITIONAL characters are invalid for
registration and lookup, AND cannot be mapped. After a period of some years,
once the percentage of IDNA2003 browsers and emailers has dropped to a small
proportion, the stated plan is to issue a new version of IDNA that changes them
to PVALID.<br>
<br>
That will cause currently valid URLs to fail, but that is far better than
having them have ambiguous targets. This way we get to the long-term goal of
having these characters be PVALID, without having the disruption during the
interim.<br>
<br>
===<br>
<br>
As far as Harald's back-of-the-envelope calculations go, they present a very
inaccurate picture of the scale. Here are some more exact figures for that
data.<o:p></o:p></span></font></p>
<ol start=1 type=1>
<li class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
mso-list:l0 level1 lfo1'><font size=3 face="Times New Roman"><span
style='font-size:12.0pt'>819,600,672 = sample size of
documents<o:p></o:p></span></font></li>
<li class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
mso-list:l0 level1 lfo1'><font size=3 face="Times New Roman"><span
style='font-size:12.0pt'>5,000 = links with eszett in the
sample<o:p></o:p></span></font></li>
<li class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
mso-list:l0 level1 lfo1'><font size=3 face="Times New Roman"><span
style='font-size:12.0pt'>1,000,000,000,000 = total
documents in index (2008)<o:p></o:p></span></font></li>
<li class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
mso-list:l0 level1 lfo1'><font size=3 face="Times New Roman"><span
style='font-size:12.0pt'>1,220 = scaling factor (= total
docs / sample size)<o:p></o:p></span></font></li>
<li class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
mso-list:l0 level1 lfo1'><font size=3 face="Times New Roman"><span
style='font-size:12.0pt'>6,100,532 = estimated total
links with eszett (= scaling * sample eszett links)<o:p></o:p></span></font></li>
</ol>
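<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'>The scaling in the list above can be reproduced with a few lines of Python. This is only a sketch of the simple proportional estimate; the function name estimate_total_links is invented for the example, and the constants are the figures quoted in the list.</span></font></p>
<pre>
```python
# Figures quoted in the list above (data collected Nov 2008).
SAMPLE_DOCS = 819_600_672            # sample size of documents
SAMPLE_LINKS = 5_000                 # links with sharp s in the sample
TOTAL_DOCS = 1_000_000_000_000       # total documents in index (2008)


def estimate_total_links(sample_links, sample_docs, total_docs):
    """Scale a count seen in a sample up to the full index by proportion."""
    scaling = total_docs / sample_docs
    return scaling, scaling * sample_links


scaling, estimate = estimate_total_links(SAMPLE_LINKS, SAMPLE_DOCS, TOTAL_DOCS)
print(round(scaling))    # scaling factor, rounded to a whole number
print(round(estimate))   # estimated total links, rounded to a whole number
```
</pre>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'>The rounded results are 1,220 and 6,100,532, matching the list; the estimate uses the unrounded scaling factor, which is why it is not exactly 1,220 * 5,000.</span></font></p>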
<p class=MsoNormal style='margin-bottom:12.0pt'><font size=3
face="Times New Roman"><span style='font-size:12.0pt'>Even this has to be taken
with a grain of salt, since (a) it assumes that the sample is
representative (although we have reasonable confidence in that), (b) it
doesn't weight the "importance" of the links (in terms of the number
of times they are followed), and (c) this data was collected back in Nov 2008,
so we've had another year of growth since then.<br>
<br clear=all>
Mark<br>
<br>
<o:p></o:p></span></font></p>
<div>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'>On Tue, Dec 1, 2009 at 01:59, Alexander Mayrhofer <<a
href="mailto:alexander.mayrhofer@nic.at" target="_blank">alexander.mayrhofer@nic.at</a>>
wrote:<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'><br>
(I've spent quite some time on re-thinking the issue last night. It's a bit
longish, and the promised proposal is at the end).<br>
<br>
I think i didn't make it clear enough in my previous messages that i'm not an
opponent of the character Latin Sharp S itself. I'm opposed to changes
that have a high risk of introducing interoperability problems, particularly in the long
run.<br>
<br>
My *only* major concern is that the introduction of the Latin Sharp S is
exactly such a case, but a particularly nasty one. I understand that the
majority of WG participants think that "ß" should be PVALID (i'm
carefully avoiding the word "consensus" here, because it's obviously
up to the WG chair to declare that).<br>
<br>
If i look at the issue in an isolated way, not considering any compatibility/interoperability
issues, then it makes perfect sense to declare "ß" PVALID, because
(this is sort of convincing myself here ;) :<br>
<br>
- There seems to be little existing deployment of ß-labels out there, at least
on the web - the client side is a different issue, there's nearly 100%
deployment. We can also, err, guesstimate that "ß" has only about 1% of
the deployment of other German "umlauts", according to Erik's numbers
(as Erik pointed out, those numbers have no indication of confidence, though).
We don't know how many people type "ß" into their browser address
bar, though, which is at least "unsatisfying" from an engineering
perspective.<br>
<br>
- The character is undoubtedly part of German grammar, at least in two of the
three countries where German is an official language - i don't know about the
minorities in other countries. The upper case variant as well as the Unicode
casing and folding are... well, extravagant - but the lowercase "ß" is
definitely part of the grammar.<br>
<br>
- Georg's argument that this would be "the last chance" to introduce
"ß" got me thinking. If the "Exceptions" were
implemented as an IANA registry, it would be much easier to add (and probably
remove) characters. But given that changes to the Exceptions now require an
update to the base specification, we should probably take this opportunity,
rather than waiting for IDNA2015.<br>
<br>
So, as i said multiple times, the problem is changing the semantics of a part
of the namespace, definitely from the user's perspective - one could argue
whether or not that means the "protocol semantics" change, since the
mapping step is part of the protocol of IDNA2003.<br>
<br>
Regarding interoperability, i'm not so much concerned about the transition period
between IDNA2003 and IDNAbis. This will be painful, but it will (hopefully) be
temporary.<br>
<br>
What i am more concerned about is that the legacy of the "ß-ss" mapping
would introduce incompatibility for an indefinite period of time, *after* all
clients have switched over to IDNAbis. This could happen because some vendors
would implement mappings to be fully IDNA2003 backwards compatible, and others
would implement the informative idnabis-mappings only.<br>
<br>
From a registry point of view, i would very much like to avoid any bundling.
However, the "permanent" interoperability issues outlined above are
bound to "taint" labels with an "ß" for an indefinite
period of time, with the most sensible option being to disallow registration
completely to avoid those problems.<br>
<br>
I think it's not very likely that all vendors agree on a single mapping - particularly
with the WG scope of not dealing with a mapping as part of the protocol.
However, i'd like to propose the following:<br>
<br>
- add text to Section 5 of idnabis-protocol that says<br>
<br>
"characters that are PVALID MUST NOT be subject
to mappings".<br>
<br>
Or (more focused)<br>
<br>
"characters that are listed as Exceptions (F)
in Section 2.6<br>
of [tables] MUST NOT be subject to mappings"<br>
<br>
I'm not sure whether that contradicts the "local matters" part in
Section 5.1 (and i'm pretty sure it creates problems elsewhere), but i think it
solves the "permanent interoperability" problem outlined above. That
means that "ß" stops working during the transition period, but also
means that it can be treated as an independent character *after* the transition
- bundling is not required, Mr Weiss and Mr Weiß can both have their distinct
domain names, etc.<br>
<br>
Is that a way forward? Comments appreciated.<br>
<br>
Alex<o:p></o:p></span></font></p>
<div>
<div>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'>_______________________________________________<br>
Idna-update mailing list<br>
<a href="mailto:Idna-update@alvestrand.no" target="_blank">Idna-update@alvestrand.no</a><br>
<a href="http://www.alvestrand.no/mailman/listinfo/idna-update" target="_blank">http://www.alvestrand.no/mailman/listinfo/idna-update</a><o:p></o:p></span></font></p>
</div>
</div>
</div>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'><o:p> </o:p></span></font></p>
</div>
</body>
</html>