vint at google.com
Mon Feb 21 22:47:03 CET 2011
There is a user side to this, and it isn't simple either. In one view, user
expectations about the fitness of a character for use in domain names are
sometimes tied to its linguistic function (e.g., is it a "letter" in the
Unicode sense as opposed to, say, a "symbol"?).
Unicode workers have misclassified characters in the past, leading to valid
registrations which might later be considered invalid because the character
has been reclassified in a later version of Unicode. Or we have the case of
the sharp-S, where the addition of a character makes the earlier mapping
practice a surprise because of the (new) introduction of, in this case, an
upper-case counterpart to sharp-S.
I am not trying to revisit old debates so much as to highlight the
continuing difficulty of dealing with reclassifications. Additions of new
characters are the easier case, but changes are really troubling. If
characters are allowed and registrations are permitted, and then changes in
Unicode give them properties that invalidate them for use in domain names,
we have to decide whether to promulgate the previous condition (allowing
what is now, under the Unicode properties, an invalid character) or to
alter the table derived from the IDNA2008 rules and thus invalidate an
earlier registration. If user intuition favors the invalidation (i.e., it
would not be expected that this would be a valid character for domain names),
then promulgating the previous classification is counter-intuitive. On the
other hand, previously allowed registrations that are invalidated lead to
other kinds of confusion. In the end, the IETF, or at least the working
group, has treated these cases idiosyncratically because of the ambiguity
in deciding what will be least confusing in some sense.
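The tension above can be made concrete with a small sketch. The following is a grossly simplified rendering of the RFC 5892 derived-property idea, not the full algorithm: it checks only whether a codepoint's General_Category falls in the "LetterDigits" set, and shows how a "Section G"-style backward-compatibility list could pin a codepoint's status across Unicode versions. U+19DA (NEW TAI LUE THAM DIGIT ONE) is the example under discussion: it was Nd in Unicode 5.2 and was reclassified as No in Unicode 6.0.

```python
import unicodedata

# General_Category values in the RFC 5892 "LetterDigits" set (Section 2.1).
LETTER_DIGITS = {"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"}

def derived_validity(cp, backward_compatible=frozenset()):
    """Grossly simplified RFC 5892 check: a codepoint whose category is in
    LetterDigits is treated as PVALID; a Section G (BackwardCompatible)
    entry pins a codepoint's status regardless of current properties.
    All other rules (Exceptions, Unstable, contextual rules, ...) are
    deliberately omitted from this sketch."""
    if cp in backward_compatible:
        return "PVALID"  # status frozen across Unicode versions
    cat = unicodedata.category(chr(cp))
    return "PVALID" if cat in LETTER_DIGITS else "DISALLOWED"

# Under Unicode 6.0 properties, U+19DA is No, so without a Section G
# entry the derived table flips it to DISALLOWED; with an entry, the
# Unicode 5.2 result (PVALID) is preserved.
print(derived_validity(0x19DA))                                # DISALLOWED
print(derived_validity(0x19DA, backward_compatible={0x19DA}))  # PVALID
```

This is exactly the choice described above: pinning the codepoint (the second call) keeps the derived table stable but freezes a property the Unicode Consortium now considers wrong, while recomputing (the first call) invalidates any registration made under the earlier table.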
We could adopt the view that, once decided, even if Unicode versions change
properties, we will keep all earlier classifications persistent - that is
how I interpret Mark's position. Others, including me, have some discomfort
with preserving these previous (mis)classifications without exception.
On Mon, Feb 21, 2011 at 4:34 PM, Simon Josefsson <simon at josefsson.org> wrote:
> Patrik Fältström <patrik at frobbit.se> writes:
> > On 21 feb 2011, at 21.58, Simon Josefsson wrote:
> >> I didn't understand John's argument that we have an incompatibility
> >> regardless of what we do
> > Either we have stability in the algorithm (as now proposed) or we have
> > stability in the table which is the result of the calculation of the
> > algorithm.
> > IDNA2008 is defined as being an _algorithm_ that should be stable, so
> > an application can apply it regardless of what version of Unicode we
> > talk about.
> > Because the algorithm is based on property values that now change for
> > three codepoints, the result of the calculation is not stable.
> > The alternative would be to change the algorithm, and that would make
> > the codepoints not change, but the algorithm changes.
> > IDNA2003 was a table based solution.
> > IDNA2008 is an algorithm based solution.
> I don't follow this -- Mark is not proposing to change the algorithm.
> If I understand him, he proposes to add U+19DA to section G in order to
> make the IDNA2008 algorithm produce stable results independent of
> Unicode version.
> Again, what practical incompatibility is there in following Mark's
> proposal? An illustration would go a long way to convince me here, as I
> believe Mark has illustrated that by _not_ adding another exception to
> the list of exceptions, we will create two incompatible IDNA2008
> algorithms: IDNA2008 with Unicode 5.2/6.0 vs IDNA2008-with-RFC5892bis
> with Unicode 6.0.