IDNNever.txt

Mark Davis mark.davis at icu-project.org
Mon Feb 12 00:17:56 CET 2007


For us to come to a conclusion about the contents of IDNNever, we first have
to agree on the usage model; otherwise we'll just be talking about different
things. Here are the differences, as far as I can see them.

All of the models provide for a partition of Unicode code points into 4
classes: IDNPermitted, IDNNever, IDNPending, and Unassigned. The key
difference is in the migration path, so we have to consider the migration
implications. Let's take the following scenario.

   1. the registrar is on the version of IDNAbis using Unicode 5.1
   2. the user gets a document containing ...href="http://aXb.com"...,
   where X is a 5.1 IDNPermitted character, and clicks on the link.
   3. the client software is on the version of IDNAbis using Unicode 5.0

In all of these models, the only property the registrar needs to have is
IDNPermitted. The other classes would only have an impact on the client
side; it depends on the model whether or not distinctions between them are
operationally necessary.

*A. Strict Model*
                        IDNPermitted  IDNNever  IDNPending  Unassigned
   Allow in client           Y           N          N           N
   Allow in registrar        Y           N          N           N

This is the simplest model, but it has a significant migration problem. In
the scenario above, the user will not be able to get to the desired web page,
since the URL will be refused by the client software. The user will be able
to get to the web page only:

   1. If the client is upgraded to 5.1

*B. Minimal Model*
                        IDNPermitted  IDNNever  IDNPending  Unassigned
   Allow in client           Y           N          Y           N
   Allow in registrar        Y           N          N           N

This model alleviates the migration problem somewhat. The user will be able
to get to the web page:

   1. If the client is upgraded to 5.1, or
   2. If X was assigned in 5.0

*C. Maximal Model*
                        IDNPermitted  IDNNever  IDNPending  Unassigned
   Allow in client           Y           N          Y           Y
   Allow in registrar        Y           N          N           N

This model alleviates the migration problem more. The user will be able to
get to the web page:

   1. If the client is upgraded to 5.1, or
   2. If X was assigned in 5.0, or
   3. If X was unassigned in 5.0, but aXb has the correct IDNAbis output
   form (e.g. is normalized, lowercase, ...)

Thus the user can get to the web page if the page author uses the output
form; if the scenario were the user typing/pasting into the address bar,
then the requirement would be the same: that the typing is normalized and
lowercase. Given the types of characters slated for encoding, it would be
extremely rare that typing would produce an X that doesn't work, but it is
not absolutely guaranteed.
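
To make the comparison concrete, here is a rough Python sketch of the client
side decision under the three models. It is only an illustration of the
tables above, not part of any IDNAbis draft: the names CharClass,
CLIENT_ALLOWS and client_can_resolve are made up for this example, and it
assumes that a client can map assigned characters to output form itself but
has to pass unassigned ones through unchanged.

```python
from enum import Enum


class CharClass(Enum):
    """The four classes, as seen by the client's (older) Unicode version."""
    PERMITTED = "IDNPermitted"
    NEVER = "IDNNever"
    PENDING = "IDNPending"
    UNASSIGNED = "Unassigned"


# Classes the client lets through under each model (hypothetical model keys).
# The registrar allows only IDNPermitted in all three models.
CLIENT_ALLOWS = {
    "strict": {CharClass.PERMITTED},
    "minimal": {CharClass.PERMITTED, CharClass.PENDING},
    "maximal": {CharClass.PERMITTED, CharClass.PENDING, CharClass.UNASSIGNED},
}
REGISTRAR_ALLOWS = {CharClass.PERMITTED}


def client_can_resolve(model, class_in_client_version, in_output_form=True):
    """Return True if a client that has not been upgraded lets the label through.

    class_in_client_version is the class of X in the client's Unicode
    version (5.0 in the scenario). in_output_form says whether the label
    is already normalized, lowercase, etc.
    """
    if class_in_client_version not in CLIENT_ALLOWS[model]:
        return False
    if class_in_client_version is CharClass.UNASSIGNED:
        # Assumption: the client knows no properties for an unassigned
        # code point, so the label only works if it is already in
        # output form.
        return in_output_form
    return True
```

For example, client_can_resolve("minimal", CharClass.UNASSIGNED) is False,
while client_can_resolve("maximal", CharClass.UNASSIGNED) is True as long as
the label is already in output form.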

Some people think that it is not tolerable to have aXb work if it is in
output form on 5.0, but not work if it is in input form; and then work on
5.1 in either form. I don't quite understand the rationale for that, so I'll
look forward to an exposition.

Mark
On 2/11/07, Patrik Fältström <patrik at frobbit.se> wrote:
>
> I think one can say that from an IETF perspective, the question is
> whether codepoints in the Unicode table that are of one or more of the
> following General Categories should be allowed in the U-label as
> defined by the document edited by John:
>
> gc ; Sc        ; Currency_Symbol
> gc ; Sk        ; Modifier_Symbol
> gc ; Sm        ; Math_Symbol
> gc ; So        ; Other_Symbol
>
> What John says is that we who look at this problem so far have not
> seen enough evidence that these classes really are needed so that
> they should be included. We are using an inclusion based algorithm
> this time.
>
> Was it a special General Category you were thinking of, Avri?
>
>     Patrik