Mixing scripts (Re: Unicode versions (Re: Criteria for exceptional characters))

Mark Davis mark.davis at icu-project.org
Fri Dec 22 19:17:25 CET 2006


Ok, I understand now what you mean, thanks.

Mark

On 12/22/06, John C Klensin <klensin at jck.com> wrote:
>
>
>
> --On Friday, 22 December, 2006 09:16 -0800 Mark Davis
> <mark.davis at icu-project.org> wrote:
>
> > What we say in PRI#96 is:
> >> In each of the following contexts, the match to the regular
> >> expressions must also only consist of characters from a single
> >> script (after ignoring Common and Inherited Script characters).
> >
> > While it does place limitations on fields containing joiner
> > characters on the basis of script, it doesn't require the mixture
> > of scripts, in the sense used in
> > http://www.unicode.org/reports/tr39/#Mixed_Script_Detection.
>
> I certainly understood that and did not intend to imply
> otherwise.  What I was trying to say is that one of the
> arguments against protocol rules prohibiting mixing of scripts
> is that, with the exception of some bidi issues (which we got at
> least partially wrong), IDNA2003 operates on characters, not
> complete labels.  A mixed-script test requires making the step
> into evaluating complete labels for correctness (under Michael's
> proposal, complete FQDNs).  That is a non-trivial step.  I
> believe that any sensible model for handling ZWJ and ZWNJ
> (including that of PRI#96, which I assume to be the default
> unless better ways are found) will require looking at full
> labels or at least sequences of characters, i.e., making that
> step.
>
>     john
>
>
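
For readers of the archive, the single-script condition quoted from PRI#96
can be sketched roughly as below. This is only an illustration under
stated assumptions, not the PRI#96 or IDNA algorithm: TOY_SCRIPT_RANGES is
a hypothetical, deliberately tiny stand-in for the real Unicode Script
property data (Scripts.txt), and the function names are invented here. It
does show the point made above: the check takes a whole label as input, so
it cannot be decided one character at a time.

```python
# Hypothetical, incomplete stand-in for the Unicode Script property.
# Real code would load the full Scripts.txt data instead.
TOY_SCRIPT_RANGES = [
    (0x002D, 0x002D, "Common"),      # HYPHEN-MINUS
    (0x0030, 0x0039, "Common"),      # ASCII digits
    (0x0041, 0x005A, "Latin"),       # A-Z
    (0x0061, 0x007A, "Latin"),       # a-z
    (0x0410, 0x044F, "Cyrillic"),    # basic Russian alphabet
    (0x0915, 0x0939, "Devanagari"),  # consonants
    (0x094D, 0x094D, "Devanagari"),  # virama
    (0x200C, 0x200D, "Inherited"),   # ZWNJ, ZWJ
]


def toy_script(ch):
    """Return the toy script name for one character, or "Unknown"."""
    cp = ord(ch)
    for lo, hi, script in TOY_SCRIPT_RANGES:
        if lo <= cp <= hi:
            return script
    return "Unknown"


def scripts_in(label):
    """Scripts used in a label, ignoring Common and Inherited characters."""
    return {toy_script(ch) for ch in label} - {"Common", "Inherited"}


def is_single_script(label):
    """True if the whole label draws on at most one script.

    Characters outside the toy table count as their own "Unknown" script,
    so they make an otherwise single-script label fail the check.
    """
    return len(scripts_in(label)) <= 1
```

For example, is_single_script("abc-123") holds, as does a Devanagari
sequence containing ZWJ such as "\u0915\u094D\u200D\u0937", while
"p\u0430ypal" (with a Cyrillic U+0430) does not.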