noncharacters and unassigned

Mark Davis mark.davis at icu-project.org
Fri Feb 8 19:29:10 CET 2008


Note: the immutability of noncharacters is documented in:

http://www.unicode.org/policies/stability_policy.html#Property_Value
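
As a rough illustration of which code points the thread below is about,
here is a minimal Python sketch. The function names (is_noncharacter,
is_private_use, proposed_category) are hypothetical and not taken from any
draft. The DISALLOWED result only mirrors what is proposed in this thread;
it is not the table-generation rule set of tables-04.

```python
import unicodedata
from typing import Optional

MAX_CODE_POINT = 0x10FFFF


def is_noncharacter(cp: int) -> bool:
    """Return True for the 66 Unicode noncharacters."""
    if not 0 <= cp <= MAX_CODE_POINT:
        raise ValueError(f"not a Unicode code point: {cp:#x}")
    # FDD0..FDEF, plus the last two code points of each of the 17 planes.
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def is_private_use(cp: int) -> bool:
    """Return True when the General_Category is Co (private use)."""
    return unicodedata.category(chr(cp)) == "Co"


def proposed_category(cp: int) -> Optional[str]:
    """Hypothetical helper: the classification proposed in this thread.

    Returns "DISALLOWED" for noncharacters and private use code points,
    and None for everything else (which this sketch does not classify).
    """
    if is_noncharacter(cp) or is_private_use(cp):
        return "DISALLOWED"
    return None


if __name__ == "__main__":
    for cp in (0xFDD0, 0xFFFE, 0x10FFFD, 0x10FFFE, 0x10FFFF):
        print(f"{cp:06X}", unicodedata.category(chr(cp)), proposed_category(cp))
```

Under this sketch 10FFFD is reported as private use, while 10FFFE and
10FFFF are noncharacters whose General_Category is Cn.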

Mark

On Feb 8, 2008 10:19 AM, Mark Davis <mark.davis at icu-project.org> wrote:

> While they have designated functions, they are immutable and will all
> remain Cn forever. So as far as the protocol is concerned, functionally it
> doesn't matter, since these can never become protocol-valid.
>
> However, it wouldn't hurt to have an explicit rule putting them in
> Disallowed.
>
> Mark
>
>
> On Feb 8, 2008 9:32 AM, Erik van der Poel <erikv at google.com> wrote:
>
> > The Unicode "private use" characters are already DISALLOWED in
> > IDNA200X. And that's good.
> >
> > I was referring to the Unicode "noncharacters", some of which are
> > UNASSIGNED in IDNA200X, others missing altogether.
> >
> > Noncharacters should be DISALLOWED in IDNA200X, since they are not
> > reserved. They have designated functions.
> >
> > Erik
> >
> > On Feb 8, 2008 9:29 AM, Vint Cerf <vint at google.com> wrote:
> > > I am confused. I cannot imagine allowing private use characters in the
> > > DNS. Please explain? V
> > >
> > >
> > > ----- Original Message -----
> > > From: idna-update-bounces at alvestrand.no <idna-update-bounces at alvestrand.no>
> > > To: Patrik Fältström <patrik at frobbit.se>
> > > Cc: idna-update at alvestrand.no <idna-update at alvestrand.no>
> > > Sent: Fri Feb 08 09:23:03 2008
> > > Subject: noncharacters and unassigned
> > >
> > > Patrik,
> > >
> > > Regarding the latest tables-04 draft:
> > >
> > >
> > > http://www.ietf.org/internet-drafts/draft-faltstrom-idnabis-tables-04.txt
> > >
> > > The code points 10FFFD..10FFFF are missing. I can see absolutely no
> > > reason to omit 10FFFD. It is simply a private use code point, just
> > > like 100000..10FFFC.
> > >
> > > You may have omitted 10FFFE and 10FFFF because they are not listed in
> > > UnicodeData.txt. They are not listed there because they are
> > > "noncharacters". However, you include other noncharacters, namely,
> > > FDD0..FDEF and *FFFE and *FFFF where * is 0..F. But noncharacters are
> > > *not* reserved. They are a kind of super-private use characters that
> > > are not supposed to be interchanged (unlike normal private use, which
> > > may be interchanged).
> > >
> > > Unicode has a number of definitions for terms that start with
> > > "unassigned":
> > >
> > > http://www.unicode.org/glossary/#U
> > >
> > > Since those definitions are easy to confuse, it might be better to use
> > > the term "reserved", which is widely used in the IETF. However, changing
> > > all of the IDNA200X documents to say "reserved" instead of
> > > "unassigned" may be a lot of work, so we could leave it as
> > > "unassigned", as long as we clarify that we are referring to Unicode's
> > > unassigned/undesignated/reserved code points.
> > >
> > > Also, the noncharacters should be DISALLOWED, not UNASSIGNED.
> > >
> > > Erik
> > > _______________________________________________
> > > Idna-update mailing list
> > > Idna-update at alvestrand.no
> > > http://www.alvestrand.no/mailman/listinfo/idna-update
> > >
> >
>
>
>
> --
> Mark




-- 
Mark