noncharacters and unassigned

Mark Davis mark.davis at icu-project.org
Fri Feb 8 19:19:56 CET 2008


While the noncharacters have designated functions, they are immutable and
will all remain Cn forever. So as far as the protocol is concerned, it
doesn't matter functionally, since these can never become protocol-valid.

However, it wouldn't hurt to have an explicit rule putting them in
Disallowed.
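
A minimal sketch of what such an explicit rule might look like, assuming
the noncharacter set discussed in this thread (FDD0..FDEF plus the last two
code points of each of the 17 planes, 10FFFE and 10FFFF included). The
function names are hypothetical and do not come from the tables draft; a
return value of None only means "not decided by this rule".

```python
from typing import Optional


def is_noncharacter(cp: int) -> bool:
    """Return True if cp is one of the 66 Unicode noncharacters."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("not a Unicode code point: %r" % (cp,))
    # FDD0..FDEF: a block of 32 noncharacters in the BMP.
    if 0xFDD0 <= cp <= 0xFDEF:
        return True
    # xxFFFE and xxFFFF: the last two code points of every plane,
    # i.e. the low 16 bits are FFFE or FFFF.
    return (cp & 0xFFFE) == 0xFFFE


def explicit_noncharacter_rule(cp: int) -> Optional[str]:
    """Hypothetical rule: noncharacters are DISALLOWED.

    Returns None for every other code point, leaving the decision to
    the remaining rules, which are not modelled here.
    """
    return "DISALLOWED" if is_noncharacter(cp) else None
```

With this sketch, 10FFFE and 10FFFF come out as DISALLOWED, while 10FFFD
(an ordinary private use code point) is left to whatever rule handles
private use.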

Mark

On Feb 8, 2008 9:32 AM, Erik van der Poel <erikv at google.com> wrote:

> The Unicode "private use" characters are already DISALLOWED in
> IDNA200X. And that's good.
>
> I was referring to the Unicode "noncharacters", some of which are
> UNASSIGNED in IDNA200X, others missing altogether.
>
> Noncharacters should be DISALLOWED in IDNA200X, since they are not
> reserved. They have designated functions.
>
> Erik
>
> On Feb 8, 2008 9:29 AM, Vint Cerf <vint at google.com> wrote:
> > I am confused. I cannot imagine allowing private use characters in the
> > DNS. Please explain? V
> >
> >
> > ----- Original Message -----
> > From: idna-update-bounces at alvestrand.no <idna-update-bounces at alvestrand.no>
> > To: Patrik Fältström <patrik at frobbit.se>
> > Cc: idna-update at alvestrand.no <idna-update at alvestrand.no>
> > Sent: Fri Feb 08 09:23:03 2008
> > Subject: noncharacters and unassigned
> >
> > Patrik,
> >
> > Regarding the latest tables-04 draft:
> >
> >
> > http://www.ietf.org/internet-drafts/draft-faltstrom-idnabis-tables-04.txt
> >
> > The code points 10FFFD..10FFFF are missing. I can see absolutely no
> > reason to omit 10FFFD. It is simply a private use code point, just
> > like 100000..10FFFC.
> >
> > You may have omitted 10FFFE and 10FFFF because they are not listed in
> > UnicodeData.txt. They are not listed there because they are
> > "noncharacters". However, you include other noncharacters, namely,
> > FDD0..FDEF and *FFFE and *FFFF where * is 0..F. But noncharacters are
> > *not* reserved. They are a kind of super-private use characters that
> > are not supposed to be interchanged (unlike normal private use, which
> > may be interchanged).
> >
> > Unicode has a number of definitions for terms that start with
> > "unassigned":
> >
> > http://www.unicode.org/glossary/#U
> >
> > Since their definitions are so confusing, it might be better to use
> > the term "reserved", which is used a lot in IETF. However, changing
> > all of the IDNA200X documents to say "reserved" instead of
> > "unassigned" may be a lot of work, so we could leave it as
> > "unassigned", as long as we clarify that we are referring to Unicode's
> > unassigned/undesignated/reserved code points.
> >
> > Also, the noncharacters should be DISALLOWED, not UNASSIGNED.
> >
> > Erik
> > _______________________________________________
> > Idna-update mailing list
> > Idna-update at alvestrand.no
> > http://www.alvestrand.no/mailman/listinfo/idna-update
> >



-- 
Mark