<!doctype html public "-//W3C//DTD W3 HTML//EN">
<html><head><style type="text/css"><!--
blockquote, dl, ul, ol, li { padding-top: 0 ; padding-bottom: 0 }
--></style><title>Re: Reserved general
punctuation</title></head><body>
<div>At 2:33 PM -0700 4/30/08, Mark Davis wrote:</div>
<blockquote type="cite" cite>On Wed, Apr 30, 2008 at 1:53 PM, Paul
Hoffman &lt;<a href="mailto:phoffman@imc.org">phoffman@imc.org</a>&gt;
wrote:
<blockquote>2.1.3.
IgnorableProperties (C)<br>
<br>
C: property(cp) is in {Default_Ignorable_Code_Point,
White_Space,<br>
Noncharacter_Code_Point}<br>
<br>
This category is used to group codepoints that are not
recommended<br>
for use in identifiers. In general, these codepoints are
not<br>
suitable for use for IDN.<br>
<br>
The definition for Default_Ignorable_Code_Point can be found
in<br>
DerivedCoreProperties.txt [1] (and erratum of 2007-January-25
[2])<br>
and is<br>
<br>
Other_Default_Ignorable_Code_Point + Cf + Cc + Cs<br>
+ Noncharacter_Code_Point + Variation_Selector<br>
- White_Space - FFF9..FFFB (Annotation Characters)<br>
</blockquote>
</blockquote>
<blockquote type="cite" cite><br>
That text has not been updated to U5.1. As I said earlier:<br>
</blockquote>
<blockquote type="cite" cite>"Note that there was a one-time
cleanup of the Default Ignorable Code Point values in Unicode 5.1.0,
specifically to get it into good shape for IDNA (<a
href="http://www.unicode.org/versions/Unicode5.1.0/"
>http://www.unicode.org/versions/Unicode5.1.0/</a> - see
"Rendering Default Ignorable Code Points" and the section
following).</blockquote>
<div><br></div>
<div>I read those sections and don't see how they apply. The first
section is about displaying these characters, which is not what we are
doing: we are allowing or disallowing them. Regardless, the section
doesn't say anything about the cleanup.</div>
<div><br></div>
<blockquote type="cite" cite>This changed the composition, so if
noncharacters are to be DISALLOWED, then they need to be specifically
mentioned.</blockquote>
<div><br></div>
<div>As you can see above, noncharacters are already specifically
mentioned.</div>
<div><br></div>
<blockquote type="cite" cite>Functionally, it doesn't make a lot of
difference, since the Noncharacter_Code_Point values are immutable,
and will always be unassigned (gc=Cn), so they will never be part of
valid labels. But they can be specifically excluded by making
Noncharacter_Code_Point be specifically DISALLOWED, and for
consistency I'd recommend doing that in the tables document. BTW, here
are the code points: <a
href="http://unicode.org/cldr/utility/list-unicodeset.jsp?a=%5B:Noncharacter_Code_Point=True:%5D"
>http://unicode.org/cldr/utility/list-unicodeset.jsp?a=[:Noncharacter_Code_Point=True:]</a>"</blockquote>
<div><br></div>
<div>And this has *nothing* to do with the codepoints in question,
which are 2064..2069.</div>
<div><br></div>
<blockquote type="cite" cite>Does that help make things any
clearer?</blockquote>
<div><br></div>
<div>No. Let's go back to what started this thread:</div>
<div><br></div>
<div>At 10:20 AM -0700 3/20/08, Mark Davis wrote:</div>
<blockquote type="cite" cite>No, I'm saying the reverse. The way the
05 logic is set up, the table contains the lines I
quoted:</blockquote>
<blockquote type="cite" cite><br></blockquote>
<blockquote type="cite" cite>2064..2069 ; DISALLOWED #
&lt;reserved&gt;..&lt;reserved&gt;</blockquote>
<blockquote type="cite" cite><br></blockquote>
<blockquote type="cite" cite>I think it should not; that is, that
those *should* be:</blockquote>
<blockquote type="cite" cite><br></blockquote>
<blockquote type="cite" cite>2064..2069 ; UNASSIGNED #
&lt;reserved&gt;..&lt;reserved&gt;</blockquote>
<div><br></div>
<div>Those are not noncharacters, as far as I can tell. Further, you
were asking that they be marked as UNASSIGNED; now it seems like you
want them DISALLOWED. Could you clarify which you actually want?</div>
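<div><br></div>
<div>For what it's worth, the noncharacter question can be checked
mechanically. Below is a minimal sketch, assuming only the fixed
definition of noncharacters in the Unicode Standard (U+FDD0..U+FDEF
plus the last two code points of each of the 17 planes). The names
is_noncharacter and DISPUTED are made up for illustration and do not
come from the tables document. The general category printed at the end
reflects whichever Unicode version the Python interpreter bundles, so
it may differ from the 5.0 or 5.1 data discussed here.</div>
<pre>
```python
import unicodedata

# The range under discussion in this thread: 2064..2069.
DISPUTED = range(0x2064, 0x206A)


def is_noncharacter(cp: int) -> bool:
    """Return True if cp is one of the 66 Unicode noncharacters."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    # 32 noncharacters in the contiguous range U+FDD0..U+FDEF.
    if 0xFDD0 <= cp <= 0xFDEF:
        return True
    # The last two code points of every plane: U+nFFFE and U+nFFFF.
    return (cp & 0xFFFE) == 0xFFFE


if __name__ == "__main__":
    for cp in DISPUTED:
        # The category depends on the interpreter's bundled Unicode data.
        gc = unicodedata.category(chr(cp))
        print(f"U+{cp:04X} noncharacter={is_noncharacter(cp)} gc={gc}")
```
</pre>
<div>Under that definition none of 2064..2069 is a noncharacter, so a
rule that names Noncharacter_Code_Point does not, by itself, decide
whether those code points end up DISALLOWED or UNASSIGNED.</div>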
</body>
</html>