<HTML>

<HEAD>

<TITLE>Re: Editorial questions</TITLE>

</HEAD>

<BODY>

<FONT FACE="Calibri, Verdana, Helvetica, Arial"><SPAN STYLE='font-size:11pt'><BR>

On 2009-11-20 22.16, &quot;Harald Alvestrand&quot; &lt;<a href="mailto:harald@alvestrand.no">harald@alvestrand.no</a>&gt; wrote:<BR>

<BR>

<FONT COLOR="#0000FF">&gt; So is there a distinction between 10FFFD, 10FFFE and 10FFFF, and does this <BR>

&gt; document need to make that distinction?<BR>

&gt; (I think it's unreasonable to allow any of them in domain names,<BR>

&gt; so they should all be DISALLOWED, but I'm not surprised that there are<BR>

&gt; inconsistencies here.)<BR>

</FONT><BR>

I wouldn't use the term &quot;inconsistencies&quot;... For administrative reasons<BR>

(data file compatibility) these code points are not listed in UnicodeData.txt. See PropList.txt, which says:<BR>

<BR>

FDD0..FDEF &nbsp;&nbsp;&nbsp;; Noncharacter_Code_Point # Cn &nbsp;[32] &lt;noncharacter-FDD0&gt;..&lt;noncharacter-FDEF&gt;<BR>

FFFE..FFFF &nbsp;&nbsp;&nbsp;; Noncharacter_Code_Point # Cn &nbsp;&nbsp;[2] &lt;noncharacter-FFFE&gt;..&lt;noncharacter-FFFF&gt;<BR>

1FFFE..1FFFF &nbsp;; Noncharacter_Code_Point # Cn &nbsp;&nbsp;[2] &lt;noncharacter-1FFFE&gt;..&lt;noncharacter-1FFFF&gt;<BR>

2FFFE..2FFFF &nbsp;; Noncharacter_Code_Point # Cn &nbsp;&nbsp;[2] &lt;noncharacter-2FFFE&gt;..&lt;noncharacter-2FFFF&gt;<BR>

3FFFE..3FFFF &nbsp;; Noncharacter_Code_Point # Cn &nbsp;&nbsp;[2] &lt;noncharacter-3FFFE&gt;..&lt;noncharacter-3FFFF&gt;<BR>

4FFFE..4FFFF &nbsp;; Noncharacter_Code_Point # Cn &nbsp;&nbsp;[2] &lt;noncharacter-4FFFE&gt;..&lt;noncharacter-4FFFF&gt;<BR>

5FFFE..5FFFF &nbsp;; Noncharacter_Code_Point # Cn &nbsp;&nbsp;[2] &lt;noncharacter-5FFFE&gt;..&lt;noncharacter-5FFFF&gt;<BR>

6FFFE..6FFFF &nbsp;; Noncharacter_Code_Point # Cn &nbsp;&nbsp;[2] &lt;noncharacter-6FFFE&gt;..&lt;noncharacter-6FFFF&gt;<BR>

7FFFE..7FFFF &nbsp;; Noncharacter_Code_Point # Cn &nbsp;&nbsp;[2] &lt;noncharacter-7FFFE&gt;..&lt;noncharacter-7FFFF&gt;<BR>

8FFFE..8FFFF &nbsp;; Noncharacter_Code_Point # Cn &nbsp;&nbsp;[2] &lt;noncharacter-8FFFE&gt;..&lt;noncharacter-8FFFF&gt;<BR>

9FFFE..9FFFF &nbsp;; Noncharacter_Code_Point # Cn &nbsp;&nbsp;[2] &lt;noncharacter-9FFFE&gt;..&lt;noncharacter-9FFFF&gt;<BR>

AFFFE..AFFFF &nbsp;; Noncharacter_Code_Point # Cn &nbsp;&nbsp;[2] &lt;noncharacter-AFFFE&gt;..&lt;noncharacter-AFFFF&gt;<BR>

BFFFE..BFFFF &nbsp;; Noncharacter_Code_Point # Cn &nbsp;&nbsp;[2] &lt;noncharacter-BFFFE&gt;..&lt;noncharacter-BFFFF&gt;<BR>

CFFFE..CFFFF &nbsp;; Noncharacter_Code_Point # Cn &nbsp;&nbsp;[2] &lt;noncharacter-CFFFE&gt;..&lt;noncharacter-CFFFF&gt;<BR>

DFFFE..DFFFF &nbsp;; Noncharacter_Code_Point # Cn &nbsp;&nbsp;[2] &lt;noncharacter-DFFFE&gt;..&lt;noncharacter-DFFFF&gt;<BR>

EFFFE..EFFFF &nbsp;; Noncharacter_Code_Point # Cn &nbsp;&nbsp;[2] &lt;noncharacter-EFFFE&gt;..&lt;noncharacter-EFFFF&gt;<BR>

FFFFE..FFFFF &nbsp;; Noncharacter_Code_Point # Cn &nbsp;&nbsp;[2] &lt;noncharacter-FFFFE&gt;..&lt;noncharacter-FFFFF&gt;<BR>

10FFFE..10FFFF; Noncharacter_Code_Point # Cn &nbsp;&nbsp;[2] &lt;noncharacter-10FFFE&gt;..&lt;noncharacter-10FFFF&gt;<BR>
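<BR>

For anyone who wants to read these entries programmatically, here is a minimal sketch in Python, assuming only the &quot;first..last ; Property # comment&quot; line layout shown above.<BR>

The names SAMPLE, LINE_RE and parse_ranges are my own and hypothetical; the sample holds just two of the lines, with the comment tails shortened.<BR>

```python
import re

# Two of the PropList.txt lines quoted above (entities decoded, comments shortened).
SAMPLE = """\
FDD0..FDEF    ; Noncharacter_Code_Point # Cn  [32]
10FFFE..10FFFF; Noncharacter_Code_Point # Cn   [2]
"""

# "first[..last] ; Property_Name" with hexadecimal code points.
LINE_RE = re.compile(r"^([0-9A-F]+)(?:\.\.([0-9A-F]+))?\s*;\s*(\w+)")


def parse_ranges(text, prop="Noncharacter_Code_Point"):
    """Return (first, last) code point pairs for lines carrying `prop`."""
    ranges = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not line:
            continue
        m = LINE_RE.match(line)
        if m and m.group(3) == prop:
            first = int(m.group(1), 16)
            last = int(m.group(2) or m.group(1), 16)  # single code point: last == first
            ranges.append((first, last))
    return ranges


ranges = parse_ranges(SAMPLE)
total = sum(last - first + 1 for first, last in ranges)
```

Run over the full list rather than the two-line sample, the same sum should come to 66 (32 + 17 * 2).<BR>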

<BR>

All of the *FFFE and *FFFF ones have been &quot;permanently reserved&quot; (a.k.a. non-character code points)<BR>

since the initial synchronisation with ISO/IEC 10646 (1993). I cannot recall the exact reason, but<BR>

for FFFE it had (and still has) to do with the byte-order mark and its representation in UCS-2/UTF-16<BR>

(FFFE is what the byte-order mark FEFF looks like when read with the wrong byte order).<BR>

The FDD0..FDEF ones were reserved later, because more non-character code points were wanted for<BR>

internal processing.<BR>
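<BR>

To make the original question concrete, the whole set can be tested with a few lines of Python. This is only a sketch: the function name is_noncharacter is hypothetical, and it assumes the set is exactly the PropList.txt list (FDD0..FDEF plus the last two code points of each of the 17 planes).<BR>

```python
def is_noncharacter(cp: int) -> bool:
    """True if cp is one of the 66 Noncharacter_Code_Point values."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError(f"not a Unicode code point: {cp:#x}")
    # The last two code points of every plane have their low 16 bits
    # equal to FFFE or FFFF, so masking out bit 0 leaves FFFE.
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


# Enumerate the full code space once to count the matches.
noncharacters = [cp for cp in range(0x110000) if is_noncharacter(cp)]
```

With this predicate 10FFFE and 10FFFF both come out as non-characters, while 10FFFD does not (it is an ordinary private-use code point in plane 16).<BR>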

<BR>

<a href="http://tools.ietf.org/id/draft-ietf-idnabis-tables-07.txt">http://tools.ietf.org/id/draft-ietf-idnabis-tables-07.txt</a> covers all of the above non-characters<BR>

as DISALLOWED except for U+10FFFF, which somehow seems to have been left out. Note, though,<BR>

that the title of Appendix B does include U+10FFFF...<BR>

<BR>

&nbsp;&nbsp;&nbsp;&nbsp;/kent k<BR>

</SPAN></FONT>

</BODY>

</HTML>