Editorial questions

Mark Davis ☕ mark at macchiato.com
Sun Nov 22 19:09:33 CET 2009


http://unicode.org/Public/UNIDATA/UnicodeData.txt is only part of the
Unicode data; if you only look at that file, you get a very incomplete view
of the Unicode data and properties. It doesn't list reserved code points, and
large ranges appear only as First/Last pairs, so for example you see:

F0000;<Plane 15 Private Use, First>;Co;0;L;;;;;N;;;;;
FFFFD;<Plane 15 Private Use, Last>;Co;0;L;;;;;N;;;;;
100000;<Plane 16 Private Use, First>;Co;0;L;;;;;N;;;;;
10FFFD;<Plane 16 Private Use, Last>;Co;0;L;;;;;N;;;;;

You don't see the code points U+FFFFE and U+FFFFF either. The fact that
Unicode goes from 0..10FFFF is in the core specification (see
http://www.unicode.org/versions/Unicode5.2.0/). UTS #44 Unicode Character
Database (http://www.unicode.org/reports/tr44/) should also be referenced,
since it explains the data formats.
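
To make the point concrete, here is a minimal Python sketch (not part of any
draft). It pairs the First/Last lines quoted above into ranges and shows that
the last two code points of a plane fall outside them. The helper names
(parse_ranges, is_noncharacter, listed) are hypothetical; the noncharacter
rule used is the core-specification one: U+FDD0..U+FDEF plus the last two
code points of every plane.

```python
# Sample lines copied from UnicodeData.txt, as quoted in this message.
SAMPLE = """\
F0000;<Plane 15 Private Use, First>;Co;0;L;;;;;N;;;;;
FFFFD;<Plane 15 Private Use, Last>;Co;0;L;;;;;N;;;;;
100000;<Plane 16 Private Use, First>;Co;0;L;;;;;N;;;;;
10FFFD;<Plane 16 Private Use, Last>;Co;0;L;;;;;N;;;;;
"""

MAX_CODE_POINT = 0x10FFFF  # upper bound of the code space (core specification)


def parse_ranges(text):
    """Return (first, last, general_category) tuples, pairing First/Last lines."""
    ranges = []
    pending_first = None
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split(";")
        cp, name, gc = int(fields[0], 16), fields[1], fields[2]
        if name.endswith(", First>"):
            pending_first = cp          # remember the start of the range
        elif name.endswith(", Last>"):
            ranges.append((pending_first, cp, gc))
            pending_first = None
        else:
            ranges.append((cp, cp, gc))  # ordinary single-code-point line
    return ranges


def is_noncharacter(cp):
    """True for U+FDD0..U+FDEF and the last two code points of each plane."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def listed(cp, ranges):
    """True if cp falls inside any range derived from the sample lines."""
    return any(first <= cp <= last for first, last, _ in ranges)


RANGES = parse_ranges(SAMPLE)
```

With these definitions, listed(0x10FFFD, RANGES) is true, while 0x10FFFE and
0x10FFFF (which equals MAX_CODE_POINT) are not listed and are reported as
noncharacters by is_noncharacter.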

As for the titles, I think the more important feature is having either all
or none of them start with "Internationalized Domain Names for Applications
(IDNA):", and having Tables and Bidi in the titles.

Mark


On Fri, Nov 20, 2009 at 13:16, Harald Alvestrand <harald at alvestrand.no> wrote:

> Mark Davis ☕ wrote:
>
>> I noticed in http://tools.ietf.org/id/draft-ietf-idnabis-tables-07.txt the following:
>>
>> E01F0..EFFFD; UNASSIGNED  # <reserved>..<reserved>
>>
>> EFFFE..10FFFE; DISALLOWED # <noncharacter>..<noncharacter>
>> It is missing 10FFFF.
>>
>
> The Unicode data table (UnicodeData.txt) doesn't tell us what 10FFFE and
> 10FFFF are; the last character listed is 10FFFD (Plane 16 Private Use,
> Last).
>
> On the other hand, Blocks.txt says:
>
> unicode/Blocks.txt:100000..10FFFF; Supplementary Private Use Area-B
>
> So is there a distinction between 10FFFD, 10FFFE and 10FFFF, and does this
> document need to make that distinction?
>
> (I think it's unreasonable to allow any of them in domain names, so they
> should all be DISALLOWED, but I'm not surprised that there are
> inconsistencies here.)
>
>
>> The titles of the documents are inconsistent: The first three below have a
>> prefix, and the last three are missing it. Also, the first three use titlecase
>> (Background, Explanation, and Rationale have B, E, and R capitalized), and
>> the next two don't. It might also be nice to have "BIDI" someplace in
>> "Right-to-left scripts for IDNA", and "Tables" somewhere in "The Unicode
>> code points and IDNA"
>>
> The RFC Editor has strong opinions on where to use titlecase. I always get
> changes when I submit RFCs, so I have left that evaluation on his table.
>
>
>
>>    * Internationalized Domain Names for Applications (IDNA):
>>      Definitions and Document Framework -
>>      http://tools.ietf.org/html/draft-ietf-idnabis-defs
>>    * Internationalized Domain Names in Applications (IDNA): Protocol
>>      - http://tools.ietf.org/html/draft-ietf-idnabis-protocol
>>    * Internationalized Domain Names for Applications (IDNA):
>>      Background, Explanation, and Rationale -
>>      http://tools.ietf.org/html/draft-ietf-idnabis-rationale
>>    * Right-to-left scripts for IDNA -
>>      http://tools.ietf.org/html/draft-ietf-idnabis-bidi
>>    * The Unicode code points and IDNA -
>>      http://tools.ietf.org/html/draft-ietf-idnabis-tables
>>    * Mapping Characters in IDNA -
>>      http://tools.ietf.org/html/draft-ietf-idnabis-mappings
>>
>>
>>
>> Mark