Defs: General

John C Klensin klensin at jck.com
Wed Nov 26 23:03:09 CET 2008



--On Wednesday, 19 November, 2008 18:07 -0800 Mark Davis
<mark at macchiato.com> wrote:

> Defs
> Other than A-Label and U-Label issue already covered:
> ------------------------------
> 
> A code point is an integer value associated with a character in
> a coded character set.
> 
> Unicode [Unicode51] is a coded character set containing about
> 100,000 characters as of the current version.
> 
> =>
> 
> A code point is an integer value in the codespace of a coded
> character set. In Unicode, these are integers from 0 to
> 0x10FFFF.
> 
> Unicode [Unicode51] is a coded character set with about
> 100,000 characters assigned to code points as of version 5.1.
> 
> 
> Rationale. Code points may not be associated with characters.
> In Unicode, for example, the vast majority are not, since they
> have not yet been assigned a character. Also added the note
> about the range in Unicode, since its code points are the most
> referenced in these documents.

I have made this change since it was simple, appeared to be, at
worst, harmless, and no one has raised any objection in the
week since the suggestion was posted (yes, I'm using
approximately that criterion in applying proposed edits in the
current round, but with the understanding that my criteria for
"simple" and "harmless" are fairly restrictive).

However, note, first, that the first of the sentences being
changed was copied unchanged from RFC 3490 and the second one
represents a minor update in the spirit of moving from Unicode
3.2 to 5.x.  Those sentences presumably had community review and
consensus.  While I believe that this sort of change improves
the documents slightly, it is not at all clear to me that the
increased value or clarity is sufficient to justify the
additional delay and discussion.  Second, I observe that
defining "code point" in terms of the even less well-known
"codespace" (not a word in my favorite dictionary and with
different enough meanings in, e.g., Wikipedia to be ambiguous)
borders on the tautological.
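
For readers who find the code point / assigned character distinction
easier to see in code, here is a minimal Python sketch.  It is an
illustration only, not text from any of the drafts.  The names
MAX_CODE_POINT, is_code_point, and is_assigned are hypothetical, and
the sketch assumes that general category "Cn" is an acceptable stand-in
for "no character assigned" (that category also covers noncharacters).
The result depends on the Unicode version bundled with the Python
interpreter, which will usually be later than 5.1.

```python
import unicodedata

# Upper end of the Unicode codespace, per the proposed definition above.
MAX_CODE_POINT = 0x10FFFF


def is_code_point(value: int) -> bool:
    """Return True if value lies in the Unicode codespace (0 to 0x10FFFF)."""
    return 0 <= value <= MAX_CODE_POINT


def is_assigned(value: int) -> bool:
    """Return True if the code point has a general category other than Cn.

    Assumption for this sketch: category Cn means "no character assigned".
    The answer reflects the Unicode data shipped with this Python build.
    """
    if not is_code_point(value):
        raise ValueError(f"{value:#x} is outside the Unicode codespace")
    return unicodedata.category(chr(value)) != "Cn"


if __name__ == "__main__":
    for cp in (0x41, 0x10FFFF):
        print(f"U+{cp:04X}: code point={is_code_point(cp)}, "
              f"assigned={is_assigned(cp)}")
```

Every value from 0 to 0x10FFFF passes is_code_point, but only some of
them pass is_assigned, which is the point of the rationale quoted above.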


> ------------------------------
> 
> these specifications leave the problem of transcoding between
> the
> 
> 
> [action: someplace define the term "transcoding", or better
> yet, just use the term "converting"]

Done

    john


