standard, stable and unambiguous references to Unicode

Mark Davis mark.davis at icu-project.org
Fri Feb 15 03:45:21 CET 2008


Patrik,

Here are the references you can use. Some of the web pages won't be live
yet; they will be by the end of March. Once they go live, they are
permanent links.

   - toNFC and toNFKC (and isNFC, isNFKC) are defined in *Section 2
   Notation* of *Unicode Standard Annex #15: Unicode Normalization
   Forms* by Mark Davis and Martin Dürst, an integral part of The Unicode
   Standard, Version 5.1.0.
   (http://www.unicode.org/reports/tr15/tr15-29.html)
   - toCaseFold is defined in *Section 3.13 Default Case Algorithms* of
   The Unicode Standard, Version 5.1.0.
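
As a rough illustration of how these functions combine in the
"NFKC(casefold(NFKC(cp))) != cp" test quoted below, here is a minimal
Python sketch. The helper names (to_nfkc, is_nfkc, to_casefold,
is_stable) are hypothetical and are not taken from the Unicode Standard
or from the IDNA200X drafts. The sketch assumes that Python's
str.casefold() (full case folding) is an acceptable stand-in for
toCaseFold, and it uses whatever Unicode version the Python build
bundles, not necessarily 5.1.0.

```python
import unicodedata


def to_nfkc(s: str) -> str:
    """Approximates toNFKC: return the NFKC normalization of s."""
    return unicodedata.normalize("NFKC", s)


def is_nfkc(s: str) -> bool:
    """Approximates isNFKC: true when s is already in NFKC form."""
    return to_nfkc(s) == s


def to_casefold(s: str) -> str:
    """Stand-in for toCaseFold, assuming str.casefold() (full folding)."""
    return s.casefold()


def is_stable(cp: str) -> bool:
    """Hypothetical helper: true when NFKC(casefold(NFKC(cp))) == cp."""
    return to_nfkc(to_casefold(to_nfkc(cp))) == cp


if __name__ == "__main__":
    # Report the Unicode version of the data tables actually in use.
    print("Unicode data version:", unicodedata.unidata_version)
    for ch in ["a", "A", "\u00df", "\ufb01"]:
        print(f"U+{ord(ch):04X}", "stable" if is_stable(ch) else "not stable")
```

The string-level functions are used throughout, since the folding of a
single character can produce more than one character (for example,
U+00DF folds to "ss" under full case folding).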

The reference for Unicode 5.1.0 is:

   - The Unicode Consortium. The Unicode Standard, Version 5.1.0, defined
   by: *The Unicode Standard, Version 5.0* (Boston, MA, Addison-Wesley,
   2007. ISBN 0-321-48091-0)
   (http://www.unicode.org/versions/Unicode5.0.0/), as amended by *Unicode
   5.1.0* (http://www.unicode.org/versions/Unicode5.1.0/).

Note: We've been planning for 5.1 anyway (release in March), and it is the
important version to reference, since it has clarifying text for toCaseFold
and for a number of other areas that should be referenced.

Mark

On Sat, Feb 9, 2008 at 3:49 AM, Patrik Fältström <patrik at frobbit.se> wrote:

> All good comments Erik. Mark, I need to hear from you on the Unicode
> view on this. I have no problems changing according to what Erik
> suggests, as long as I get the "correct" names from you.
>
>    Patrik
>
> On 9 feb 2008, at 03.32, Erik van der Poel wrote:
>
> > Patrik and Mark,
> >
> > I'm reading tables-04 now. I noticed a few things that could be
> > improved, in terms of standard, stable and unambiguous references to
> > Unicode. This is important since IDNA200X is supposed to evolve with
> > Unicode. We need to be able to generate the pvalid/disallowed/etc
> > table every time Unicode releases a new version. So here are a few
> > suggestions and questions:
> >
> > Standard. IDNA200X should use the standard names of Unicode properties
> > and processes, and Unicode should try not to change those names. For
> > example, tables-04 refers to NFKC(...) while Unicode calls that
> > toNFKC(...):
> >
> > http://www.unicode.org/reports/tr15/#Notation
> >
> > There is another function called isNFKC(...), so it would be nice to
> > get the right one (toNFKC).
> >
> > Stable. IDNA200X should use stable references to Unicode documents,
> > and Unicode should make sure those references keep working. For
> > example, the normalization spec mentioned above could be referenced
> > using the stable URI:
> >
> > http://www.unicode.org/reports/tr15/
> >
> > Unambiguous. IDNA200X should use unambiguous names, and Unicode should
> > offer them. For example, tables-04 refers to casefold(...). Unicode
> > has something called Case_Folding(c) that only applies to single
> > characters:
> >
> > http://www.unicode.org/versions/Unicode5.0.0/ch03.pdf
> >
> > Unicode also has something called toCasefolding(x) for strings of
> > characters on page 125 of the above chapter, labelled R4. However, the
> > paragraph above that says that there is a simple and a full variant of
> > that. IDNA200X needs the string function (not the single character
> > function) in the "NFKC(casefold(NFKC(cp))) != cp" construct. I believe
> > IDNA200X also needs the full variant, not the simple variant. But
> > Unicode does not appear to have an unambiguous name for the full
> > variant of the function that works on strings. (Or, if R4 *is* the
> > full variant, the paragraph above it needs tweaking.) In the meantime,
> > IDNA200X can disambiguate it by explicitly saying that
> > toCasefolding(...) refers to the full variant.
> >
> > Yes, this is just nit-picking, but at least we have gotten to the
> > point where we're just tweaking the IDNA200X drafts! We're nearly
> > done. :-)
> >
> > Erik
> > _______________________________________________
> > Idna-update mailing list
> > Idna-update at alvestrand.no
> > http://www.alvestrand.no/mailman/listinfo/idna-update
>
>


-- 
Mark