standard, stable and unambiguous references to Unicode

Mark Davis mark.davis at icu-project.org
Mon Apr 7 23:18:01 CEST 2008


Unicode 5.1 has been released. I tried to forward the announcement to this
list, but it got bounced for some mysterious reason (probably the Norwegian
Homeland Security Department ;-)

Anyway, the announcement is at
http://www.unicode.org/press/pr-5.1.html (hope mere mention of it
doesn't bounce this message).

Mark

On Sat, Feb 16, 2008 at 3:05 PM, Patrik Fältström <patrik at frobbit.se> wrote:

>
> On 15 feb 2008, at 03.45, Mark Davis wrote:
>
> > Patrik,
> >
> > Here are the references you can use. Some of the web pages won't be live
> > yet; they will be by the end of March. They are permanent links once
> > they go live.
> >
>
> Thanks. Can you let me know when these are live?
>
>   Patrik
>
>
> >
> >  - toNFC and toNFKC (and isNFC, isNFKC) are defined in *Section 2
> >  Notation* of *Unicode Standard Annex #15: Unicode Normalization
> >  Forms* by Mark Davis and Martin Dürst, an integral part of The Unicode
> >  Standard, Version 5.1.0.
> >  (http://www.unicode.org/reports/tr15/tr15-29.html)
> >  - toCaseFold is defined in *Section 3.13 Default Case Algorithms* of
> >  The Unicode Standard, Version 5.1.0.
> >
> > The reference for Unicode 5.1.0 is:
> >
> >  - The Unicode Consortium. The Unicode Standard, Version 5.1.0, defined
> >  by: *The Unicode Standard, Version 5.0* (Boston, MA, Addison-Wesley,
> >  2007. ISBN 0-321-48091-0)
> >  (http://www.unicode.org/versions/Unicode5.0.0/), as amended by *Unicode
> >  5.1.0* (http://www.unicode.org/versions/Unicode5.1.0/).
> >
> > Note: We've been planning for 5.1 anyway (release in March), and it is
> > important for references, since it has clarifying text for toCaseFold
> > and a number of other areas that should be referenced.
> >
> > Mark
> >
> > On Sat, Feb 9, 2008 at 3:49 AM, Patrik Fältström <patrik at frobbit.se>
> > wrote:
> >
> > > > All good comments, Erik. Mark, I need to hear from you on the Unicode
> > > view on this. I have no problems changing according to what Erik
> > > suggests, as long as I get the "correct" names from you.
> > >
> > >  Patrik
> > >
> > > On 9 feb 2008, at 03.32, Erik van der Poel wrote:
> > >
> > > > > Patrik and Mark,
> > > >
> > > > I'm reading tables-04 now. I noticed a few things that could be
> > > > improved, in terms of standard, stable and unambiguous references to
> > > > Unicode. This is important since IDNA200X is supposed to evolve with
> > > > Unicode. We need to be able to generate the pvalid/disallowed/etc
> > > > table every time Unicode releases a new version. So here are a few
> > > > suggestions and questions:
> > > >
> > > > > Standard. IDNA200X should use the standard names of Unicode
> > > > > properties and processes, and Unicode should try not to change
> > > > > those names. For example, tables-04 refers to NFKC(...) while
> > > > > Unicode calls that toNFKC(...):
> > > >
> > > > http://www.unicode.org/reports/tr15/#Notation
> > > >
> > > > There is another function called isNFKC(...), so it would be nice to
> > > > get the right one (toNFKC).
> > > >
> > > > Stable. IDNA200X should use stable references to Unicode documents,
> > > > and Unicode should make sure those references keep working. For
> > > > example, the normalization spec mentioned above could be referenced
> > > > using the stable URI:
> > > >
> > > > http://www.unicode.org/reports/tr15/
> > > >
> > > > > Unambiguous. IDNA200X should use unambiguous names, and Unicode
> > > > > should offer them. For example, tables-04 refers to casefold(...).
> > > > > Unicode has something called Case_Folding(c) that only applies to
> > > > > single characters:
> > > >
> > > > http://www.unicode.org/versions/Unicode5.0.0/ch03.pdf
> > > >
> > > > > Unicode also has something called toCasefolding(x) for strings of
> > > > > characters on page 125 of the above chapter, labelled R4. However,
> > > > > the paragraph above that says that there is a simple and a full
> > > > > variant of that. IDNA200X needs the string function (not the single
> > > > > character function) in the "NFKC(casefold(NFKC(cp))) != cp"
> > > > > construct. I believe IDNA200X also needs the full variant, not the
> > > > > simple variant. But Unicode does not appear to have an unambiguous
> > > > > name for the full variant of the function that works on strings.
> > > > > (Or, if R4 *is* the full variant, the paragraph above it needs
> > > > > tweaking.) In the meantime, IDNA200X can disambiguate it by
> > > > > explicitly saying that toCasefolding(...) refers to the full
> > > > > variant.
> > > >
> > > > Yes, this is just nit-picking, but at least we have gotten to the
> > > > point where we're just tweaking the IDNA200X drafts! We're nearly
> > > > done. :-)
> > > >
> > > > Erik
> > > > _______________________________________________
> > > > Idna-update mailing list
> > > > Idna-update at alvestrand.no
> > > > http://www.alvestrand.no/mailman/listinfo/idna-update
> > > >
> >
> > --
> > Mark
> >
>
>


-- 
Mark
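
For readers following the thread, here is a minimal sketch of the
"NFKC(casefold(NFKC(cp))) != cp" construct discussed in the quoted
messages, together with the toNFKC / isNFKC distinction. It is only an
illustration, not the tables-04 algorithm itself. The function names
to_nfkc, is_nfkc, to_casefold and is_unstable are made up for this
example. It assumes Python's unicodedata module and str.casefold(), which
use whatever Unicode version ships with the interpreter (not necessarily
5.1.0), and it assumes str.casefold() as a stand-in for the full
string-level case folding of Section 3.13.

```python
import unicodedata

def to_nfkc(s: str) -> str:
    """toNFKC: return the NFKC-normalized form of a string."""
    return unicodedata.normalize("NFKC", s)

def is_nfkc(s: str) -> bool:
    """isNFKC: test whether a string is already in NFKC (a predicate,
    not a transformation)."""
    return to_nfkc(s) == s

def to_casefold(s: str) -> str:
    """String-level case folding. Assumption: str.casefold() is used here
    as the full variant (for example, U+00DF folds to "ss")."""
    return s.casefold()

def is_unstable(cp: int) -> bool:
    """Hypothetical helper: True when NFKC(casefold(NFKC(cp))) != cp."""
    s = chr(cp)
    return to_nfkc(to_casefold(to_nfkc(s))) != s

if __name__ == "__main__":
    # Report which Unicode version this interpreter's tables come from.
    print("Unicode data version:", unicodedata.unidata_version)
    for cp in (0x0041, 0x0061, 0x00DF, 0xFB01):
        print(f"U+{cp:04X} unstable: {is_unstable(cp)}")
```

U+00DF shows why the full versus simple distinction matters: full case
folding maps it to the two-character string "ss", so the construct flags
it, whereas it has no simple (single-character) folding and a simple
variant would leave it unchanged.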