standard, stable and unambiguous references to Unicode

Patrik Fältström patrik at frobbit.se
Sat Feb 9 12:49:05 CET 2008


All good comments, Erik. Mark, I need to hear from you on the Unicode
view on this. I have no problem making the changes Erik suggests, as
long as I get the "correct" names from you.

    Patrik

On 9 feb 2008, at 03.32, Erik van der Poel wrote:

> Patrik and Mark,
>
> I'm reading tables-04 now. I noticed a few things that could be
> improved, in terms of standard, stable and unambiguous references to
> Unicode. This is important since IDNA200X is supposed to evolve with
> Unicode. We need to be able to generate the pvalid/disallowed/etc
> table every time Unicode releases a new version. So here are a few
> suggestions and questions:
>
> Standard. IDNA200X should use the standard names of Unicode properties
> and processes, and Unicode should try not to change those names. For
> example, tables-04 refers to NFKC(...) while Unicode calls that
> toNFKC(...):
>
> http://www.unicode.org/reports/tr15/#Notation
>
> There is another function called isNFKC(...), so it would be nice to
> get the right one (toNFKC).
>
> Stable. IDNA200X should use stable references to Unicode documents,
> and Unicode should make sure those references keep working. For
> example, the normalization spec mentioned above could be referenced
> using the stable URI:
>
> http://www.unicode.org/reports/tr15/
>
> Unambiguous. IDNA200X should use unambiguous names, and Unicode should
> offer them. For example, tables-04 refers to casefold(...). Unicode
> has something called Case_Folding(c) that only applies to single
> characters:
>
> http://www.unicode.org/versions/Unicode5.0.0/ch03.pdf
>
> Unicode also has something called toCasefolding(x) for strings of
> characters on page 125 of the above chapter, labelled R4. However, the
> paragraph above that says that there is a simple and a full variant of
> that. IDNA200X needs the string function (not the single character
> function) in the "NFKC(casefold(NFKC(cp))) != cp" construct. I believe
> IDNA200X also needs the full variant, not the simple variant. But
> Unicode does not appear to have an unambiguous name for the full
> variant of the function that works on strings. (Or, if R4 *is* the
> full variant, the paragraph above it needs tweaking.) In the meantime,
> IDNA200X can disambiguate it by explicitly saying that
> toCasefolding(...) refers to the full variant.
>
> Yes, this is just nit-picking, but at least we have gotten to the
> point where we're just tweaking the IDNA200X drafts! We're nearly
> done. :-)
>
> Erik
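
The construct Erik quotes can be sketched in Python to make the naming
distinctions concrete. This is only an illustration, not the tables-04
algorithm: the function names to_nfkc, is_nfkc and
changes_under_nfkc_casefold are hypothetical, and the sketch assumes that
Python's str.casefold() stands in for the full variant of toCasefolding.
The results depend on the Unicode version bundled with the interpreter
(unicodedata.unidata_version).

```python
import unicodedata


def to_nfkc(s: str) -> str:
    """Transform a string to NFKC (the role of toNFKC in UAX #15 notation)."""
    return unicodedata.normalize("NFKC", s)


def is_nfkc(s: str) -> bool:
    """Test whether a string is already in NFKC (the role of isNFKC)."""
    return to_nfkc(s) == s


def changes_under_nfkc_casefold(cp: int) -> bool:
    """Evaluate NFKC(casefold(NFKC(cp))) != cp for a single code point.

    str.casefold() works on strings and may expand one character into
    several, which is why the string function is needed here.
    """
    ch = chr(cp)
    return to_nfkc(to_nfkc(ch).casefold()) != ch


if __name__ == "__main__":
    # Sample code points: A, a, sharp s, the "fi" ligature.
    for cp in (0x0041, 0x0061, 0x00DF, 0xFB01):
        print(f"U+{cp:04X}", changes_under_nfkc_casefold(cp))
```

U+00DF (sharp s) shows why the simple/full distinction matters: a
one-to-one mapping such as str.lower() leaves it unchanged, while the
string-level fold expands it to "ss", so the construct flags it.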
