IDN_Never and IDN_Always

Mark Davis mark.davis at icu-project.org
Sat Dec 22 02:43:53 CET 2007


I also want to note that some of the historic casing changes that Ken
mentioned were precisely to allow for compatibility. The technical committee
did a thorough review of all the characters that had uppercase forms but not
lowercase forms, to see which might need a lowercase form added in the
future. Those were then added, so that casing could be stabilized. The
consortium and ISO/IEC SC2/WG2 committed to adding complete case pairs where
necessary in the future to preserve that stability.

Mark

On Dec 21, 2007 5:36 PM, Kenneth Whistler <kenw at sybase.com> wrote:

> Patrik,
>
> To perhaps allay some of the concerns you raised on the 16th
> about property stability issues, I have run the table
> derivation suggested in the contribution "Table Derivation"
> posted earlier today by Mark and myself, against the
> public data files posted for all versions of Unicode
> from Unicode 3.2 to Unicode 5.0 (and the beta data files
> currently public for the as-yet-unreleased Unicode 5.1),
> and pushed up the set of resulting derived property definition
> files to my public directory on the Unicode server:
>
> http://www.unicode.org/~whistler/idna/IDN_Always-3.2.0.txt
> http://www.unicode.org/~whistler/idna/IDN_Always-4.0.0.txt
> http://www.unicode.org/~whistler/idna/IDN_Always-4.0.1.txt
> http://www.unicode.org/~whistler/idna/IDN_Always-4.1.0.txt
> http://www.unicode.org/~whistler/idna/IDN_Always-5.0.0.txt
> http://www.unicode.org/~whistler/idna/IDN_Always-5.1.0.txt
>
> http://www.unicode.org/~whistler/idna/IDN_Never-3.2.0.txt
> http://www.unicode.org/~whistler/idna/IDN_Never-4.0.0.txt
> http://www.unicode.org/~whistler/idna/IDN_Never-4.0.1.txt
> http://www.unicode.org/~whistler/idna/IDN_Never-4.1.0.txt
> http://www.unicode.org/~whistler/idna/IDN_Never-5.0.0.txt
> http://www.unicode.org/~whistler/idna/IDN_Never-5.1.0.txt
>
> Those are all diffable plain text files, in the same general
> format as used for other Unicode property files.
>
> If you examine them carefully, you will discover that they
> exhibit the very kind of cross-version stability that you
> and others on the idna-update list have been concerned
> about, namely:
>
>  1. Once a character is classed as NEVER, it does not
>     change that status in any later version of Unicode.
>
>  2. Once a character is classed as ALWAYS, it does not
>     change that status in any later version of Unicode.
>
> To obtain this result for updating from Unicode 5.0
> to Unicode 5.1, nothing at all is required in the way
> of adjusting what Mark and I suggested for the table
> derivation.
>
> To make this work *retroactively* from Unicode 5.0 all
> the way back to Unicode 3.2, it is necessary to add a
> small number of characters to the exception_NEVER_list
> that Mark and I discussed, because of a few casing changes
> that happened as of Unicode 4.0 and a few other stray
> category changes. That list is very small and affects only
> unimportant characters. I'd be glad to share
> the exact details if you are interested.
>
> The point, however, is that by starting with a carefully
> planned table definition from the beginning, not only
> is it possible to maintain complete backwards compatibility
> for the NEVER and ALWAYS classes for IDN from
> Unicode 5.0 and moving forward -- one can even set things
> up so as to guarantee retroactive backwards compatibility
> for those classes all the way back to Unicode 3.2.
>
> While we may obviously still want to discuss the details
> and may differ in our opinions about the exact list of
> scripts that belong in the Historic_Scripts category
> or the exact small list of characters that should
> (like MIDDLE DOT) be in the exception_ALWAYS_list, and
> so forth, I think that the above posted files should serve
> as an existence proof that it is possible to define the
> derivation of this table in such a way as to guarantee
> backwards compatibility of the property used by IDNA between
> versions of Unicode.
>
> Regards,
>
> --Ken
>
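A minimal sketch of how the two stability properties quoted above could be checked mechanically. It assumes the posted files use the usual Unicode property-file line shape (a code point or a `XXXX..YYYY` range, a semicolon, the property name, and an optional `#` comment); that shape is inferred from the remark that the files follow the general format of other Unicode property files, not confirmed against the files themselves. The function names `parse_property_file` and `stability_violations` are hypothetical and are not part of the posted table derivation.

```python
import re

# Assumed line shape, borrowed from other Unicode property files:
#   0061..007A    ; IDN_Always # comment
#   00B7          ; IDN_Always
_LINE = re.compile(r"^([0-9A-Fa-f]{4,6})(?:\.\.([0-9A-Fa-f]{4,6}))?\s*;")


def parse_property_file(text):
    """Return the set of code points listed in one property file's text."""
    code_points = set()
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        match = _LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed shape
        first = int(match.group(1), 16)
        last = int(match.group(2) or match.group(1), 16)
        code_points.update(range(first, last + 1))
    return code_points


def stability_violations(versions):
    """Compare each version with the next one in release order.

    versions: list of (label, set_of_code_points), oldest first.
    Returns a list of (older_label, newer_label, lost_code_points) for
    every adjacent pair where the newer set dropped a code point.
    An empty list means no code point ever left the class.
    """
    problems = []
    for (old_label, old_set), (new_label, new_set) in zip(versions, versions[1:]):
        lost = old_set - new_set
        if lost:
            problems.append((old_label, new_label, sorted(lost)))
    return problems
```

Running `stability_violations` once over the parsed IDN_Always files and once over the parsed IDN_Never files, each in version order, would be expected to return an empty list in both cases if the two properties hold. Checking only adjacent versions is enough, because the subset relation is transitive.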


