Historic scripts as MAYBE?

Mark Davis mark.davis at icu-project.org
Mon Apr 28 04:42:17 CEST 2008

On Sun, Apr 27, 2008 at 7:20 PM, Paul Hoffman <phoffman at imc.org> wrote:

> At 7:04 PM -0700 4/27/08, Mark Davis wrote:
> > If historic scripts are to be excluded, the up-to-date list recommended by
> > the consortium for U5.1 is at
> > http://www.unicode.org/reports/tr31/#Specific_Character_Adjustments
>
> Needing to be picky: where in that section are you talking about? If you
> mean Table 4, I note that the table's description is:

Yes, table 4.

> Some characters are not in modern customary use, and thus implementations
> may want to exclude them from identifiers. These are historic and obsolete
> scripts, scripts used mostly liturgically, and regional scripts used only in
> very small communities or with very limited current usage.
> That's quite different from just being archaic/historic.

True, but remember that there is no bright line with "archaic/historic". If
you mean: "nobody uses it", then no script qualifies!

> > (BTW, I'm strongly against restoring MAYBE, for a number of reasons
> > already discussed.
>
> No one has suggested that we do so.
>
> > Haven't heard back from Patrik as to why, though, we couldn't in
> > exceptional circumstances move characters from DISALLOWED.
>
> I thought his answer to you was complete. I also agree with him.

His responses were not particularly informative, such as:

> This makes it impossible for application developers to filter, and there is
> no way it is possible to control "registration" of DISALLOWED codepoints,
> and the latter is the reason why application developers have to filter out
> DISALLOWED codepoints completely.


It is hardly impossible for application developers to filter -- and anyone
who filters with a hard-coded list is just simply going to miss anything
that becomes valid after that hard-coded list is incorporated into the
program, in any event.
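
To make the point concrete, here is a minimal sketch of such a filter. The
table contents and the names SNAPSHOT_VALID, UPDATED_VALID and label_ok are
made up for illustration; they are not taken from any IDNA draft, and the one
added code point simply stands in for any character that becomes valid after
the list was compiled into the program.

```python
# Hypothetical table of valid code points, frozen when the program shipped:
# ASCII lowercase letters, digits, and hyphen. Not the real IDNA tables.
SNAPSHOT_VALID = (
    frozenset(range(0x61, 0x7B)) | frozenset(range(0x30, 0x3A)) | {0x2D}
)

# The same table after a later revision. U+0525 is only a stand-in for
# "some character that became valid after the snapshot was taken".
UPDATED_VALID = SNAPSHOT_VALID | {0x0525}


def label_ok(label, valid):
    """Return True if every code point in the label is in the given table."""
    return all(ord(ch) in valid for ch in label)


label = "ab\u0525"
print(label_ok(label, SNAPSHOT_VALID))  # checked against the hard-coded list
print(label_ok(label, UPDATED_VALID))   # checked against the revised table
```

A program built against the snapshot keeps rejecting the label no matter
which category the new character was moved from, so the hard-coded filter
goes stale either way.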

My earlier remarks were:

> Stability of invalid labels is actually not in play anyway; as new
> characters get assigned, once-invalid labels become valid, and on a regular
> basis as ever more characters are added to Unicode.

I have gotten no reply back from my message of 5 days ago ("Re: Stability of
valid IDN labels"). Without some concrete user scenarios making a compelling
case, all we have is a bald statement about unnamed "application creators".
That is hardly the way to go about doing a specification.
