Changing DISALLOWED (was Re: Reserved general punctuation)
Mark Davis
mark.davis at icu-project.org
Thu May 1 04:17:56 CEST 2008
I think the following misrepresents my position:
"It is that area of flexibility with CONTEXT, especially
CONTEXT-OTHER, where my view that "Disallowed" is permanent,
with no path (or a very difficult one) out of that category,
converges with what I understand of Mark's desire to make
migration out of DISALLOWED relatively easy."
I'm not looking to make it easy. I think there are a few possible positions
we could take in IDNAbis.
1. We say that once DISALLOWED, always DISALLOWED.
This is not a firm promise, because an obsoleting RFC could change it, but
it would certainly set a very high bar.
2. We say that characters can only be removed from DISALLOWED by an
obsoleting RFC.
A slightly lower bar. While it could be changed, it would certainly be
difficult.
3. We say that characters can only be removed from DISALLOWED by the
committee/mechanism that controls CONTEXT/exceptions, and only in extremis.
This should, in my view, also be quite difficult; not quite to the same
level as an RFC, but done carefully, with sufficient time for deliberation
and with solid consensus among a broad set of experts.
4. We say that characters can only be removed from DISALLOWED by the
committee/mechanism that controls CONTEXT/exceptions, but that committee
is not designed to be conservative.
This, I think, would be a very bad choice. My presumption has always been
that the committee/mechanism that controls CONTEXT/exceptions should be
extremely conservative in its changes; that changes are only made very
carefully.
I think #3 would be the best, and #2 acceptable, while #1 and #4 are
extremes that could cause problems.
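
As a rough illustration only (nothing here comes from an IDNAbis draft), the
four options can be read as a rule for who may move a character out of
DISALLOWED. The names Option, Actor and may_remove_from_disallowed are
hypothetical, invented for this sketch. A yes/no answer cannot express how
high the bar is, so options 1 and 2 come out the same here; they differ in
the stated intent, not in who can act.

```python
from enum import Enum


class Option(Enum):
    """The four candidate positions, numbered as in the list above."""
    ALWAYS_DISALLOWED = 1            # no removal path inside the protocol
    OBSOLETING_RFC_ONLY = 2          # removal only via an obsoleting RFC
    COMMITTEE_IN_EXTREMIS = 3        # conservative committee, only in extremis
    COMMITTEE_NOT_CONSERVATIVE = 4   # committee with no conservatism required


class Actor(Enum):
    """Who is attempting the change (hypothetical labels)."""
    OBSOLETING_RFC = "obsoleting RFC"
    COMMITTEE = "CONTEXT/exceptions committee"


def may_remove_from_disallowed(option, actor, in_extremis=False):
    """Return True if `actor` may move a character out of DISALLOWED.

    Assumption: an obsoleting RFC can always change the rules, which is
    why option 1 is described as "not a firm promise".
    """
    if actor is Actor.OBSOLETING_RFC:
        return True
    # From here on the actor is the committee.
    if option is Option.COMMITTEE_IN_EXTREMIS:
        return in_extremis
    if option is Option.COMMITTEE_NOT_CONSERVATIVE:
        return True
    # Options 1 and 2 give the committee no removal path at all.
    return False


if __name__ == "__main__":
    for option in Option:
        allowed = may_remove_from_disallowed(
            option, Actor.COMMITTEE, in_extremis=True
        )
        print(option.name, "committee may remove in extremis:", allowed)
```

The in_extremis flag stands in for the judgment call described under option 3;
in practice that judgment is the hard part and is not something code decides.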
Mark
On Wed, Apr 30, 2008 at 6:02 AM, John C Klensin <klensin at jck.com> wrote:
>
>
> --On Wednesday, 30 April, 2008 04:16 -0700 Vint Cerf
> <vint at google.com> wrote:
>
> > My naïve assumption is that anything unassigned has the
> > potential to become assigned so we need to have a state in
> > which the code point is not allowed for current use but could
> > be permitted at a later time. Do we have the semantics to
> > accommodate that? V
>
> Short answer: No. I presume that is why we are having this
> discussion.
>
> Longer answer:
>
> While we have concluded that the problems it would cause
> outweigh the advantages, these areas of uncertainty are a large
> part of what motivated having MAYBE categories.
>
> I think that putting anything into UNASSIGNED that isn't
> actually unassigned (i.e., given no code point assignment in the
> then-current version of Unicode) is looking for trouble. As you
> point out, such code points have the potential to become
> assigned. While one might make some educated guesses from the
> block context in which the code point is located, we can't
> predict, with 100% certainty, the properties that a code point
> will have if and when it is assigned in the future.
>
> So, for a code point that is actually assigned, I think we have
> only three choices:
>
> * Allow it, as Protocol-Valid. For general punctuation
> this is, I hope obviously, not a good idea.
>
> * Disallow it and assume that, if we discover we need it
> enough later, we will do whatever drastic revisions or
> disaster corrections are required. Of course, that sets
> a very high bar to ever allowing those characters, but
> that may not be unreasonable.
>
> * Assign it to "context required" but do not assign a
> rule. Under the current proposed model, that means
> that it can neither be registered nor looked up. On the
> other hand, we could, in the future, allow it in the
> cases where it is actually required by assigning an
> appropriate rule and then waiting for software to be
> upgraded (something that would presumably happen more
> quickly in places where the character is important than
> in places where it isn't).
>
> It is that area of flexibility with CONTEXT, especially
> CONTEXT-OTHER, where my view that "Disallowed" is permanent,
> with no path (or a very difficult one) out of that category,
> converges with what I understand of Mark's desire to make
> migration out of DISALLOWED relatively easy. In the middle
> ground, we try to identify the characters about which we may be
> uncertain and identify them as CONTEXTO with no expectation of
> assigning rules unless it turns out that they are really needed.
> That approach assumes that we can anticipate characters that
> _might_ need to be moved, i.e., characters about which we are
> not certain that DISALLOWED is globally correct. I think that
> is probably correct. Indeed, I believe that, if it is not
> correct, this entire approach is built on a house of cards and
> we may need to drop it.
>
> And, FWIW, the argument for putting Cf into CONTEXTO precisely
> follows the reasoning above -- these odd and sometimes-invisible
> cases (see U+2060, 2062..2064; WORD JOINER, INVISIBLE TIMES/
> SEPARATOR/ PLUS) are precisely the sorts of thing that someone
> might, conceivably, argue passionately are required in some IDN
> contexts. If I correctly understand the use of these
> characters, my own view is that I would argue strongly about
> permitting them. But I think it would be better to have that
> argument on the basis of substantive requirement to have the
> characters in IDNs versus risks and complexity and not on the
> basis of an artifact of how we had defined things.
>
> john
>
>
> _______________________________________________
> Idna-update mailing list
> Idna-update at alvestrand.no
> http://www.alvestrand.no/mailman/listinfo/idna-update
>