Cf?

Erik van der Poel erikv at google.com
Tue Mar 18 17:24:30 CET 2008


On Tue, Mar 18, 2008 at 8:58 AM, Paul Hoffman <phoffman at imc.org> wrote:
> At 8:33 AM -0700 3/18/08, Erik van der Poel wrote:
>  >Instead of trying to make a decision for each Cf character, putting
>  >all of them in CONTEXTO (and two in CONTEXTJ) seems like it would
>  >leave the door open to those characters (since we can write contextual
>  >rules for them later).
>
>  That's an option. I'm still hesitant to put anything in to CONTEXT*
>  until we see really what will happen to those characters. So far, the
>  protocol punts on that with a TBD on the regexp.
>
>
>  >Making them DISALLOWED now makes it harder to allow them later. No?
>
>  Yes, but so does making a CONTEXT* mapping that we want to change
>  later. Any change to a character other than taking it out of
>  UNASSIGNED, exactly once, is an equivalent instability.
>
>
>  >Anyway, I don't really care about the rest of Cf. It seems like there
>  >is a more pressing need to make a decision about U+200C.
>
>  I see three sequential decisions here:
>  - Does anything in {Cf} need saving?

U+200C seems to be needed.

>  - If so, what is the full list?

U+200C. If anybody can make a case for U+200D, go for it.

>  - What is the context for each item in the list?

You mean, what *are* the contexts for each item. Unfortunately, U+200C
is anything but simple.

It might also be a good idea to discuss how we would attempt to
transition. On the HTML side (I know, I know, I shouldn't focus on
HTML, but...), it would be better to stick to the old mapping for
U+200C (i.e. map it "to nothing", in other words delete it). Content providers would
then use U+200C only inside A-labels (encoded in Punycode). Browsers
like MSIE would then have to be updated to allow such A-labels, both
for resolution and for display.
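
To make the two behaviours concrete, here is a minimal Python sketch. It
assumes that Python's built-in "idna" codec (which follows the older
IDNA2003/Nameprep rules) is a fair stand-in for the "old mapping". The helper
name a_label_keeping_zwnj is hypothetical: it just applies the raw "punycode"
codec and prepends "xn--", with none of the validity checks a real
implementation would need.

```python
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER, general category Cf


def old_mapping(label: str) -> bytes:
    """IDNA2003-style ToASCII via the built-in codec.

    Nameprep maps U+200C "to nothing", so it is deleted before encoding.
    """
    return label.encode("idna")


def a_label_keeping_zwnj(label: str) -> bytes:
    """Hypothetical helper: Punycode-encode the label as-is, keeping U+200C.

    No Nameprep, no length checks, no contextual rules; illustration only.
    """
    return b"xn--" + label.encode("punycode")


label = "a" + ZWNJ + "b"

deleted = old_mapping(label)          # U+200C is silently dropped
kept = a_label_keeping_zwnj(label)    # U+200C survives inside the A-label

# Decoding the A-label (minus its prefix) recovers the original label,
# including the U+200C.
restored = kept[4:].decode("punycode")
```

Under these assumptions, the same visible string yields two different names
on the wire depending on which path it takes, which is why the transition
needs care.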

The browser's address bar is much harder. If the user types (or
pastes) U+200C, what should the browser do? Try deleting it first, and
if that doesn't resolve, try it with U+200C? Shudder.
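
For what it's worth, the "try deleting first, then try with U+200C" idea could
be sketched roughly as below. This is only an illustration of the fallback
order, not a proposal: the function names are hypothetical, the resolves
callback stands in for whatever lookup the browser actually performs, and the
U+200C-preserving form is built with the raw "punycode" codec without any
validity checks.

```python
from typing import Callable, List, Optional

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER


def lookup_candidates(label: str) -> List[bytes]:
    """Return candidate ASCII labels in the order they would be tried."""
    # First candidate: old behaviour, U+200C deleted before encoding.
    stripped = label.replace(ZWNJ, "")
    candidates = [stripped.encode("idna")]
    # Second candidate (only if U+200C was present): keep it in an A-label.
    if ZWNJ in label:
        candidates.append(b"xn--" + label.encode("punycode"))
    return candidates


def resolve_with_fallback(
    label: str, resolves: Callable[[bytes], bool]
) -> Optional[bytes]:
    """Try each candidate in turn; return the first one that resolves."""
    for name in lookup_candidates(label):
        if resolves(name):
            return name
    return None
```

Even in this toy form the problem is visible: one typed string can map to two
different names, and which one the user reaches depends on what happens to be
registered.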

Erik

