Public Review Issue #181: Changing General Category of Twelve Characters
Ken Whistler
kenw at sybase.com
Tue Apr 5 21:00:35 CEST 2011
Patrik,
I see that Mark responded on this thread, but didn't actually answer the
question.

For IDNA 2008 purposes, the relevant point to look at is Section B of
RFC 5892, not Section A.

All twelve of these characters are superscript or subscript characters
which have compatibility decompositions to single letters. Because of
this, they are all "unstable" by the criterion in Section B. As a result
they are all DISALLOWED in IDNA 2008 (of whatever vintage) and will stay
that way, because of the Unicode normalization stability guarantees.

Changing their General Category values from gc=Ll to gc=Lm has no impact
whatsoever on the bottom line of whether these twelve characters are
allowed in IDNs. (They aren't.)
--Ken
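
The distinction between the two rules can be sketched in a few lines of Python. This is only an illustration, not a full IDNA 2008 derivation: it assumes Python's `unicodedata` module and `str.casefold()` as stand-ins for the toNFKC and toCaseFold operations used by RFC 5892, the function names are made up for this example, and U+2071 and U+207F are picked as two sample superscript letters (the thread itself does not list the twelve characters). The General Category reported depends on the Unicode version bundled with the Python build.

```python
import unicodedata

# Rule A (LetterDigits) from RFC 5892, as quoted in the thread.
RULE_A_CATEGORIES = {"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"}


def in_letter_digits(cp: str) -> bool:
    """Rule A: General_Category(cp) is in {Ll, Lu, Lo, Nd, Lm, Mn, Mc}."""
    return unicodedata.category(cp) in RULE_A_CATEGORIES


def is_unstable(cp: str) -> bool:
    """Rule B (Unstable): toNFKC(toCaseFold(toNFKC(cp))) != cp."""
    folded = unicodedata.normalize("NFKC", cp).casefold()
    return unicodedata.normalize("NFKC", folded) != cp


# Sample characters only (an assumption for illustration):
# SUPERSCRIPT LATIN SMALL LETTER I and SUPERSCRIPT LATIN SMALL LETTER N.
SAMPLES = ["\u2071", "\u207F"]

for ch in SAMPLES:
    print(
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        "gc=" + unicodedata.category(ch),
        "ruleA=" + str(in_letter_digits(ch)),
        "unstable=" + str(is_unstable(ch)),
    )
```

Both samples satisfy rule A whether they are gc=Ll or gc=Lm, and both are unstable under rule B because NFKC maps them to a plain letter. Since the derivation in RFC 5892 applies the Unstable test before the LetterDigits test, the rule B result is what decides their status.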
On 4/4/2011 7:43 AM, Mark Davis ☕ wrote:
> That was one of the considerations in the discussion; the effect on
> identifiers (IDNA and others).
>
> Mark
> 2011/4/4 Patrik Fältström <patrik at frobbit.se>
>
> I also would like to get a firm response from Unicode people as
> well, BUT, by just quickly looking at the change, I can only see
> the change gc=Ll to gc=Lm being something that has to do with IDNA2008.
>
> And as rule A of IDNA2008 is the following:
>
> A: General_Category(cp) is in {Ll, Lu, Lo, Nd, Lm, Mn, Mc}
>
> ...i.e. both Ll and Lm are accepted, this change should NOT have
> any impact on IDNA2008.
>
> So I am not as worried as I was when I first saw that Gc was
> proposed to be changed for twelve(!) characters!!!
>
> Patrik
>