Public Review Issue #181: Changing General Category of Twelve Characters

Patrik Fältström paf at cisco.com
Tue Apr 5 21:14:51 CEST 2011


Ahh....thanks! Did not (as you saw) think about this at all.

   Patrik

On 5 apr 2011, at 21.00, Ken Whistler wrote:

> Patrik,
> 
> I see that Mark responded on this thread, but didn't actually answer the question.
> 
> For IDNA 2008 purposes, the relevant point to look at is rule B ("Unstable",
> Section 2.2) of RFC 5892, not rule A.
> 
> All twelve of these characters are superscript or subscript characters which have
> compatibility decompositions to single letters. Because of this, they are all
> "unstable" by the criterion in Section B. As a result they are all DISALLOWED
> in IDNA 2008 (of whatever vintage) and will stay that way, because of the
> Unicode normalization stability guarantees.
> 
> Changing their General Category values from gc=Ll to gc=Lm has no impact
> whatsoever on the bottom line of whether these twelve characters are
> allowed in IDNs. (They aren't.)
> 
> --Ken
> 
> On 4/4/2011 7:43 AM, Mark Davis ☕ wrote:
>> That was one of the considerations in the discussion; the effect on identifiers (IDNA and others).
>> 
>> Mark
> 
>> 2011/4/4 Patrik Fältström <patrik at frobbit.se>
>> 
>>    I would also like to get a firm response from Unicode people,
>>    BUT, from a quick look at the change, the only part I can see
>>    that has anything to do with IDNA2008 is gc=Ll changing to gc=Lm.
>> 
>>    And as rule A of IDNA2008 is the following:
>> 
>>    A: General_Category(cp) is in {Ll, Lu, Lo, Nd, Lm, Mn, Mc}
>> 
>>    ...i.e. both Ll and Lm are accepted, this change should NOT have
>>    any impact on IDNA2008.
>> 
>>    So I am not as worried as I was when I first saw that Gc was
>>    proposed to be changed for twelve(!) characters!!!
>> 
>>      Patrik
>> 
> 
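
The two rules discussed in this thread can be sketched in a few lines of
Python. This is only an illustration, not a full IDNA2008 derivation: the
function names are made up for this example, rule B is written as
toNFKC(toCaseFold(toNFKC(cp))) != cp following RFC 5892, and U+2071
(SUPERSCRIPT LATIN SMALL LETTER I) is used as a sample superscript letter
that may or may not be one of the twelve characters in question.

```python
import unicodedata

# Rule A (LetterDigits), as quoted by Patrik in the thread.
LETTER_DIGITS = {"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"}


def passes_rule_a(gc: str) -> bool:
    """Hypothetical helper: is this General_Category value accepted by rule A?"""
    return gc in LETTER_DIGITS


def is_unstable(cp: str) -> bool:
    """Hypothetical helper for rule B: cp changes under NFKC(CaseFold(NFKC(cp)))."""
    nfkc = unicodedata.normalize("NFKC", cp)
    # str.casefold() applies Unicode full case folding.
    return unicodedata.normalize("NFKC", nfkc.casefold()) != cp


# Illustrative sample only; not confirmed by the thread as one of the twelve.
sample = "\u2071"

# Both the old and the new General Category satisfy rule A ...
print(passes_rule_a("Ll"), passes_rule_a("Lm"))
# ... but the compatibility decomposition makes the character unstable.
print(is_unstable(sample))
```

Under these assumptions the outcome does not depend on which of Ll or Lm the
character carries: rule A accepts both values, and rule B flags the character
because of its compatibility decomposition, which is the point Ken makes above.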


