[precis] Category changes with Unicode 6.3

Patrik Fältström paf at frobbit.se
Wed Oct 16 09:16:08 CEST 2013


Yes, I had to do it anyway, as I am the expert reviewer of the IANA tables. IANA created new versions of the tables the other day, and I have just approved them because they match my own calculations. That is, we have two completely independent implementations of IDNA2008 (one at IANA and one of my own; see link below), and I compare the output of the two. If the output matches, we can be fairly confident that both are correct.
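
The cross-check described above can be sketched roughly as follows. This is only an illustration, not the tooling actually used by either implementation: it assumes each table is available as lines in the allcodepoints.txt layout visible in the quoted grep output (code point; derived property; further fields), and the function names are hypothetical.

```python
def load_derived_properties(lines):
    """Map code point (int) -> derived property from 'CP;PROPERTY;...' lines.

    Assumed layout: the one seen in the allcodepoints.txt grep output.
    """
    table = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(";")
        table[int(fields[0], 16)] = fields[1]
    return table


def compare_tables(table_a, table_b):
    """Return sorted (code point, property_a, property_b) for every mismatch."""
    mismatches = []
    for cp in sorted(set(table_a) | set(table_b)):
        a = table_a.get(cp)  # None if the code point is missing from table_a
        b = table_b.get(cp)
        if a != b:
            mismatches.append((cp, a, b))
    return mismatches


# Sample rows copied from the grep output quoted in this thread.
rows_62 = [
    "061C;UNASSIGNED;I;J;<reserved>",
    "180E;DISALLOWED;I;C;MONGOLIAN VOWEL SEPARATOR",
    "1A1B;PVALID;I;A;BUGINESE VOWEL SIGN AE",
]
rows_63 = [
    "061C;DISALLOWED;I;C;ARABIC LETTER MARK",
    "180E;DISALLOWED;I;C;MONGOLIAN VOWEL SEPARATOR",
    "1A1B;PVALID;I;A;BUGINESE VOWEL SIGN AE",
]

v62 = load_derived_properties(rows_62)
v63 = load_derived_properties(rows_63)

# Two implementations agree when the mismatch list is empty. The same rows
# stand in for both implementations here, since only one output is quoted.
print(compare_tables(v63, v63))

# The same function also reports what changed between two Unicode versions.
for cp, old, new in compare_tables(v62, v63):
    print(f"{cp:04X}: {old} -> {new}")
```

An empty list means the two tables agree on every code point they cover; any entry in the list is a code point that needs manual review.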

   Patrik

On 16 Oct 2013, at 10:04, "Martin J. Dürst" <duerst at it.aoyama.ac.jp> wrote:

> Hello Patrik,
> 
> Many thanks for checking. Great to see that everything is okay.
> 
> Regards,   Martin.
> 
> On 2013/10/16 14:40, Patrik Fältström wrote:
>> Executive summary: Does not impact IDNA2008.
>> 
>> Longer explanation:
>> 
>> On 16 Oct 2013, at 05:34, Martin J. Dürst <duerst at it.aoyama.ac.jp> wrote:
>> 
>>> Excuse me if this has been checked and/or discussed already, but I just downloaded the Unicode 6.3 version (officially published a few days ago) of http://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt and found several changes in character classification:
>>> 
>>> OLD
>>> 180E;MONGOLIAN VOWEL SEPARATOR;Zs;0;WS;;;;;N;;;;;
>>> NEW
>>> 180E;MONGOLIAN VOWEL SEPARATOR;Cf;0;BN;;;;;N;;;;;
>> 
>> No change:
>> 
>> $ grep '^180E;' ../6.[23].0/allcodepoints.txt
>> ../6.2.0/allcodepoints.txt:180E;DISALLOWED;I;C;MONGOLIAN VOWEL SEPARATOR
>> ../6.3.0/allcodepoints.txt:180E;DISALLOWED;I;C;MONGOLIAN VOWEL SEPARATOR
>> 
>>> OLD
>>> 1A1B;BUGINESE VOWEL SIGN AE;Mc;0;L;;;;;N;;;;;
>>> NEW
>>> 1A1B;BUGINESE VOWEL SIGN AE;Mn;0;NSM;;;;;N;;;;;
>> 
>> No change:
>> 
>> $ grep '^1A1B;' ../6.[23].0/allcodepoints.txt
>> ../6.2.0/allcodepoints.txt:1A1B;PVALID;I;A;BUGINESE VOWEL SIGN AE
>> ../6.3.0/allcodepoints.txt:1A1B;PVALID;I;A;BUGINESE VOWEL SIGN AE
>> 
>>> OLD
>>> 2308;LEFT CEILING;Sm;0;ON;;;;;Y;;;;;
>>> 2309;RIGHT CEILING;Sm;0;ON;;;;;Y;;;;;
>>> 230A;LEFT FLOOR;Sm;0;ON;;;;;Y;;;;;
>>> 230B;RIGHT FLOOR;Sm;0;ON;;;;;Y;;;;;
>>> NEW
>>> 2308;LEFT CEILING;Ps;0;ON;;;;;Y;;;;;
>>> 2309;RIGHT CEILING;Pe;0;ON;;;;;Y;;;;;
>>> 230A;LEFT FLOOR;Ps;0;ON;;;;;Y;;;;;
>>> 230B;RIGHT FLOOR;Pe;0;ON;;;;;Y;;;;;
>> 
>> No change:
>> 
>> $ egrep '^230[89AB];' ../6.[23].0/allcodepoints.txt
>> ../6.2.0/allcodepoints.txt:2308;DISALLOWED;I;;LEFT CEILING
>> ../6.2.0/allcodepoints.txt:2309;DISALLOWED;I;;RIGHT CEILING
>> ../6.2.0/allcodepoints.txt:230A;DISALLOWED;I;;LEFT FLOOR
>> ../6.2.0/allcodepoints.txt:230B;DISALLOWED;I;;RIGHT FLOOR
>> ../6.3.0/allcodepoints.txt:2308;DISALLOWED;I;;LEFT CEILING
>> ../6.3.0/allcodepoints.txt:2309;DISALLOWED;I;;RIGHT CEILING
>> ../6.3.0/allcodepoints.txt:230A;DISALLOWED;I;;LEFT FLOOR
>> ../6.3.0/allcodepoints.txt:230B;DISALLOWED;I;;RIGHT FLOOR
>> 
>>> Can somebody check whether and how they affect IDNA 2008 and/or precis?
>>> 
>>> Again, if that has already been done, sorry for the noise.
>>> 
>>> Regards,   Martin.
>>> 
>>> 
>>> P.S.:
>>> All the other changes in UnicodeData.txt:
>>> 
>>> Change in numerical value only:
>>> 
>>> OLD
>>> 12456;CUNEIFORM NUMERIC SIGN NIGIDAMIN;Nl;0;L;;;;-1;N;;;;;
>>> 12457;CUNEIFORM NUMERIC SIGN NIGIDAESH;Nl;0;L;;;;-1;N;;;;;
>>> NEW
>>> 12456;CUNEIFORM NUMERIC SIGN NIGIDAMIN;Nl;0;L;;;;2;N;;;;;
>>> 12457;CUNEIFORM NUMERIC SIGN NIGIDAESH;Nl;0;L;;;;3;N;;;;;
>> 
>> $ egrep '^1245[67];' ../6.[23].0/allcodepoints.txt
>> ../6.2.0/allcodepoints.txt:12456;DISALLOWED;I;;CUNEIFORM NUMERIC SIGN NIGIDAMIN
>> ../6.2.0/allcodepoints.txt:12457;DISALLOWED;I;;CUNEIFORM NUMERIC SIGN NIGIDAESH
>> ../6.3.0/allcodepoints.txt:12456;DISALLOWED;I;;CUNEIFORM NUMERIC SIGN NIGIDAMIN
>> ../6.3.0/allcodepoints.txt:12457;DISALLOWED;I;;CUNEIFORM NUMERIC SIGN NIGIDAESH
>> 
>>> New characters (my understanding is that these are taken care of automatically):
>>> 
>>> 061C;ARABIC LETTER MARK;Cf;0;AL;;;;;N;;;;;
>> 
>> $ egrep '^061C;' ../6.[23].0/allcodepoints.txt
>> ../6.2.0/allcodepoints.txt:061C;UNASSIGNED;I;J;<reserved>
>> ../6.3.0/allcodepoints.txt:061C;DISALLOWED;I;C;ARABIC LETTER MARK
>> 
>>> 2066;LEFT-TO-RIGHT ISOLATE;Cf;0;LRI;;;;;N;;;;;
>>> 2067;RIGHT-TO-LEFT ISOLATE;Cf;0;RLI;;;;;N;;;;;
>>> 2068;FIRST STRONG ISOLATE;Cf;0;FSI;;;;;N;;;;;
>>> 2069;POP DIRECTIONAL ISOLATE;Cf;0;PDI;;;;;N;;;;;
>> 
>> $ egrep '^206[6789];' ../6.[23].0/allcodepoints.txt
>> ../6.2.0/allcodepoints.txt:2066;UNASSIGNED;I;CJ;<reserved>
>> ../6.2.0/allcodepoints.txt:2067;UNASSIGNED;I;CJ;<reserved>
>> ../6.2.0/allcodepoints.txt:2068;UNASSIGNED;I;CJ;<reserved>
>> ../6.2.0/allcodepoints.txt:2069;UNASSIGNED;I;CJ;<reserved>
>> ../6.3.0/allcodepoints.txt:2066;DISALLOWED;I;C;LEFT-TO-RIGHT ISOLATE
>> ../6.3.0/allcodepoints.txt:2067;DISALLOWED;I;C;RIGHT-TO-LEFT ISOLATE
>> ../6.3.0/allcodepoints.txt:2068;DISALLOWED;I;C;FIRST STRONG ISOLATE
>> ../6.3.0/allcodepoints.txt:2069;DISALLOWED;I;C;POP DIRECTIONAL ISOLATE
>> 
>>    Patrik
>> 
> _______________________________________________
> Idna-update mailing list
> Idna-update at alvestrand.no
> http://www.alvestrand.no/mailman/listinfo/idna-update
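
The comparison of UnicodeData.txt versions that opened this thread can be approximated with a short script. The following is a minimal sketch under stated assumptions, not the method either correspondent used: it relies only on the documented field order of UnicodeData.txt (field 0 is the code point, field 1 the name, field 2 the General_Category, field 4 the Bidi_Class), the function names are hypothetical, and range entries in the file are not treated specially.

```python
def parse_unicode_data(lines):
    """Map code point -> (name, general category, bidi class).

    Simplification: 'First>'/'Last>' range entries are stored like any
    other row and are not expanded.
    """
    result = {}
    for line in lines:
        fields = line.strip().split(";")
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        result[int(fields[0], 16)] = (fields[1], fields[2], fields[4])
    return result


def classification_changes(old, new):
    """List code points present in both versions whose category or bidi class differs."""
    changes = []
    for cp in sorted(old.keys() & new.keys()):
        name, old_gc, old_bidi = old[cp]
        _, new_gc, new_bidi = new[cp]
        if (old_gc, old_bidi) != (new_gc, new_bidi):
            changes.append((cp, name, old_gc, new_gc, old_bidi, new_bidi))
    return changes


# Rows copied from the OLD/NEW listings quoted in this thread.
old_rows = [
    "180E;MONGOLIAN VOWEL SEPARATOR;Zs;0;WS;;;;;N;;;;;",
    "2308;LEFT CEILING;Sm;0;ON;;;;;Y;;;;;",
    "12456;CUNEIFORM NUMERIC SIGN NIGIDAMIN;Nl;0;L;;;;-1;N;;;;;",
]
new_rows = [
    "180E;MONGOLIAN VOWEL SEPARATOR;Cf;0;BN;;;;;N;;;;;",
    "2308;LEFT CEILING;Ps;0;ON;;;;;Y;;;;;",
    "12456;CUNEIFORM NUMERIC SIGN NIGIDAMIN;Nl;0;L;;;;2;N;;;;;",
]

changes = classification_changes(
    parse_unicode_data(old_rows), parse_unicode_data(new_rows)
)
for cp, name, old_gc, new_gc, old_bidi, new_bidi in changes:
    print(f"{cp:04X} {name}: gc {old_gc} -> {new_gc}, bidi {old_bidi} -> {new_bidi}")
```

The cuneiform row is not reported because only its numeric value changed, which this sketch deliberately ignores.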
