Disallowing code points

Kenneth Whistler kenw at sybase.com
Sat Jul 18 01:40:15 CEST 2009


Gervase noted (re the intersection of Mozilla Character
Blocklist with IDNA 2008 PVALID or CONTEXTO characters):

> http://macchiato.com/idna/idna-info.html
> I get the below results. Headlines: five are PVALID 
> (\u01C3\u02D0\u0337\u0338\u3033) and one is CONTEXTO (\u05F4).
> 
> PVALID:
> \u01C3 LATIN LETTER RETROFLEX CLICK (exclamation mark)
> \u02D0 MODIFIER LETTER TRIANGULAR COLON (colon)
> \u0337 COMBINING SHORT SOLIDUS OVERLAY (slash)
> \u0338 COMBINING LONG SOLIDUS OVERLAY (slash)
> \u3033 VERTICAL KANA REPEAT MARK UPPER HALF (slash)

Mark responded:

"My take is that all of these are legitimate characters, 
and should just be PVALID."

I agree about the first four -- and note particularly that
U+0337 and U+0338 are combining characters.

I disagree about U+3033, which is the subject of a separate
thread, and which should just be made DISALLOWED (along
with the other vertical kana repeat marks) in the exceptions
list. That would remove one character from the
intersection with the Mozilla blocklist.
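
A minimal sketch of that effect, assuming Python's standard unicodedata
module. The name-prefix match used here to pick out "the other vertical
kana repeat marks" is an illustrative assumption, not the wording of any
exceptions list, and the variable and function names are hypothetical.

```python
import unicodedata

# The five blocklisted code points reported as PVALID in the quoted list.
blocklisted_pvalid = [0x01C3, 0x02D0, 0x0337, 0x0338, 0x3033]


def describe(cp):
    """Return (U+XXXX label, name, general category, combining class)."""
    ch = chr(cp)
    return (
        f"U+{cp:04X}",
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.combining(ch),
    )


# Assumption: treat every character in the CJK Symbols and Punctuation
# block whose name starts with "VERTICAL KANA REPEAT" as one of the
# vertical kana repeat marks to be made DISALLOWED.
vertical_kana_repeat = {
    cp
    for cp in range(0x3000, 0x3040)
    if unicodedata.name(chr(cp), "").startswith("VERTICAL KANA REPEAT")
}

# What is left of the intersection once those marks are DISALLOWED.
remaining = [cp for cp in blocklisted_pvalid if cp not in vertical_kana_repeat]

for cp in blocklisted_pvalid:
    print(describe(cp), "stays" if cp in remaining else "dropped")
```

A nonzero combining class in the output for U+0337 and U+0338 reflects
the point above that those two are combining characters.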

--Ken

> 
> CONTEXTO:
> \u05F4 HEBREW PUNCTUATION GERSHAYIM (double quotes)


