Dot-mapping

Harald Tveit Alvestrand harald at alvestrand.no
Tue Dec 11 20:01:00 CET 2007



--On 11 December 2007 16:56 +0900 fujiwara at jprs.co.jp wrote:

> And more, the candidate dot-like characters are already listed
> in Unicode 5.0 standard.  ( grep "FULL STOP" UnicodeData.txt )
>
> They all are marked as "NEVER" in draft-faltstrom-idnabis-tables-03.txt.
> There is no collision/conflict.
>
> 002E; # FULL STOP
> 0589; # ARMENIAN FULL STOP
> 06D4; # ARABIC FULL STOP
> 0701; # SYRIAC SUPRALINEAR FULL STOP
> 0702; # SYRIAC SUBLINEAR FULL STOP
> 1362; # ETHIOPIC FULL STOP
> 166E; # CANADIAN SYLLABICS FULL STOP
> 1803; # MONGOLIAN FULL STOP
> 1809; # MONGOLIAN MANCHU FULL STOP
> 2CF9; # COPTIC OLD NUBIAN FULL STOP
> 2CFE; # COPTIC FULL STOP
> 3002; # IDEOGRAPHIC FULL STOP
> FE12; # PRESENTATION FORM FOR VERTICAL IDEOGRAPHIC FULL STOP
> FE52; # SMALL FULL STOP
> FF0E; # FULLWIDTH FULL STOP
> FF61; # HALFWIDTH IDEOGRAPHIC FULL STOP

Of course there's also DIGIT ONE FULL STOP and friends... but those are 
compatibility characters, so a user interface that does mapping will 
presumably remove them before they meet the NEVER barrier of IDNAbis.
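
As a rough illustration of that mapping step, here is a minimal Python sketch. It assumes the user interface uses NFKC normalization as its mapping, which is only one possible choice; the helper name map_compat is made up for this example and does not come from any IDNAbis draft. Results depend on the Unicode version bundled with the Python runtime.

```python
import unicodedata


def map_compat(text: str) -> str:
    """Hypothetical UI mapping step: fold compatibility characters with NFKC."""
    return unicodedata.normalize("NFKC", text)


# U+2488 DIGIT ONE FULL STOP plus three compatibility dots from the list above.
samples = ["\u2488", "\uFF0E", "\uFF61", "\uFE52"]

for ch in samples:
    mapped = map_compat(ch)
    mapped_cps = " ".join(f"U+{ord(c):04X}" for c in mapped)
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} -> {mapped_cps}")
```

Under this assumption U+2488 becomes the two characters "1" and ".", while U+FF61 folds to U+3002 IDEOGRAPHIC FULL STOP, which is itself on the list, so a mapping of this kind does not get rid of every dot-like character on its own.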

Of more confusing interest are things like 22C5 DOT OPERATOR or 30FB 
KATAKANA MIDDLE DOT, or the already famous 00B7 MIDDLE DOT (which the 
Catalans say they need INSIDE the labels).
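
One way to see why these are harder to deal with: unlike DIGIT ONE FULL STOP, they have no compatibility decomposition, so a mapping based on NFKC (again an assumption about what a user interface might do, not something the drafts specify) would pass them through untouched. A small Python check, with the illustrative helper name survives_nfkc:

```python
import unicodedata


def survives_nfkc(ch: str) -> bool:
    """Return True if NFKC normalization leaves the character unchanged."""
    return unicodedata.normalize("NFKC", ch) == ch


for cp in (0x00B7, 0x22C5, 0x30FB):
    ch = chr(cp)
    status = "unchanged" if survives_nfkc(ch) else "changed"
    print(f"U+{cp:04X} {unicodedata.name(ch)}: {status} by NFKC")
```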

            Harald






