Dot-mapping
Harald Tveit Alvestrand
harald at alvestrand.no
Tue Dec 11 20:01:00 CET 2007
--On 11 December 2007 16:56 +0900 fujiwara at jprs.co.jp wrote:
> And more, the candidate dot-like characters are already listed
> in Unicode 5.0 standard. ( grep "FULL STOP" UnicodeData.txt )
>
> They all are marked as "NEVER" in draft-faltstrom-idnabis-tables-03.txt.
> There is no collision/conflict.
>
> 002E; # FULL STOP
> 0589; # ARMENIAN FULL STOP
> 06D4; # ARABIC FULL STOP
> 0701; # SYRIAC SUPRALINEAR FULL STOP
> 0702; # SYRIAC SUBLINEAR FULL STOP
> 1362; # ETHIOPIC FULL STOP
> 166E; # CANADIAN SYLLABICS FULL STOP
> 1803; # MONGOLIAN FULL STOP
> 1809; # MONGOLIAN MANCHU FULL STOP
> 2CF9; # COPTIC OLD NUBIAN FULL STOP
> 2CFE; # COPTIC FULL STOP
> 3002; # IDEOGRAPHIC FULL STOP
> FE12; # PRESENTATION FORM FOR VERTICAL IDEOGRAPHIC FULL STOP
> FE52; # SMALL FULL STOP
> FF0E; # FULLWIDTH FULL STOP
> FF61; # HALFWIDTH IDEOGRAPHIC FULL STOP
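For anyone without a copy of UnicodeData.txt at hand, the grep above can be approximated with Python's standard unicodedata module. This is only a sketch: the helper name full_stop_characters is made up for illustration, and the result reflects whatever Unicode version the local Python ships, which is normally newer than 5.0 and may list more characters than the quoted output.

```python
import sys
import unicodedata


def full_stop_characters():
    """Return (code point, name) pairs whose character name contains FULL STOP.

    Hypothetical helper; it scans every code point known to the local
    Python's Unicode database, not the Unicode 5.0 data file itself.
    """
    found = []
    for cp in range(sys.maxunicode + 1):
        # Unnamed or unassigned code points yield the empty default.
        name = unicodedata.name(chr(cp), "")
        if "FULL STOP" in name:
            found.append((cp, name))
    return found
```

Printing each pair as f"{cp:04X}; # {name}" gives lines in the same shape as the quoted list. Because the match is on the substring, it also picks up names such as DIGIT ONE FULL STOP.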
Of course there's also DIGIT ONE FULL STOP and friends... but those are
compatibility characters, so a user interface that does mapping will
presumably remove them before they meet the NEVER barrier of IDNAbis.
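To illustrate what "remove them" could look like, here is a minimal sketch that assumes the mapping step is plain NFKC normalization. That is an assumption made for the example, not something any IDNAbis draft is being quoted on, and the function name map_compat is invented.

```python
import unicodedata


def map_compat(text):
    """Apply NFKC, which replaces compatibility characters with their
    decompositions. Stand-in for a UI mapping step (assumption)."""
    return unicodedata.normalize("NFKC", text)


# U+2488 DIGIT ONE FULL STOP has the compatibility decomposition 0031 002E.
digit_one_full_stop = "\u2488"
mapped = map_compat(digit_one_full_stop)

# Some of the dot candidates are compatibility characters too.
fullwidth_mapped = map_compat("\uFF0E")   # FULLWIDTH FULL STOP
halfwidth_mapped = map_compat("\uFF61")   # HALFWIDTH IDEOGRAPHIC FULL STOP
```

Under this assumption, mapped is the two-character string "1.", fullwidth_mapped is U+002E, and halfwidth_mapped is U+3002, so the compatibility forms are gone before any table lookup happens.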
Of more confusing interest are things like 22C5 DOT OPERATOR or 30FB
KATAKANA MIDDLE DOT, or the already famous 00B7 MIDDLE DOT (which the
Catalans say they need INSIDE the labels).
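Part of what makes these three awkward is that, unlike the compatibility characters, none of them has a decomposition, so a normalization-based mapping leaves them alone. A small check, again assuming NFKC as the mapping and using an invented helper name survives_nfkc:

```python
import unicodedata

# The three dot-like characters mentioned above.
DOT_LIKE = ["\u22C5", "\u30FB", "\u00B7"]


def survives_nfkc(ch):
    """True if NFKC normalization leaves the character unchanged."""
    return unicodedata.normalize("NFKC", ch) == ch


# Maps each character's name to whether it passes through NFKC untouched.
report = {unicodedata.name(ch): survives_nfkc(ch) for ch in DOT_LIKE}
```

All three entries in report come out True, so whatever is decided about them has to be decided explicitly, by the tables or by a mapping rule of its own.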
Harald