Converting non-ASCII to ASCII

Doug Ewell dewell at roadrunner.com
Tue Jun 26 16:31:15 CEST 2007


CE Whitehead <cewcathar at hotmail dot com> wrote:

> I feel that an ideal registry would include the escape sequences (for 
> people whose browsers do not display unicode; some of the browsers I use 
> cannot be set to display unicode; sorry to say), the transliterations, and 
> the utf-8 characters (some of which would appear as rectangles and 
> question marks in my browser), but ideally could be formatted in utf-8.

I would strongly oppose putting both UTF-8 *and* hex NCRs in the Registry. 
Hex NCRs are what you use when you can't use UTF-8.  (I do mix &nbsp; with 
real UTF-8 in my Web pages, but only so I can see the no-break spaces in the 
editor and visually distinguish them from ordinary spaces.)
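
To make the equivalence concrete, here is a rough Python sketch of going 
back and forth between hex NCRs and the real characters.  The function names 
are made up for illustration and are not part of any Registry tooling; 
html.unescape is the standard-library call, and note that it also decodes 
named references such as &nbsp; and &amp;.

```python
import html


def ncr_to_text(s):
    """Decode character references such as &#xE9; into real characters."""
    return html.unescape(s)


def text_to_hex_ncr(s):
    """Replace each non-ASCII character with a hex NCR; ASCII is left alone."""
    return "".join(c if ord(c) < 128 else "&#x{:X};".format(ord(c)) for c in s)


if __name__ == "__main__":
    decoded = ncr_to_text("l'acad&#xE9;mie fran&#xE7;oise")
    print(decoded)                   # the same text with real characters
    print(text_to_hex_ncr(decoded))  # back to the ASCII-only form
```

Either form carries the same information, which is why storing both looks 
redundant.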

> So should I or should I not submit a change to my comments field (which is 
> already mixed: eme is an ascii transliteration, while l'acad&#xE9;mie 
> fran&#xE7;oise makes use of escape sequences)?

If it is decided to add a transliteration for this and other "obvious" 
non-ASCII text, there should be no need for the original submitter to send a 
revised form.  As I said before, any of us can successfully remove an acute 
accent from an "e".  Any changes would be subject to the 1-week review 
period anyway.  I don't think there are any difficult cases left, such as 
Resat's alternative names for "baku1926".
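
For the "obvious" cases, the accent removal is mechanical.  A minimal sketch 
in Python, assuming that decomposing the text and dropping the combining 
marks is an acceptable transliteration (the helper name strip_marks is 
hypothetical).  It does nothing for letters that have no decomposition, and 
it is not a substitute for review in the difficult cases.

```python
import unicodedata


def strip_marks(s):
    """Decompose to NFD, then drop combining marks (accents, cedillas)."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


if __name__ == "__main__":
    print(strip_marks("l'acad\u00e9mie fran\u00e7oise"))  # l'academie francoise
```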

> Also be-tarask was all done as ascii transliterations; we argued about 
> which transliteration to use (in this case there was some discussion); do 
> we now need to include the utf-8 characters/escape sequences for it???

It's already a transliteration.  I think we are making this way too 
complicated.

> I sort of like having the hex escapes too because you can look those up to 
> see the characters in pdf in the unicode character charts even when your 
> browser settings cannot be changed so as to display all the characters.  I 
> don't know how much use a person could make of these though in terms of 
> using them to search or whatever.

This is 2007.  It shouldn't be hard to find a UTF-8-enabled editor or file 
viewer for just about any platform.

--
Doug Ewell  *  Fullerton, California, USA  *  RFC 4645  *  UTN #14
http://users.adelphia.net/~dewell/
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages


