Converting non-ASCII to ASCII
dewell at roadrunner.com
Mon Jun 25 02:27:38 CEST 2007
CE Whitehead <cewcathar at hotmail dot com> wrote:
> Here are the remaining subtags (from
> where there are escape sequences in
> the description
It's really not necessary to go through this exercise on the list. Most of
us are capable of searching a text file for the character '&' and able to
determine that the ASCII equivalent for "o-with-circumflex" is "plain o".
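For anyone who would rather do that search mechanically, here is a minimal Python sketch. It assumes the escapes are hex NCRs of the form "&#xNNNN;", and the sample line and the function names (decode_hex_ncrs, ascii_fallback) are illustrative only, not part of any Registry tooling.

```python
import re
import unicodedata

# Matches hexadecimal numeric character references such as "&#xE7;".
HEX_NCR = re.compile(r"&#x([0-9A-Fa-f]+);")


def decode_hex_ncrs(text):
    """Replace each hex NCR with the character it names."""
    return HEX_NCR.sub(lambda m: chr(int(m.group(1), 16)), text)


def ascii_fallback(text):
    """Decompose accented letters (NFD) and drop the combining marks.

    This only helps for letters that have a canonical decomposition;
    anything else is passed through unchanged.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Illustrative input line, not copied from the Registry.
line = "Description: Proven&#xE7;al"
decoded = decode_hex_ncrs(line)
print(decoded)
print(ascii_fallback(decoded))
```

Note that the fallback step is not a general transliteration: letters with no canonical decomposition, such as "o-with-stroke" or the German sharp s, come through unchanged and would still need a hand-picked ASCII spelling.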
A less trivial question is how we are going to resolve the ongoing issue of
"preserve non-ASCII characters" versus "hex NCRs are not human-friendly"
versus "UTF-8 doesn't make it through some systems without being corrupted."
There is a discussion being held in the LTRU Working Group, for at least the
third time, over changing the format of the Registry to UTF-8. That is the
proper place to hold that discussion, not this list.
The question of providing pure-ASCII transliterations for every string in
the Registry that includes a hex NCR, even something like Provençal, is
an operational detail, and does belong on this list IMHO. My opinion is
that we need to be able to represent non-ASCII in the Registry by some
means, either hex NCRs or UTF-8 or something, and that it's not necessary or
feasible to come up with a pure-ASCII transliteration for everything. But
again, this is the place to discuss additions and changes to the Registry
contents, and LTRU is the place to discuss changing the rules.
Doug Ewell * Fullerton, California, USA * RFC 4645 * UTN #14