Converting non-ASCII to ASCII
Doug Ewell
dewell at roadrunner.com
Mon Jun 25 02:27:38 CEST 2007
CE Whitehead <cewcathar at hotmail dot com> wrote:
> Here are the remaining subtags (from
> http://www.iana.org/assignments/language-subtag-registry)--
> where there are escape sequences in the description
It's really not necessary to go through this exercise on the list. Most of
us can search a text file for the character '&' and determine that the
ASCII equivalent for "o-with-circumflex" is "plain o".
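For anyone who would rather script that search than do it by hand, here is a
minimal Python sketch. It assumes the escapes all have the hex form "&#x...;"
and that dropping combining marks is an acceptable ASCII fallback. The
function names decode_hex_ncrs and ascii_fallback are made up for
illustration; they are not part of the Registry or of any tool.

```python
import re
import unicodedata

# Matches hex numeric character references such as "&#xE7;".
HEX_NCR = re.compile(r"&#x([0-9A-Fa-f]+);")


def decode_hex_ncrs(text):
    """Replace each hex NCR with the character it stands for."""
    return HEX_NCR.sub(lambda m: chr(int(m.group(1), 16)), text)


def ascii_fallback(text):
    """Decode hex NCRs, then strip diacritics to get an ASCII-only string.

    Assumption: removing combining marks after canonical decomposition
    is good enough. Characters with no decomposition come out as '?'.
    """
    decomposed = unicodedata.normalize("NFD", decode_hex_ncrs(text))
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.encode("ascii", "replace").decode("ascii")


if __name__ == "__main__":
    sample = "Proven&#xE7;al"
    print(decode_hex_ncrs(sample))
    print(ascii_fallback(sample))
```

This only handles letters that decompose into a base letter plus a combining
mark. Anything else falls through to '?', so it is not a general
transliteration scheme.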
A less trivial question is how we are going to resolve the ongoing issue of
"preserve non-ASCII characters" versus "hex NCRs are not human-friendly"
versus "UTF-8 doesn't make it through some systems without being corrupted."
There is a discussion being held in the LTRU Working Group, for at least the
third time, over changing the format of the Registry to UTF-8. That is the
proper place to hold that discussion, not this list.
The question of providing pure-ASCII transliterations for every string in
the Registry that includes a hex NCR, even something like Provençal, is
an operational detail, and does belong on this list IMHO. My opinion is
that we need to be able to represent non-ASCII in the Registry by some
means, either hex NCRs or UTF-8 or something, and that it's not necessary or
feasible to come up with a pure-ASCII transliteration for everything. But
again, this is the place to discuss additions and changes to the Registry
contents, and LTRU is the place to discuss changing the rules.
--
Doug Ewell * Fullerton, California, USA * RFC 4645 * UTN #14
http://users.adelphia.net/~dewell/
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages