Solving the UTF-8 problem
Stephane Bortzmeyer
bortzmeyer at nic.fr
Mon Jul 2 15:55:33 CEST 2007
On Sun, Jul 01, 2007 at 03:58:48PM -0700,
Doug Ewell <dewell at roadrunner.com> wrote
a message of 161 lines which said:
> Another possibility is to have IANA post an official version of the
> Registry in one encoding, such as UTF-8, and additional, unofficial
> versions in other encodings, such as Latin-1 or hex NCRs.
Why not? Currently, we do exactly the opposite: IANA publishes the
official registry with hex NCRs
(http://www.iana.org/assignments/language-subtag-registry) and
langtag.net publishes an unofficial version in UTF-8
(http://www.langtag.net/registries/language-subtag-registry.utf8).
> Potential problems with this approach are unintentional mismatches
> between the versions (I caught one of these problems for the ISO
> 639-3 people recently)
I do not get it. If the unofficial version is produced by a program,
how can a mismatch exist (unless there is a bug in the program)?
And if the unofficial version is done by hand, should we tell ISO
639-3 that computers are better than people for boring and repetitive
tasks?
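
To illustrate the point, here is a minimal sketch of such a program,
assuming the official file contains only hexadecimal NCRs of the form
&#xHHHH; with uppercase hex digits. The function names and the sample
line are mine, chosen for illustration; this is not the actual tool
used by IANA or langtag.net.

```python
import re

# Matches hexadecimal numeric character references such as &#xE5;
HEX_NCR = re.compile(r"&#x([0-9A-Fa-f]+);")


def ncr_to_text(s: str) -> str:
    """Replace every hex NCR with the character it denotes."""
    return HEX_NCR.sub(lambda m: chr(int(m.group(1), 16)), s)


def text_to_ncr(s: str) -> str:
    """Replace every non-ASCII character with an uppercase hex NCR."""
    return "".join(
        c if ord(c) < 128 else "&#x{:X};".format(ord(c)) for c in s
    )


# Illustrative sample line (assumption, not copied from the registry).
official = "Description: Norwegian Bokm&#xE5;l"

# Derive the UTF-8 version mechanically from the official one.
utf8_version = ncr_to_text(official).encode("utf-8")

# Round-trip check: converting back must give the official text again,
# so any mismatch would reveal a bug in the program.
assert text_to_ncr(utf8_version.decode("utf-8")) == official
```

The round-trip assertion is the interesting part: if it is run on
every line each time the registry is regenerated, the two versions
cannot silently diverge. It only holds if the official file spells its
NCRs the same way text_to_ncr() does (uppercase hex, no leading
zeroes); otherwise one would compare the decoded texts instead.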