[Ltru] Re: Solving the UTF-8 problem
Chris.Newman at sun.com
Tue Jul 10 08:38:10 CEST 2007
Doug Ewell wrote on 7/2/07 23:00 -0700:
> Stephane Bortzmeyer <bortzmeyer at nic dot fr> wrote:
>>> 3. UTF-8 can't be read on some, especially older, computer systems (Frank
>>> Ellermann, months ago, and CE Whitehead).
>> So, I basically agree that UTF-8 for the registry is better but I do not
>> want to see bold sentences like "Anyone but Frank Ellermann can run a full
>> UTF-8 environment by now". This is not true.
> You're correct. I restated three objections to converting the Registry to
> UTF-8, and tried to show why they don't outweigh the advantages of
> converting. All three are, in fact, true:
> 1. UTF-8 doesn't play well with e-mail.
> 2. Converting will break processors that expect only ASCII.
> 3. Some computers can't display UTF-8.
> But we can work out the e-mail problem, and the breakage to processors is no
> worse than adding new fields (nor are there that many fully-conformant
> processors to be fixed). And the display problem is really not as much of a
> showstopper as it is being portrayed. People are saying that the hex escapes
> are a display problem too, and adding "Arua" and "Aruá (Arua)" to the
> Registry is going to confuse a LOT of people, no matter how many comments we
UTF-8 has been the recommended charset for Internet interchange since RFC 2277.
Our past experience with ASCII encodings of non-ASCII text in the IETF has been
questionable. RFC 2047, 2231, IMAP modified-UTF-7, and quoted-printable have
all had mixed results. Meanwhile, UTF-8 based IETF protocols have been less
problematic from an interoperability viewpoint. The EAI WG is putting together
an experiment to try UTF-8 in email headers and addresses and that will
increase the pressure to update email infrastructure.
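To make the comparison concrete, here is a minimal Python sketch of how the name "Aruá" from the quoted message looks as raw UTF-8 and under some of the ASCII encodings mentioned above. It uses only the standard library. The "&#xHHHH;" escape form is an assumption for illustration, since the thread only says "hex escapes", and the variable names are hypothetical.

```python
import quopri
from email.header import Header, decode_header

name = "Aruá"  # sample name taken from the quoted message

# Raw UTF-8: the single non-ASCII character, U+00E1, becomes two bytes.
utf8_bytes = name.encode("utf-8")

# Hex escapes in the "&#xHHHH;" style (assumed form, for illustration only).
hex_escaped = "".join(c if ord(c) < 128 else f"&#x{ord(c):X};" for c in name)

# Quoted-printable applied to the UTF-8 bytes.
quoted_printable = quopri.encodestring(utf8_bytes).decode("ascii")

# RFC 2047 encoded-word, as used in e-mail header fields.
encoded_word = Header(name, charset="utf-8").encode()

# Decoding the encoded-word recovers the original bytes and charset label.
decoded_bytes, decoded_charset = decode_header(encoded_word)[0]

for label, value in [
    ("UTF-8 bytes", utf8_bytes),
    ("hex escape", hex_escaped),
    ("quoted-printable", quoted_printable),
    ("RFC 2047", encoded_word),
]:
    print(f"{label:18} {value}")
```

Every ASCII form needs a decoder that knows the specific convention before a reader sees "Aruá" again, whereas the UTF-8 bytes are the text itself for any UTF-8 aware tool.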
Rough edges are inevitable during the adoption of new technology, but where do
we want to be 5-10 years from now? What's the least painful path to get there?