draft-05: editorial comments (1)

Harald Tveit Alvestrand harald at alvestrand.no
Thu Sep 2 17:04:57 CEST 2004

(apologies for coming in a week late and a dollar short..)

--On Tuesday, August 10, 2004 22:36:29 -0700 Peter Constable
<petercon at microsoft.com> wrote:

> In section 5:
> "The issue of deciding upon the rendering of characters based on the
> language tag is not addressed in this memo; however, if different spans
> of text are not marked with font information, it may be useful to
> provide the ability to mark spans of text with language. For example, a
> rendering engine may use that information in deciding which font to use
> in displaying Han-based ideographs when it encounters mixed
> Japanese-Chinese text that has no attached font information."
> I had two reactions reading this:
> - Even if spans of text are not marked wrt font, it may still be useful
> to mark spans for language (one font may support alternate typographic
> conventions for different languages).
> - I'm somewhat inclined to say this paragraph is out of place -- that
> details of language-specific are not really character set issues and are
> no more needed than details of language-specific word-boundary detection
> or any other tailored processing. I'll accept that wrt CJK
> language-specific rendering has for many years been closely linked with
> charset issues.
> Both of these reactions might have been mitigated if the paragraph were
> organized differently, giving the CJK context right from the outset.

The antecedent of this paragraph was in fact added to RFC 1766 at the 
behest of (I believe) Ran Atkinson, who wanted to point out that charset 
(especially after the Han unification in Unicode) did not contain 
enough information to render text "properly".

That version was:

   The issue of deciding upon the rendering of a character set based on
   the language tag is not addressed in this memo; however, it is
   thought impossible to make such a decision correctly for all cases
   unless means of switching language in the middle of a text are
   defined (for example, a rendering engine that decides font based on
   Japanese or Chinese language will fail to work when a mixed
   Japanese-Chinese text is encountered)

At that time (1995), charset tagging was hotly debated, font tagging was 
not at all common, and language tagging was in its infancy (after all, we 
were just defining the tags).

Language-specific fonts were extremely common, but the concept of a font 
that contained language-specific processing within the font was completely 
foreign to the debaters at the time.

In this day of Unicode and years of experience with unified Han, I think 
what the paragraph was trying to say has become common wisdom; it may no 
longer need to be said.

But I do feel a bit nostalgic about it :-)
