Language attributes- what are they?

Tex Texin tex at xencraft.com
Sat Jan 1 11:43:48 CET 2005


John,

Well, I admit that the solution of calling the English-sorted Old Norse text an
English document is bizarre and unappealing, but let's also admit the example
is a bit odd. What does it mean to sort Old Norse by English rules when some
of the characters are not used in English? It's even a stretch for Swedish...

But if you don't like the bizarre, the other alternative I proposed is more
reasonable: move to tagging with greater granularity.

Just to be clear, I am not dismissing the example. I can imagine other cases
that might be more common.
I have in mind listing Japanese ideographs for a Chinese audience, requiring
Japanese language tags to ensure the right fonts are used as the text is
rendered, and sorting by Chinese pronunciations of the characters (or some
such).

But regardless of the details, the issue (it seems to me) is this: if a
document is tagged as some language, and sorting within the content is
performed in a way that does not correspond to that language, would that
surprise readers of that language (as opposed to readers of other languages)?
I think the answer is yes, and that sorting is an attribute of language.
(I am finding it hard even to write about sorting text without using
language-based names for the collation or referencing language in some way.)

And if the audience needs a different collation from (one of the ones
associated with) the document language, then it is because their primary
language is different from that of the document, and that doesn't undermine the
argument that sorting is an aspect of language.

I know the argument is somewhat tautological...

To be convinced that sorting is not an attribute of language, I would need to
see an example that was completely monolingual but whose text was sorted in a
different way without speakers of the language objecting (and where the text
was sorted based on content and not year, string length, or other collations
that are not related to the strings being collated).

tex

John Cowan wrote:
> 
> Tex Texin scripsit:
> 
> > It seems to me the document is an Icelandic or English document which contains
> > some Old Norse text. Alternatively, we can tag the Norse text as Old Norse
> > separately from the sorted index tagged as Icelandic.
> 
> How could the anglophone-directed version be an English document?
> It doesn't contain a word of English!  Just the ON text and its sorted
> index (or maybe concordance is a better term).
> 
> > If an author writes for an audience, the content is presumably in the language
> > of the audience, even if there are elements which are in another language.
> 
> Not if *all* of it is in another language.  If I prepare an edition of Plato
> in Greek, then it's in Greek, even if I intend it for my anglophone students.
> (Hypothetical example.)
> 
> --
> John Cowan <jcowan at reutershealth.com>     http://www.reutershealth.com
> I amar prestar aen, han mathon ne nen,    http://www.ccil.org/~cowan
> han mathon ne chae, a han noston ne 'wilith.  --Galadriel, LOTR:FOTR

-- 
-------------------------------------------------------------
Tex Texin   cell: +1 781 789 1898   mailto:Tex at XenCraft.com
Xen Master                          http://www.i18nGuy.com
                         
XenCraft		            http://www.XenCraft.com
Making e-Business Work Around the World
-------------------------------------------------------------

