[Suppress-Script] Initial list of 300 languages

Ciarán Ó Duibhín ciaran at oduibhin.freeserve.co.uk
Tue Mar 14 03:45:50 CET 2006


Thanks to Caoimhín's researches, and the comments of others, I'm beginning
to see what the script tag is meant to do.  It seems Latg is really a font
tag, or rather, it is meant to tag a bunch of fonts with "insular" shapes.
It would cross-classify with charset.

But there are a couple of things to raise before I can be sure of that.  One
is the phrase "writing tradition" used by John.  From other things he has
said, I imagine he meant just "insular fonts" but the term suggests the two
main writing traditions in Irish, which differ in shapes (Gaelic vs Latin),
in charset (dotted consonants vs none), as well as in spelling, grammar and
vocabulary (traditional vs so-called "standard", in reality dumbed down for
classroom use by second-language learners).  If the Latn vs Latg distinction
is used to distinguish these writing systems, it certainly involves
charsets, contrary to what John has said elsewhere.  So I don't think he
meant to use the tag for that, but it wasn't clear.  Anyway, as he points
out, all kinds of hybrids are possible, so a simple tag Latg could not
usefully describe the situation.

Another thing is the library example.  There are indeed many books in Irish
which exist in "old" and "new" editions, corresponding to these two main
writing traditions.  It's a difference that might be worth tagging, but as
the difference in shapes is only one of the differences between them, and
arguably the least important one, using a "script" tag to distinguish them
is not appropriate.

So is Latg useful as a tag for a bunch of fonts with similar shapes?  I
don't see much need for it, but I suppose it is harmless enough, as long as
people are given clear and accessible directions for using it, without having
to digest ISO 15924!  In the absence of such directions, as we have seen,
people are liable to conclude that it refers to something more substantial,
and may use it thinking they are tagging more than they are.

With this interpretation of Latg, my feeling on the question of the default
script for ga, sga, and mga is that all should default to Latn if the
charset is ISO 8859 but not 8859-14.  For Unicode, it should be Latn if the
text contains no dotted-consonant characters.  For ISO 8859-14, and for
Unicode using dotted-consonant characters, most people would probably prefer
Latg (though I wouldn't, at least for ga).  If such conditionality is not
possible, a universal default of Latn may be acceptable.
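
To make the conditional default above concrete, here is a minimal sketch in
Python.  It is only an illustration: the function names, the handling of
charset names, and the exact list of dotted consonants are my assumptions,
not anything defined for the registry.  Unrecognised charsets fall back to
the universal default of Latn.

```python
import unicodedata

# Assumed list of precomposed dotted (lenited) consonants, upper and lower case.
DOTTED_CONSONANTS = set("ḂḃĊċḊḋḞḟĠġṀṁṖṗṠṡṪṫ")


def has_dotted_consonants(text: str) -> bool:
    """Return True if the text contains any dotted consonant.

    NFC normalisation folds "consonant + combining dot above" (U+0307)
    into the precomposed character, so both spellings are detected.
    """
    composed = unicodedata.normalize("NFC", text)
    return any(ch in DOTTED_CONSONANTS for ch in composed)


def default_script(charset: str, text: str = "") -> str:
    """Suggest a default script subtag for ga, sga or mga text."""
    name = charset.strip().lower().replace("_", "-")
    if name == "iso-8859-14":
        # The one ISO 8859 part that carries the dotted consonants.
        return "Latg"
    if name.startswith("iso-8859"):
        # Any other ISO 8859 part cannot encode dotted consonants.
        return "Latn"
    if name in ("utf-8", "utf-16", "unicode"):
        return "Latg" if has_dotted_consonants(text) else "Latn"
    # Assumed fallback: the universal default.
    return "Latn"
```

For example, default_script("ISO-8859-1") gives Latn, while
default_script("UTF-8", "an ḃean") gives Latg.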

Ciarán Ó Duibhín.




