Generic variant subtags in RFC 3066bis

John Cowan jcowan at reutershealth.com
Tue Apr 19 15:06:15 CEST 2005


Michael Everson scripsit:

> >Ietf-languages folks: do you have any objections to this idea?
> 
> Yes. These categories are neither exhaustive nor precise.

Marion Gunn scripsit:

> > Ietf-languages folks: do you have any objections to this idea?
> 
> I would.

Caoimhin O Donnaile scripsit:

> Aren't such terms highly specific to each language, subject to the 
> conventions of scholarship within that language, and non-orthogonal?

Well, I see that all the Gaels are marching against me, in a display of
unanimity unprecedented since the battle of Clontarf or before (and me
the grandson of a Mayo man *snif*).  But, like the Connecticut Yankee,
I'll argue my case anyway.

> E.g. "Middle Irish" is 900-1200 roughly, whereas "Middle English"
> is 1100-1500.  "Classical Irish" is 1200-1650, whereas "Classical Latin" 
> is much earlier(!), and the term "Classical English" is not normally 
> used at all.

These facts are of course undeniable.  But a variant subtag is just a
label, after all!  Its meaning is entirely relative to the meaning of the
other three components (language, script, region) of an RFC 3066bis tag.
It has no meaning of its own as they do.
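
To make that concrete, here is a rough Python sketch.  The table, the
describe() helper, and the tags in it are all made up for illustration
(nobody has registered a "middle" subtag); the date ranges are just the
approximate ones quoted above.  The point is only that the lookup key is
the variant together with what precedes it, never the variant alone.

```python
# Hypothetical meanings table: a variant is looked up together with its prefix.
# The tags are illustrative only; the dates are the rough ones quoted above.
MEANINGS = {
    ("ga", "middle"): "Irish, roughly 900-1200",
    ("en", "middle"): "English, roughly 1100-1500",
}

def describe(tag):
    """Split off the last subtag and interpret it relative to the rest."""
    prefix, _, variant = tag.lower().rpartition("-")
    return MEANINGS.get((prefix, variant), "no meaning defined for this combination")

print(describe("ga-middle"))
print(describe("en-middle"))
print(describe("eo-middle"))
```

The same label "middle" gives two different answers, and a combination
nobody has defined gives none.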

Peter Constable points out that ISO 639-3 will handle the Old and Middle
stuff, as a look at his draft indeed indicates, so I'll focus on the
geographical subtags.

Consider American English, en-us.  This comes in three basic dialect
groups, normally called Northern, Midland, and Southern.  People who need
to label spoken content to that degree of granularity would naturally
wish to use tags like en-us-northern, en-us-midland, and en-us-southern
for the purpose, and could (when RFC 3066bis becomes effective) register
the subtags as such.

But then come the students of Karelian (ISO 639-3 code krl) who note that
their language is also divided into a Northern and a Southern dialect!
What to do?  Well, we could use the words for "northern" and "southern"
in Karelian (or Russian -- there's a lot of Karelian scholarship in
Russian), if they aren't too long.  But that would require additional
explanations, not to mention they would most likely have to be mutilated
to fit into the draconian a-z restrictions of language tags.

Then come all the other languages that have Northern and Southern
dialects.  I suggest, therefore, just short-circuiting this process by
registering -northern and -southern and a few more a priori, and declaring
them open for use with any language that has geographically-labeled
dialects, so that en-us-northern is legitimate and krl-northern is too.
Of course it will be possible to produce rubbish like eo-western, but
that's an unavoidable accompaniment of generativity.
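
A minimal sketch of the idea in Python, under loud assumptions: the set
GENERIC_VARIANTS and the function with_variant() are hypothetical names of
mine, the set holds only the labels mentioned in this message, and the
letters-only check is a simplification of the real tag grammar, which has
length rules I don't reproduce here.

```python
import re

# Assumption: generic geographic variants registered once, usable with any prefix.
GENERIC_VARIANTS = {"northern", "southern", "midland", "western"}

# Simplified stand-in for the tag syntax: lowercase letters only per subtag.
SUBTAG = re.compile(r"^[a-z]+$")

def with_variant(prefix, variant):
    """Append a generic variant subtag to an existing language tag."""
    variant = variant.lower()
    if variant not in GENERIC_VARIANTS:
        raise ValueError("not a generic variant: " + variant)
    parts = prefix.lower().split("-")
    if not all(SUBTAG.match(part) for part in parts):
        raise ValueError("malformed prefix: " + prefix)
    return "-".join(parts + [variant])

print(with_variant("en-us", "northern"))
print(with_variant("krl", "northern"))
print(with_variant("eo", "western"))
```

Nothing here checks whether the prefix actually has a dialect of that name,
which is exactly why eo-western comes out as well-formed as the other two.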

-- 
John Cowan  www.reutershealth.com  www.ccil.org/~cowan  jcowan at reutershealth.com
Arise, you prisoners of Windows / Arise, you slaves of Redmond, Wash,
The day and hour soon are coming / When all the IT folks say "Gosh!"
It isn't from a clever lawsuit / That Windowsland will finally fall,
But thousands writing open source code / Like mice who nibble through a wall.
        --The Linux-nationale by Greg Baker

