Peter Constable petercon at
Sat Dec 6 08:06:39 CET 2003

> From: ietf-languages-bounces at [mailto:ietf-languages-
> bounces at] On Behalf Of han.steenwijk at
> Sent: Monday, November 24, 2003 4:34 AM

Several comments; quoting out of order

> As things stand right now (without informative registration), a period
> indicated implicitly by the existence of a second tag that delimits
> period inclusively started by a first tag,

I see a problem in this: a system that's interpreting the tags may not
necessarily know about all the tags that are registered. 
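
A rough sketch of that problem, in Python. Everything here is an assumption
made for illustration: the helper names are hypothetical, the tags de-1901
and de-1996 merely stand in for "a first tag" and "a second tag", and I am
assuming the period ends the year before the next known tag begins.

```python
def split_year_tag(tag):
    """Split 'de-1901' into ('de', 1901). Assumes the last subtag is a year."""
    prefix, year = tag.rsplit("-", 1)
    return prefix, int(year)


def implied_period(tag, known_tags):
    """Infer (start, end) for a year tag from the tags this system knows.

    The end is the year before the next later tag with the same prefix,
    or None when no later tag is known (the period looks open-ended).
    """
    prefix, start = split_year_tag(tag)
    later_years = sorted(
        year
        for other_prefix, year in map(split_year_tag, known_tags)
        if other_prefix == prefix and year > start
    )
    end = later_years[0] - 1 if later_years else None
    return start, end


# Two systems with different knowledge of what has been registered.
full_view = {"de-1901", "de-1996"}
partial_view = {"de-1901"}

print(implied_period("de-1901", full_view))     # (1901, 1995)
print(implied_period("de-1901", partial_view))  # (1901, None)
```

The same tag ends up denoting a bounded period on one system and an
open-ended one on the other, purely because of what each happens to know.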

> I completely agree with you that registration would be most helpful.

Even apart from years, one of the problem spots in IETF lang tags has
always been that the semantics of combinations that can be generated
without registration are not explicitly specified. How is one to know whether
the distinction between de-DE and de-CH implies differences of
vocabulary or differences of spelling (or both)? Without a registration,
what can we assume about the intended meaning of de-US?

I'm not suggesting that registration should be required any time someone
wants to combine elements into a composite tag. My point is that the
semantics of combining things are a real issue, even with the bits we've
been using all along; when we throw years into the mix, it's much worse.

(Script codes are not much of a problem since there is little question
as to what they indicate in relation to a language.)

I am disinclined to permit the use of dates without registration. Even
to identify orthography conventions, it may not be obvious or
agreed upon when a convention began. (Imagine you and I had sources that
disagreed on whether the older German spellings were established in
1904, 1903 or 1902 -- you'd use one date while I used another.) 

I'm also somewhat hesitant about using specific years to refer to a
language at a particular stage of development. For one thing, being
able to specify "French as of 1794" is not something for which there is
broad user need; such use by a philologist might be better suited to the
extensions mechanism that has been proposed. Also, languages don't
change so quickly as to make a distinction between (say) fr-1792 and
fr-1793 all that meaningful; and we end up with an issue of consistency
in tagging (some of the data uses 1792, some 1793...). Moreover, it is
often the case that languages at earlier stages of development are less
internally unified than at later stages because there is less history of
things that promote standardization.

The grc-700BCE tag is a very good example here, since it does not
unambiguously identify Greek as reflected in the Iliad; the fact is that
when you look at the Greek language at that point in history, there are
several fairly distinct varieties (Attic, Ionian, Dorian...). Because
there was not yet a common body of developed literature or a common
civil organization, a common language variety had not yet developed. The
"common" (koine) Greek period comes several hundred years later, after
Alexander. So it seems rather odd that we'd go out of our way to
pin down so narrowly a point in history when we don't also pin down very
narrowly the particular community whose variety we are referring to.

When talking about orthographic conventions, I think it can make good
sense to use a specific year to refer to those conventions in terms of
the year during or around which they originated. When talking about
language varieties, though, what makes more sense to me is to talk in
terms of time spans that reflect distinct stages of development in the
lifetime of a language. Something like this is already done in ISO 639,
which has separate identifiers for Modern English, Middle English, and
so on. It's because
the time span is broad that the IDs are useful to a broader range of
users, and interchange issues are fewer (you won't use fr-1793 while I
use fr-1794).
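
A small sketch of the interchange point, again in Python and again only
illustrative. The codes fro and frm are the ISO 639-2 identifiers for Old
and Middle French, but the year boundaries below are approximate, the
half-open intervals are my assumption, and stage_for_year is a hypothetical
helper, not part of any standard.

```python
# (start year, end year or None, identifier); boundaries are approximate.
FRENCH_STAGES = [
    (842, 1400, "fro"),   # Old French
    (1400, 1600, "frm"),  # Middle French
    (1600, None, "fr"),   # modern French
]


def stage_for_year(year):
    """Map a year to a broad-stage identifier (start inclusive, end exclusive)."""
    for start, end, code in FRENCH_STAGES:
        if year >= start and (end is None or year < end):
            return code
    raise ValueError(f"no stage defined for year {year}")


# Two people tagging the same material with slightly different years:
mine, yours = "fr-1793", "fr-1794"
print(mine == yours)                                 # False
print(stage_for_year(1793) == stage_for_year(1794))  # True
```

Year-specific tags fail a simple equality match over a one-year
disagreement, while the broad-stage identifiers agree.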

[speaking of time, it's time to quit for the night]

Peter Constable
