Request: Language Code "de-DE-1996"
Tue, 23 Apr 2002 13:39:25 -0500
On 04/23/2002 10:30:43 AM Michael Everson wrote:
>de-1901 would be undifferentiated for country.
Then the question I have is just what kind of object is this referencing,
and what use is this ID? If you're saying that it's identifying a
particular orthography, and that there are no orthographic differences
between the different countries, then I have to ask what kind of object
de-DE-xx is intended to denote. It can't be making an orthographic
distinction if there are no orthographic differences between the various
countries. And if there *are* orthographic differences, then of what use is
de-1901? It would be a new kind of notion -- a collection of (related and
similar) orthographies. But how is such a notion really useful? You can't
use it to pick out a spelling checker. I suppose it could be used in
retrieving data if someone were looking for data in any of these related but
distinct orthographies, but that seems like too much of an edge case.
This highlights one of the reasons for the model I propose: we have had a
practice of suggesting tags without making clear what *kind* of object the
tag is intended to identify, and this has the potential to leave us with
tags for which it is unclear what they are distinguishing and when and how
they should be used.
Are there orthographic differences between the various countries or not?
If not, then country codes shouldn't be incorporated into tags that are
intended to distinguish orthographies but nothing more. If someone needs
tags to identify sets of data localised for particular countries (for
purposes other than orthography -- e.g. content, lexica), then that is
appealing to a notion that is more specific than orthography (such a set of
data will be in some single orthography), and the country code should be
added as a qualifier to orthography IDs: e.g. de-1901-DE.
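To make that ordering concrete, here is a minimal sketch, assuming a tag of the
form language[-orthography][-country] in which the orthography subtag is purely
numeric (e.g. 1901) and the country subtag is two letters. The names Tag,
parse_tag and orthography_id are hypothetical and used only for illustration;
they are not part of any existing standard or library.

```python
from typing import NamedTuple, Optional


class Tag(NamedTuple):
    language: str
    orthography: Optional[str]  # e.g. "1901"; None if unspecified
    country: Optional[str]      # e.g. "DE"; a qualifier, not part of the orthography ID


def parse_tag(tag: str) -> Tag:
    """Parse language[-orthography][-country], in that order only (an assumption)."""
    parts = tag.split("-")
    language = parts[0].lower()
    rest = parts[1:]
    orthography = None
    country = None
    # The orthography subtag, if present, must come directly after the language.
    if rest and rest[0].isdigit():
        orthography = rest.pop(0)
    # The country code, if present, qualifies whatever precedes it.
    if rest and len(rest[0]) == 2 and rest[0].isalpha():
        country = rest.pop(0).upper()
    if rest or not language.isalpha():
        raise ValueError(f"unexpected subtags in {tag!r}")
    return Tag(language, orthography, country)


def orthography_id(tag: Tag) -> str:
    """Return the part of the tag that identifies the orthography alone."""
    if tag.orthography is None:
        return tag.language
    return f"{tag.language}-{tag.orthography}"
```

With this ordering, de-1901-DE and de-1901 yield the same orthography ID
(de-1901), so something like a spelling checker could be selected from that ID
alone, while the country qualifier stays available for content or lexica. The
sketch deliberately rejects de-DE-1901, since under this assumption the country
may only follow the orthography.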
If there *are* orthographic differences between the various countries, then
it's fairly clear what kind of object and what specific instance of that
kind of object something like de-DE-1901 is intended to denote: German as
spelled in Germany following conventions defined in 1901 (but not as
spelled in Germany using other conventions, and not as spelled in some
other country). But it is *not* clear what kind of object de-1901 is, let
alone the identity of the specific instance. I question the usefulness of
such ambiguous tags.
Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485