I approve the registration of de-DE-1901 (German, German variant, traditional orthography)

Fri, 26 Apr 2002 13:56:24 -0500

On 04/26/2002 12:56:30 PM Michael Everson wrote:

>As I saw it, Peter, it was clear that the temporal element was
>secondary to the identification of the language itself.

>It also seemed that HTTP content negotiation, and the
>well-established de-AT convention outweighed the other argument. Do
>people want to tag the dialect first or the orthography?

But this gets to part of the point I'm trying to deal with. You're
comparing "language" or "dialect" ID with orthography ID. If the comparison
were clearly that in general, then I can see that orthography is derivative.
But if we look at localisation of content in the most general case, I'm not
sure that it is always true that we are identifying dialects.

At an ISO/TC 37/SC 2/WG 1 meeting last August, someone representing a
different ISO committee requested tags for "ISO English" and "ISO French".
These are not dialects. Rather, they are labels that reflect some
constraints on expression considered appropriate for use in a particular
domain. That is also what happens in general in localisation scenarios:
constraints are imposed on expression (typically in terms of vocabulary)
that are considered appropriate for use in a particular domain. If you're
talking about localising German content for Switzerland or Austria, it may
well be attributable to dialect distinctions. But when we generalise, that
is not necessarily the case.

Furthermore, we also find (as I explained in earlier messages) that when we
impose such constraints on expression in terms of vocabulary or whatever,
then we nearly always, if not always, also make particular orthographic
choices. This is true whether the constraints were motivated by regional
dialects, or by something other than dialect such as organisational
preferences.

These two things together are what led me to conclude that we can
generalise these issues in terms of a single notion handled in a uniform
way. In my paper, I came up with the temporary name "domain-specific data
set" and, because of the inference about orthography, made it a notion
derived from orthography.

So, whereas you're assuming the comparison is between orthography and
language/dialect, I am not. The reason I am not is that I came to think
that we can find a generalisation that handles a variety of our needs
adequately in a single, uniform way. In the course of this thread, I have
seen reasons why what people want to say about data that they have or that
they are seeking may specify domain-usage-related constraints such as
vocabulary but not orthography. That necessitates some revision to what I
proposed, but it doesn't eliminate the possible benefits of the
generalisation I described above.

>I've approved them. If you protest, we can put them on hold easily
>enough. Do you want to? I'm only the reviewer. What's been approved
>is suitable, in my view, but we can hold off for further discussion
>if you wish. But not forever.

I agree it can't go on for long; I just thought it was closed 12-24 hours
too soon. Do I want it put on hold? I'll let others respond. I've made the
case I've wanted to make, and I got helpful input on some possible usage
needs I hadn't considered. I've seen a response at least from Johannes that
acknowledges the validity of points I'm trying to make, though he also
wants to stick with the tags as originally proposed (feeling that the
changes that are needed are going to require a new mechanism -- I was
attempting to pursue getting broader and longer-term needs met within the
limits of the existing identifier syntax first).

- Peter

---------------------------------------------------------------------------
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: <peter_constable@sil.org>