On 04/24/2002 09:23:05 AM Martin Duerst wrote:

>I have to say that what I know about the linguistic
>situation (which I guess is more than either Michael
>or Peter) doesn't suggest any particular preference
>for either having the year or the country first, and
>because things such as de-DE,... are already firmly
>established, I would strongly prefer to stay with the
>original proposal of having the country in the position
>that is familiar to a lot of people. So I would indeed
>say NO to Peter's proposal.

My suggestion of year before country is based on the assumption that the
year is intended to distinguish orthographies whereas the country is
intended primarily to distinguish vocabulary. It also assumed the
model I proposed in my paper, in which vocabulary choices are assumed to
imply orthography choices. While that particular implication may not be
valid in general, I think it is still the case that orthographic
distinctions are more fundamental in that a particular orthography will
always determine a particular writing system, whereas vocabulary choices
may not. Thus, I think there may still be a case in this situation for
putting year before country.

The fact that tags such as de-DE are firmly established is irrelevant, I
think, since what is being proposed will also include de-1901, and since
you're claiming that the year and country are, in this case, independent.
Given that the new set will include de-1901, I don't see how the prior
existence of de-DE implies a preference for de-DE-1901 rather than
de-1901-DE.

On the other hand, if it is generally true that orthography distinctions
are more fundamental and of greater concern than are vocabulary choices
(I'm talking in the general case, across all languages, not just the case
of German; I'm also presenting this as a proposal to be evaluated) then
that can imply a preference for de-1901-DE over de-DE-1901.

Thinking in terms of my IUC21 paper, these issues have to do with the role
of sub-language linguistic variants, with a category type that I propose in
my paper and which I (tentatively) label "domain-specific data sets"), and
with the relationship between the two. How these relate to one another is
something I didn't draw conclusions on in my paper, and I don't think the
right conclusion is at all obvious.

