[Ltru] Re: Request for variant subtag fr 16th-c 17th-c Resubmitted!

Mon Dec 25 08:49:13 CET 2006

Hello Mark, others,

I'd very much like to see Peter clarify some of the points below.
But I'd also like to say that to some extent, we have been and
will be living with the fact that language tagging at some level
is a business that is a bit fuzzier than, e.g., integer arithmetic.
More below.

At 11:35 06/12/19, Mark Davis wrote:
>Peter, I'm a little bit fuzzy on where the lines were drawn with historic versions of the same language in ISO. This is relevant to the LTRU group for ISO 639-3, so am cc'ing that group.
>
>I take it from your discussion that "fr" means *only* modern French, and that if I want to have a tag for any French, modern or not, I would have to use (fr OR frm OR fro). Similarly, if I wanted any English, I would have to use (en OR enm OR ang). 
>
>My question is how this is managed over time in ISO, since there are significant implications for language tags. Let's take Czech, for example, where we only currently have 'cs'. I see the following possibilities. 
>
>1. This means only Modern Czech.
>That implies that there is no code for Old Czech, so if I want to tag something with that, I need to petition ISO for a language tag for Old Czech (let's say 'ceu'). Once that is added, I can refer to Old Czech.
>
>2. This currently means any Czech, but ISO may introduce a code for Old Czech (let's say 'ceu'). I see three possible approaches:
>
>2a. The denotation of 'cs' is changed to mean only modern Czech. This would be a breaking change for the language subtag registry, since the meaning of a subtag would be narrowed, invalidating any tags that had a broad application. This would be rather disturbing, since we are guaranteeing stability. 
>
>2b. The denotation of 'cs' remains "any" Czech, and to get only modern Czech I would need to use (cs AND NOT ceu). Note that while OR can be handled with a list, as per RFC 4647, AND NOT cannot. This, however, would not break stability.
>
>2c. The denotation of 'cs' remains "any" Czech, ceu becomes an extlang.
>This I see as the least unpleasant outcome, but I can't tell whether this would be the ISO policy.
>
>While I use Czech as an example, the same would be true for any similar breakdown, including English into Early Modern and Modern. 

I think there is another possibility, which is:

3. The denotation of 'cs' is changed to mean *mainly* modern Czech.
   Old material that is tagged as 'cs' but may include Old Czech
   is still valid. New tagging should distinguish between 'cs' and
   'ceu', but there may be some time lag in adopting this distinction.
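
To make the OR case concrete, here is a minimal sketch of RFC 4647
basic filtering (Section 3.3.1) in Python. The function names are
illustrative only, and 'ceu' is the hypothetical Old Czech code used
in this thread, not a registered subtag. A language priority list
acts as an implicit OR over its ranges. There is no way to write
AND NOT as a range, so an application that wants "cs but not ceu"
would have to exclude tags itself after filtering.

```python
def basic_filter_match(language_range: str, tag: str) -> bool:
    """Basic filtering per RFC 4647 Section 3.3.1.

    A range matches a tag if it equals the tag, or if it is a prefix of
    the tag and the next character in the tag is "-". Comparison is
    case-insensitive. The wildcard range "*" matches every tag.
    """
    r = language_range.lower()
    t = tag.lower()
    if r == "*":
        return True
    return t == r or t.startswith(r + "-")


def filter_tags(priority_list, tags):
    """Keep the tags matched by at least one range in the list (an OR)."""
    return [t for t in tags
            if any(basic_filter_match(r, t) for r in priority_list)]


# 'ceu' is hypothetical (the placeholder used in this thread).
tags = ["cs", "cs-CZ", "ceu", "fr", "frm", "fro"]

any_czech = filter_tags(["cs", "ceu"], tags)          # cs OR ceu
any_french = filter_tags(["fr", "frm", "fro"], tags)  # fr OR frm OR fro
```

Under option 3, a request for ["cs", "ceu"] picks up both old material
tagged 'cs' and newer material tagged with the more specific code,
which is why the time lag does little harm for this kind of matching.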

Regards,    Martin.

#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst at it.aoyama.ac.jp