sl-rozaj: language-dialect-subdialect strings

Wed Oct 15 12:56:43 CEST 2003

John Clews <Scripts2 at sesame.demon.co.uk> scripsit:

> 1. Here's what exists:
> 
>         sl-rozaj
> 
> Here's what slightly worries me:
> 
> Han Steenwijk wrote in his email:
> 
>         On the level of the spoken language, there does not exist a
>         linguistic entity that is solely and exhaustively
>         identifiable as *sl-rozaj*.
> 
> In that case, should it have been registered as such?
> See also my comments on 3. below.

I think the tag "sl-rozaj" is still justified.
Firstly, some texts, especially older ones, cannot be assigned to the 
sub-dialect they represent, because of the poor transcription used or because 
meta-textual information is lacking.
Secondly, although no "linguistic entity that is solely and exhaustively 
identifiable as *sl-rozaj*" exists as a complete linguistic system, the 
sub-dialects share many features that are aptly termed common Resian features, 
i.e. sl-rozaj. Indeed, many scholarly works refer to Resian in general and/or 
quote Resian material without indicating the specific sub-dialect. 
That brings me to my third application, which I forgot to mention earlier: a TEI 
corpus of secondary literature on Resian.

> 2.
> I'd agree with both Han and Peter that strings other than 4-letters
> be used (as ISO 15924 strings are likely to be 4-character strings).

Are we on the safe side with alpha5 tags? I once read that the ISO 639 committee 
is discussing the introduction of such tags. Would such a type of tag be used 
only in the first position of a tag string?
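
To make the length concern concrete, here is a minimal sketch in Python. It 
assumes, as suggested above, that four-letter subtags would be read as ISO 15924 
script codes; the function name is hypothetical and the rule is an illustration, 
not something RFC 3066 itself prescribes.

```python
def could_be_script_code(subtag):
    """Return True if a subtag has the shape of an ISO 15924 script code.

    Assumption for illustration: a script code is exactly four ASCII letters.
    """
    return len(subtag) == 4 and subtag.isascii() and subtag.isalpha()


# Four-letter dialect subtags have the same shape as script codes,
# while the five-letter "rozaj" and the numeric "1994" do not.
for subtag in ["bisk", "Latn", "rozaj", "1994"]:
    print(subtag, could_be_script_code(subtag))
```

Under that assumption a parser that classifies subtags by shape alone could not 
tell "bisk" from a script code, which is why a different length looks safer.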

> 3. Instead, why not have the following strings?
> 
> Tag to be registered       : sl-bisk...
> Tag to be registered       : sl-njiv...
> Tag to be registered       : sl-osoj...
> Tag to be registered       : sl-solb...
> Tag to be registered       : sl-lipa...
> Tag to be registered       : sl-rava...
> Tag to be registered       : sl-ucja...
> 
> (Again, I haven't changed the 4-character subtag suggestions).
> 
> Reason: As the entities above represent dialects of Resian (a dialect
> of Slovenian), they also represent dialects of Slovenian.
> 
> RFC 3066 and its registrations haven't had strings for subdialects
> before, only for dialects (whether or not they can also be regarded
> as sub-dialects).
> 
> I'd prefer to see no extension of a new "method"
> (language-dialect-subdialect) to RFC 3066 without further discussion.

The idea of subtags is that an application can fall back to the (sub)tag 
immediately to the left of a subtag it does not understand. With the tag string 
"sl-rozaj-something" you get two possibilities for fall-back. The sub-dialects 
have enough in common for a fall-back on "-rozaj-" to make sense. Falling back 
on "sl-" is far more problematic, because speakers of standard Slovene and 
Resian have considerable difficulty understanding each other. They really have 
to learn each other's languages.
Indeed, at one point I was tempted to file requests like "sl-rozaj-osoj-ucja" in 
order to make the genetic relationship between the sub-dialects explicit. But 
I guess that is not what RFC 3066 subtags are for.
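
The fall-back idea can be sketched in a few lines of Python. This is only an 
illustration of dropping subtags from the right, not a matching algorithm taken 
from RFC 3066; the function names are hypothetical, and "sl-rozaj-osoj" stands in 
for "sl-rozaj-something" (it is not a registered tag).

```python
def fallback_chain(tag):
    """Return the tag followed by its progressively shorter prefixes."""
    subtags = tag.split("-")
    return ["-".join(subtags[:n]) for n in range(len(subtags), 0, -1)]


def best_match(tag, supported):
    """Return the longest prefix of `tag` that the application supports."""
    known = {s.lower() for s in supported}
    for candidate in fallback_chain(tag.lower()):
        if candidate in known:
            return candidate
    return None  # nothing in the chain is understood


# An application that knows Resian falls back to "sl-rozaj";
# one that only knows Slovene falls back all the way to "sl".
print(fallback_chain("sl-rozaj-osoj"))
print(best_match("sl-rozaj-osoj", ["sl", "sl-rozaj"]))
print(best_match("sl-rozaj-osoj", ["sl"]))
```

With a two-level tag such as "sl-osoj" the only fall-back would be "sl", which 
is the less useful of the two.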

> 4. Does "sl-rozaj-1994" actually need to be registered? I maintain
> that it doesn't, at least at this stage. As the most standardized
> (sub-)dialect, it's likely to be the default usage on web-pages etc.
> for Resian.
> 
> The default use of eng, deu, etc will use the standardised
> orthography as default, and the same should also apply to sl-rozaj.

The problem arises from the two-fold interpretation of primary tags: either the 
standard, or some sub-variety that one is unable or unwilling to identify. The 
same mechanism applies to "sl-rozaj" if it is subtagged any further.
If "sl-rozaj" is to be interpreted as standard Resian, which is all right with 
me, I run into two problems:
1) How can one distinguish standard Resian texts from unidentifiable dialect 
texts? By requesting a specific tag for the latter case, like 
"sl-rozaj-dialect"?
2) Apart from supra-dialectal standard Resian, normalised orthographies exist 
for the four main sub-dialects. How does one distinguish these from 
sub-dialectal texts that are not written according to this normalisation? 
Originally, I planned to file four requests like "sl-rozaj-something-1994" at 
the appropriate moment.

> 5. I wonder, given the documentation on Resian, whether Resian should
> be listed as a language, and have a single code (also in ISO 639-2).

Sometimes I wonder, too. Maybe ISO 639-2 is a bit too much, as there is little 
specialised literature in Resian, but the ISO 639-3 criteria can be satisfied. 

=================
Prof. Han Steenwijk
Universita di Padova
Dipartimento di Lingue e Letterature Anglo-Germaniche e Slave
Sezione di Slavistica
Via Beldomandi, 1
I-35139 Padova

e-mail: han.steenwijk at unipd.it
tel.: (39) 049 8278669
fax:  (39) 049 8278679
