sl-rozaj: language-dialect-subdialect strings
han.steenwijk at unipd.it
Wed Oct 15 12:56:43 CEST 2003
John Clews <Scripts2 at sesame.demon.co.uk> scripsit:
> 1. Here's what exists:
> Here's what slightly worries me:
> Han Steenwijk wrote in his email:
> On the level of the spoken language, there does not exist a
> linguistic entity that is solely and exhaustively
> identifiable as *sl-rozaj*.
> In that case, should it have been registered as such?
> See also my comments on 3. below.
I think the tag "sl-rozaj" still has a right to exist.
Firstly, some texts, especially older ones, cannot be assigned to the
sub-dialect they represent, either because of the poor transcription used or
because meta-textual information is lacking.
Secondly, although as a complete linguistic system no "linguistic entity that is
solely and exhaustively identifiable as *sl-rozaj*" exists, the sub-dialects
share many features that are aptly termed common Resian features, i.e.
sl-rozaj. Indeed, many scholarly works refer to Resian in general and/or quote
Resian material without indicating the specific sub-dialect.
That brings me to my third application, which I forgot to mention earlier: a TEI
corpus of secondary literature on Resian.
> I'd agree with both Han and Peter that strings other than 4-letter ones
> be used (as ISO 15924 strings are likely to be 4-character strings).
Are we on the safe side with alpha5 tags? I once read that the ISO 639 committee
is discussing the introduction of such tags. Would such a type of tag be used
only in the first position of a tag string?
> 3. Instead, why not have the following strings?
> Tag to be registered : sl-bisk...
> Tag to be registered : sl-njiv...
> Tag to be registered : sl-osoj...
> Tag to be registered : sl-solb...
> Tag to be registered : sl-lipa...
> Tag to be registered : sl-rava...
> Tag to be registered : sl-ucja...
> (Again, I haven't changed the 4-character subtag suggestions).
> Reason: As the entities above represent dialects of Resian (a dialect
> of Slovenian), they also represent dialects of Slovenian.
> RFC 3066 and its registrations haven't had strings for subdialects
> before, only for dialects (whether or not they can also be regarded
> as sub-dialects).
> I'd prefer to see no extension of a new "method"
> (language-dialect-subdialect) to RFC 3066 without further discussion.
The idea of subtags is that an application can default to the (sub)tag
immediately to the left of a subtag it does not understand. With the
tag string "sl-rozaj-something" you get two possibilities for fall-back. The
sub-dialects have enough in common for a fall-back on "-rozaj-" to make
sense. Falling back on "sl-" is far more problematic, because speakers of
standard Slovene and Resian have considerable difficulty understanding each
other. They really have to learn each other's languages.
Indeed, at one point I was tempted to file requests like "sl-rozaj-osoj-ucja" in
order to make the genetic relationship between the sub-dialects explicit. But
I guess that is not what RFC 3066 subtags are for.
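To illustrate the fall-back described above, here is a minimal sketch in
Python. It assumes that an application resolves an unknown tag simply by
dropping subtags from the right until it reaches something it supports. The
function names and the sets of supported tags are hypothetical and serve only
as illustration; RFC 3066 does not prescribe this code.

```python
def fallback_chain(tag):
    """Return the tag followed by its progressively shorter prefixes."""
    subtags = tag.lower().split("-")
    # Drop one subtag from the right at each step: a-b-c, a-b, a.
    return ["-".join(subtags[:i]) for i in range(len(subtags), 0, -1)]


def resolve(tag, supported):
    """Return the most specific supported prefix of `tag`, or None."""
    known = {s.lower() for s in supported}
    for candidate in fallback_chain(tag):
        if candidate in known:
            return candidate
    return None


# Hypothetical application that knows common Resian and standard Slovene.
print(resolve("sl-rozaj-osoj", {"sl-rozaj", "sl"}))
# Hypothetical application that knows only standard Slovene.
print(resolve("sl-rozaj-osoj", {"sl"}))
```

With "sl-rozaj" available, a sub-dialect tag lands on common Resian. Without
it, the same tag falls straight through to "sl", which is the less
satisfactory case discussed above.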
> 4. Does "sl-rozaj-1994" actually need to be registered? I maintain
> that it doesn't, at least at this stage. As the most standardized
> (sub-)dialect, it's likely to be the default usage on web-pages etc.
> for Resian.
> The default use of eng, deu, etc will use the standardised
> orthography as default, and the same should also apply to sl-rozaj.
The problem arises from the two-fold interpretation of primary tags: either the
standard language, or some sub-variety that one is unable or unwilling to
identify. The same mechanism would apply to "sl-rozaj" if it is subtagged any
further. If "sl-rozaj" is to be interpreted as standard Resian, which is all
right with me, I run into two problems:
1) How can one distinguish standard Resian texts from unidentifiable dialect
texts? By requesting a specific tag for the latter case, like
2) Apart from supra-dialectal standard Resian, normalised orthographies exist
for the four main sub-dialects. How to distinguish these from sub-dialectal
texts that are not written according to this normalisation? Originally, I
planned to file four requests like "sl-rozaj-something-1994" at the appropriate
> 5. I wonder, given the documentation on Resian, whether Resian should
> be listed as a language, and have a single code (also in ISO 639-2).
Sometimes I wonder, too. Maybe ISO 639-2 is a bit too much, as there is little
specialised literature in Resian, but the ISO 639-3 criteria can be satisfied.
Prof. Han Steenwijk
Università di Padova
Dipartimento di Lingue e Letterature Anglo-Germaniche e Slave
Sezione di Slavistica
Via Beldomandi, 1
e-mail: han.steenwijk at unipd.it
tel.: (39) 049 8278669
fax: (39) 049 8278679