gender voice variants
Yury Tarasievich
yury.tarasievich at gmail.com
Fri Dec 21 07:41:47 CET 2012
On 12/21/2012 02:02 AM, John Cowan wrote:
> Michael Everson scripsit:
>
>>> It has already been shown that [the sex of speakers and listeners]
>>> does affect [the choice of words, grammatical forms, etc.] in all
>>> languages, though the degree varies. I believe it's therefore
>>> appropriate to encode it within, rather than just alongside, the
>>> language variety system.
>>
>> Why, exactly?
>
> Because I believe it to be similar in character to the things we already
> encode as language variants that do not affect intelligibility that
> much, but are important to distinguish in some cases. Looking over
> the subtag registry, I find the distinctions represented there are:
> like the speaker's or writer's point of origin, the period of use,
> the writer's spelling conventions, and the use of unusual terminology.

To clarify things (for myself, too): I understand
now that Peter and Karen actually want this
subtag to serve as a hint to a "grammar engine",
so that a sentence meaning, e.g.,
"Welcome, A, I am machine B"
might be tagged like this:
"<language=lang0(e.g.,ru_1956acad)>Welcome,
<recipient=genus2(e.g.,masc)>A</>,
<originator=genus1(e.g.,fem)>I am</> machine
<originator=genus1(e.g.,fem)>B</>"
and then post-processed before being
presented to the recipient.

In Russian, that would mean using feminine
pronouns in the 3rd person singular, and changing
the dependent adjectives, and verbs in the
singular past tense, to agree. Of course, that
would require pre-processing: tagging the parts
of the sentence which would have to be changed
in this way (as shown above).
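
As a rough illustration of the tag-then-post-process
idea, here is a minimal Python sketch. The Slot class,
the render function, the role names and the sample
sentence are all hypothetical and are not taken from
any proposal on this list. The Russian forms are
singular past-tense verbs, which agree with the
gender of their subject.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    """A pre-tagged part of the sentence whose form depends on a role."""
    role: str    # "originator" or "recipient" (hypothetical role names)
    forms: dict  # grammatical gender -> surface form


def render(parts, genders):
    """Post-process a tagged sentence: pick one form per slot."""
    out = []
    for part in parts:
        if isinstance(part, Slot):
            out.append(part.forms[genders[part.role]])
        else:
            out.append(part)  # untagged text passes through unchanged
    return "".join(out)


# "I prepared the report that you asked for."
TEMPLATE = [
    "Я ",
    Slot("originator", {"masc": "подготовил", "fem": "подготовила"}),
    " отчёт, который ты ",
    Slot("recipient", {"masc": "просил", "fem": "просила"}),
    ".",
]

print(render(TEMPLATE, {"originator": "fem", "recipient": "masc"}))
```

The translator still has to find and tag every slot
by hand, and the sketch only works while the
translated sentence keeps a fixed structure.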

Obviously, such a selection of grammatical forms
makes sense only within a particular codification
of the grammar. The changes would differ slightly
if processed by the rules of pre-1918 Russian
grammar. The same goes for the formal style of
address (in modern Russian it essentially means
switching to the plural).

From the discussion I gathered that "welcome"
in the sentence above might also have to be
changed in order to accommodate an
<originator><recipient> combination (in
Italian). It, too, would have to be pre-tagged.

All this also presupposes that the translation
will mostly keep the original sentence
structure. That structure is lost quite
frequently even when translating a UI to, say,
Russian (or you get weird Russian). The same
might happen in _other_ phrases when translating
to _other_ languages.

This is all because grammar is an expression of
meaning (semantics), and you can't tag the
semantics, only its original expression in,
well, English.

So, at the end of the day you either end up with
4 (or 2, etc.) pre-modified copies of each
changeable phrase for each translation language,
or you get to write fairly complicated
language/grammar engines (talkers), at least one
for each family of languages.
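
The "4 (2 etc.) copies" arithmetic can be sketched
as follows. This is only an illustration under
stated assumptions: two genders, the hypothetical
phrase names and role names below, and every target
language inflecting for every role a phrase depends
on (so the total is an upper bound).

```python
from itertools import product

GENDERS = ("masc", "fem")


def variant_keys(roles):
    """All gender assignments for the roles a phrase depends on."""
    return [dict(zip(roles, combo))
            for combo in product(GENDERS, repeat=len(roles))]


def table_size(phrase_roles, languages):
    """Number of pre-modified strings to store across all languages."""
    per_language = sum(len(GENDERS) ** len(roles)
                       for roles in phrase_roles.values())
    return per_language * len(languages)


# Hypothetical phrases and the roles each one agrees with.
PHRASE_ROLES = {
    "welcome": ("originator", "recipient"),  # 2 * 2 = 4 copies
    "report_ready": ("originator",),         # 2 copies
    "ok": (),                                # 1 copy, no agreement
}

print(len(variant_keys(PHRASE_ROLES["welcome"])))
print(table_size(PHRASE_ROLES, ["ru", "it"]))
```

Adding a third value per role (say, a formal or
plural form of address) raises the per-phrase count
from 4 to 9, which is the cost that a grammar engine
would be written to avoid.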

What I want to say is that things will not work
out as smoothly as Peter and Karen expect (?)
them to. But the subtag itself, intended as a
grammar hint, won't do any harm.
-Yury