gender voice variants

Peter Constable petercon at
Sun Dec 23 21:03:12 CET 2012

Valid point.


From: mark.edward.davis at On Behalf Of Mark Davis
Sent: Saturday, December 22, 2012 6:11 PM
To: Peter Constable
Cc: ietflang IETF Languages Discussion; Michael Everson; Alain LaBonté
Subject: Re: gender voice variants

You may have missed a message I wrote on these threads. If apps were all monolithic, then I'd agree with you. For us they are not. An app can be assembled from separate components, and those components are shared across a variety of apps. The main app needs to communicate the desired level of formality to each of the modules it uses. Otherwise you end up with a ransom-note effect.

Here's what I wrote (Dec 19):

"We actually have had a use case for formal vs informal.

Some (online) products are meant to be more 'social', and thus informal (du), while some products are meant to be more 'business', thus formal (Sie). The problem comes in when you have shared online components. You really don't want to mix du and Sie on the same page, addressing the same user.

When the component is used within a social product, the product should select the locale "de-informal"* for the component, while when used in a business product, the component should use the locale "de-formal"*."
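
As a rough illustration of the scenario above, here is a minimal Python sketch of a host product passing one locale to every shared component, so that all of them resolve to the same register. Everything in it is an assumption made for illustration: the string tables, the component functions, and the fallback rule (truncating the tag from the right) are hypothetical, and "de-informal" and "de-formal" are simply the example tags from the message, not registered subtags.

```python
# Hypothetical string tables for shared components, keyed by locale tag.
# "de-informal" and "de-formal" are the illustrative tags from the message.
STRINGS = {
    "de": {"greeting": "Willkommen"},
    "de-informal": {"prompt": "Gib dein Passwort ein"},
    "de-formal": {"prompt": "Geben Sie Ihr Passwort ein"},
}


def lookup(locale, key):
    """Return the string for key, truncating the tag from the right on a miss."""
    tag = locale
    while tag:
        table = STRINGS.get(tag, {})
        if key in table:
            return table[key]
        tag = tag.rpartition("-")[0]  # "de-formal" -> "de" -> ""
    raise KeyError(key)


def banner(locale):
    """A shared component whose text does not depend on register."""
    return lookup(locale, "greeting")


def login_box(locale):
    """A shared component whose text addresses the user directly."""
    return lookup(locale, "prompt")


def render_page(locale, components):
    """The host product passes the same locale to every component it embeds."""
    return [component(locale) for component in components]
```

A social product would call `render_page("de-informal", [banner, login_box])` and a business product `render_page("de-formal", [banner, login_box])`; because the register travels with the locale, no component has to guess it, which is what avoids mixing du and Sie on one page.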


— Il meglio è l’inimico del bene —

On Sat, Dec 22, 2012 at 8:54 AM, Peter Constable <petercon at> wrote:
From: Michael Everson
Sent: December 21, 2012 3:05 AM
To: ietflang IETF Languages Discussion
> Musing.
> A language tag applied to a run of text tells any person or
> process "This text is in the English language" and a subtag might
> make precise for instance that "This English text is in Oxford spelling".

> A voice tag applied to a run of text tells a computer "Read this
> text aloud in a woman's voice". A voice tag does not change the
> content of any text being read out: The voice will read text from
> the New York Times, or a Help dialogue box equally. A voice tag
> selects a voice only.

In this scenario, the voice indication would not, I think, be part of the language tag on the text since it is not an attribute of the text. Rather it’s an independent request for certain processing.

However, the text-to-speech resource could potentially be tagged with a language tag that includes a female voice variant subtag: it is an attribute of that resource. Conceptually, one might argue that the voice is not a linguistic characteristic, since a woman’s voice could be used to narrate text that uses male-speaker linguistic constructs, but I don't know how likely a scenario that would be.

> An audience tag will tell a process "Choose a set of localized
> strings which address me as a male or as a female".
> Some other tag whose name I can't think of will tell a process
> "Choose a set of localized strings which make it look as though
> you are talking to me as if you were a man or a woman".

The sex of the source and audience can, as we've seen, be independent factors. The sex of the TTS voice is conceptually distinct from the sex of the source, but it's not clear to me how much of a real need there would be to distinguish those.

> A manners tag will tell a process "Use a set of localized strings
> which use a formal or informal register".

Potentially, though it’s not clear to me that an app would support such tailoring. It seems more likely that an app developer would decide that they want to adopt a particular level of formality for their app. However, a formality tag could still be useful in maintaining string resources, e.g., in a translation memory.

