Don Osborn dzo at
Tue Dec 20 13:45:13 CET 2016

To complicate matters further, something like Alice in Spanglishland, as a partial translation of the original English (?), would indeed seem to be a transformation. However, the phenomenon of language blends and code-switching more generally would not be a transformation.


Elsewhere in the thread it was suggested that the number of such blends was limited, but then a longish list of them was noted. In fact, the realm of possible blends is huge. I recall hearing a blend of Fulfulde and Bambara among young NGO employees in Douentza, Mali some years back – while no one has to my knowledge written in such a blend, one could. And to the extent that such performance, if you will, is transcribed (perhaps from a recording of a focus group,* or as a translation of a short radio script into a blend), you would have text mixing ff and bm (Fulfulde and Bambara). As in Douentza, so potentially in many other contact locations around the world.


Elsewhere in the thread (sorry for not having refs), it was noted that for detailed markup, automation might make the task easier. With advances in language technology, what is the possibility that – at the risk of heresy – tagging of blended-language text could be handled entirely by machine? The program could perhaps be cued by a tag in the header alerting it to the languages to anticipate, since some isolated words or short phrases might belong to more than one of the languages that the program could potentially identify.
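
As a rough illustration of the header-cue idea, here is a minimal Python sketch that labels each token of a mixed text with one of the languages announced in advance. Everything in it is an assumption made for illustration: the `LEXICONS` word lists are toy data, `tag_tokens` is a hypothetical helper, and the rule for words shared between languages (inherit the language of the preceding token) is only one simple heuristic, not an established method. Tokens found in none of the expected languages are labeled "und", the subtag for undetermined language.

```python
# Toy word lists, invented for illustration only; a real system would rely on
# full lexicons or trained language-identification models.
LEXICONS = {
    "en": {"the", "cat", "is", "in", "no", "house"},
    "es": {"el", "gato", "está", "en", "la", "casa", "no"},
}


def tag_tokens(text, expected):
    """Label each whitespace-separated token with one of the expected languages.

    `expected` plays the role of the header cue: the list of languages the
    program should anticipate. A token found in exactly one lexicon gets that
    language. A token found in several inherits the previous token's language
    when that is a candidate, otherwise the first candidate in `expected`
    order. A token found in none is labeled "und".
    """
    tagged = []
    previous = None
    for token in text.lower().split():
        candidates = [lang for lang in expected
                      if token in LEXICONS.get(lang, set())]
        if not candidates:
            lang = "und"
        elif len(candidates) == 1:
            lang = candidates[0]
        elif previous in candidates:
            lang = previous
        else:
            lang = candidates[0]
        tagged.append((token, lang))
        if lang != "und":
            previous = lang
    return tagged


if __name__ == "__main__":
    for token, lang in tag_tokens("el gato is in la casa", ["en", "es"]):
        print(lang, token)
```

The word "no" shows why the cue and the context both matter: after "the cat" the sketch labels it en, after "el gato" it labels it es.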




* I’ve been strongly advocating such literal transcription – otherwise standard practice in qualitative research – for recordings of focus groups in Africa. The unfortunately common resort to “transterpretation” (someone listening to the recording in one language or combinations of languages, and writing down their interpretation in a single/other language like French or English) inevitably means loss of data as well as filtering through the transterpreter’s understanding. From the point of view of tagging, I would be interested in how to prepare mixed language transcriptions for text analysis.



From: Ietf-languages [mailto:ietf-languages-bounces at] On Behalf Of John Cowan
Sent: Tuesday, December 20, 2016 6:35 AM
To: Mark Davis ☕️ <mark at>
Cc: ietflang IETF Languages Discussion <ietf-languages at>
Subject: Re: Spanglish



On Tue, Dec 20, 2016 at 2:36 AM, Mark Davis ☕️ <mark at> wrote:


You are being misled by the term "Transformed Content" in the title of the RFC. As the abstract makes clear, the scope has always been broader than simply a transform.


I don't see that the abstract says that at all.  It specifies translation, transcription, and transliterations as types of transformations, and implies that there may be other types.  There is no hint of a semantic extension to things beyond transformations.


The idea of -t- is that what comes before the -t- is the language of the content as we have it, and what follows is the language of some other content, called the source, from which this present content is somehow derived.  Indeed, your ticket emphasizes the phrase "influenced by the source."   But there is no source here, no all-English or all-Spanish original from which the poems I quoted were made.   Using -t- to represent an interlanguage text is tag abuse.


It's true that a machine translator like Google's that leaves words alone that it does not understand (except for transliterating them) might produce a text that looked vaguely like these poems, and such partially translated output might be justly tagged en-t-es-m0-google or es-t-en-etc.  But that case is not this case.
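
To make the structure described above concrete (content language before the -t-, source language and any fields after it), here is a minimal Python sketch that splits a tag such as en-t-es-m0-google into its parts. The function name `split_t_extension` is hypothetical, and the sketch assumes the RFC 6497 layout in which field keys are a letter followed by a digit (such as m0). It does not validate subtags against any registry and does not handle private-use sequences.

```python
import re

# Field keys in the -t- extension are a letter followed by a digit, e.g. "m0".
FIELD_KEY = re.compile(r"^[a-z][0-9]$")


def split_t_extension(tag):
    """Split a language tag at its -t- extension (illustrative sketch only).

    Returns (content_language, source_language, fields). When the tag has no
    -t- extension, source_language is None and fields is empty.
    """
    subtags = tag.lower().split("-")
    if "t" not in subtags[1:]:
        return "-".join(subtags), None, {}
    t_index = subtags.index("t", 1)
    content = "-".join(subtags[:t_index])
    source = []
    fields = {}
    key = None
    for subtag in subtags[t_index + 1:]:
        if len(subtag) == 1:
            # Another singleton starts a different extension; stop here.
            break
        if FIELD_KEY.match(subtag):
            key = subtag
            fields[key] = []
        elif key is None:
            # Before any field key, subtags belong to the source language.
            source.append(subtag)
        else:
            fields[key].append(subtag)
    joined_fields = {k: "-".join(v) for k, v in fields.items()}
    return content, "-".join(source) or None, joined_fields


if __name__ == "__main__":
    print(split_t_extension("en-t-es-m0-google"))
    # ('en', 'es', {'m0': 'google'})
```

Read this way, the tag asserts that English content was derived from a Spanish source by the named mechanism, which is exactly the claim that has no counterpart for a text composed in a blend from the start.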


I have added a comment to the ticket expressing this view more tersely.



John Cowan        cowan at

Winter:  MIT, / Keio, INRIA, / Issue lots of Drafts.

So much more to understand! / Might simplicity return?

                (A "tanka", or extended haiku)

