Spanglish

John Cowan cowan at ccil.org
Sun Dec 18 19:47:24 CET 2016


On Sun, Dec 18, 2016 at 8:07 AM, Luc Pardon <lucp at skopos.be> wrote:

> BCP47 violates that rule big time, by packing all kind of things
> (script, orthography, ...) in a field that was originally intended (in
> HTML) to contain only the language.
>

In fact, RFC 1766, the original incarnation of what is now BCP 47,
predates HTML and HTTP.  It was designed to have something
standard to put in the email header "Content-Language", which was
designed to specify what language the email (as a whole) was
written in, and if it was being sent simultaneously in multiple
languages as a mixed/alternative email, to distinguish which
translation was which.  And even at that time it was understood
that language alone was too coarse-grained a category.
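
To make concrete how much a tag can carry beyond the bare language, here
is a minimal Python sketch that splits a tag into language, script, region,
and variant subtags.  The function name split_tag and the simplified pattern
are illustrative assumptions, not anything defined by BCP 47 itself; the
pattern deliberately ignores extended language subtags, extensions,
private-use subtags, and grandfathered tags.

```python
import re

# Simplified shape of a tag: language[-script][-region][-variant...].
# Assumption: extlang, extensions, private use and grandfathered tags
# are out of scope for this sketch.
_TAG = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?"
    r"(?P<variants>(?:-(?:[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}))*)$"
)


def split_tag(tag):
    """Split a simple language tag into its subtags (hypothetical helper)."""
    m = _TAG.match(tag)
    if m is None:
        raise ValueError("not a simple language tag: %r" % tag)
    script = m.group("script")
    region = m.group("region")
    # The variants group looks like "-1996-foobar"; drop the empty pieces.
    variants = [v.lower() for v in m.group("variants").split("-") if v]
    return {
        "language": m.group("language").lower(),
        "script": script.title() if script else None,
        "region": region.upper() if region else None,
        "variants": variants,
    }
```

With this sketch, split_tag("sr-Latn-RS") separates the language "sr" from
the script "Latn" and the region "RS", and split_tag("de-1996") reports
"1996" as a variant rather than as part of the language.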

> Applied to the topic under discussion, "Alice in Spanglishland" would
> have to be tagged with "en" at the top of the document, and the Spanish
> words inside the text would have to be marked up separately (i.e. "<span
> lang="es">caramba</span>" in HTML syntax). Or the other way around, if
> the majority of the words are Spanish.
>

Yes, that works well for vocabulary mixing simpliciter, but not so much
for more intimate language blends.  Consider the following bit of dog Latin:

Patres conscripti took a boat, and went to Philippi;
Boatum est upsettum, magno cum grandine venti.
Omnes drownderunt qui swim away non potuerunt.

The lovely word _drownderunt_ has an English root, an English inflection
_ed_ that has merged with the root (in regional English people say _drownd_
for _drown_ and either _drowned_ or _drownded_ for _drowned_), and a Latin
inflection _erunt_ on top of that.  How are we to mark this up? Similarly,
see the passages in my blog post "French in all its purity" at
<http://recycledknowledge.blogspot.com/2005/06/french-in-all-its-purity.html>.
Is the verb _bruncher_ in the code-switching example another like
_drownderunt_, or is it pure French?
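
To see where word-level markup stops helping, here is a minimal Python
sketch of it.  Everything in it is hypothetical: the function tag_words,
the tiny hand-made lexicon, and the choice of a default language are
assumptions made purely for illustration.

```python
import re
from html import escape


def tag_words(text, lexicon, default_lang):
    """Wrap each word whose language differs from default_lang in a
    <span lang="..."> element.  lexicon maps lowercase words to language
    codes; words not listed are assumed to be in default_lang."""
    # re.split with a capturing group puts the words at odd indexes
    # and the text between them at even indexes.
    parts = re.split(r"([A-Za-z]+)", text)
    out = []
    for i, part in enumerate(parts):
        if i % 2 == 0:
            out.append(escape(part))  # punctuation and spaces
            continue
        lang = lexicon.get(part.lower(), default_lang)
        if lang == default_lang:
            out.append(part)
        else:
            out.append('<span lang="%s">%s</span>' % (escape(lang), part))
    return "".join(out)


line = "Omnes drownderunt qui swim away non potuerunt."
marked = tag_words(line, {"swim": "en", "away": "en"}, "la")
```

Here _swim_ and _away_ each get their own lang="en" span, but the sketch
can only give _drownderunt_ a single tag as a whole word; nothing in it
can say that the root is English and the ending Latin.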

Worse yet, what of

What is this that roareth thus?
Can it be a Motor Bus?
Yes, the smell and hideous hum
Indicat Motorem Bum!

where English _bus_, itself a clipping of Latin _omnibus_ 'for all', is
treated as a pseudo-Latin stem _b_ with a nominative ending, and then used in
the accusative as _bum_?  Neither spelling checkers nor text-to-speech
will recognize _b_ as English or _um_ as Latin.

(Pace Mark Davis, _downloaden_ is now clearly a German word, just like
_Standard, Tipp, Stopp, Rekord_.)

-- 
John Cowan          http://vrici.lojban.org/~cowan        cowan at ccil.org
        Is it not written, "That which is written, is written"?

