Mark Davis ☕️
mark at macchiato.com
Sun Dec 18 16:53:24 CET 2016
On Sun, Dec 18, 2016 at 4:31 PM, Don Osborn <dzo at bisharat.net> wrote:
> Would it be possible to use the lang tag "mul" at the head of a document,
> and then tag specific text strings with relevant language tags?
It is certainly possible to put "mul" at the head, and tag everything
specifically. (In practice, that is no different than putting in "und".)
Alternatively, you can put the most common language at the head, and tag
everything else specifically.
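A minimal sketch of what "tag at the head, then tag specific strings" means
in practice, assuming HTML where an element without its own lang attribute
inherits the language of its nearest tagged ancestor. The sample markup and
the names LangCollector and effective_languages are made up for
illustration; this is not anyone's actual tooling.

```python
from html.parser import HTMLParser

# Hypothetical sample: "mul" on the root, specific tags on individual strings.
SAMPLE = (
    '<html lang="mul"><body>'
    '<p lang="de">Hast du das neue Update schon downgeloadet?</p>'
    '<p lang="en">Did you download the new update yet?</p>'
    '<p>untagged text</p>'
    '</body></html>'
)

# Elements that never get an end tag, so they must not be pushed on the stack.
VOID = {"br", "hr", "img", "input", "link", "meta"}


class LangCollector(HTMLParser):
    """Collect (effective language, text) pairs, inheriting lang from ancestors."""

    def __init__(self):
        super().__init__()
        self.stack = ["und"]  # "und" when nothing at all has been declared
        self.runs = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        lang = dict(attrs).get("lang")
        # No lang of its own: inherit whatever the parent element has.
        self.stack.append(lang if lang else self.stack[-1])

    def handle_endtag(self, tag):
        if tag not in VOID and len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.runs.append((self.stack[-1], text))


def effective_languages(html):
    """Return the list of (language tag, text) runs found in the markup."""
    parser = LangCollector()
    parser.feed(html)
    parser.close()
    return parser.runs


if __name__ == "__main__":
    for lang, text in effective_languages(SAMPLE):
        print(lang, "|", text)
```

Anything left untagged simply falls back to whatever is at the head, so with
"mul" there the untagged paragraph tells a consumer nothing about its actual
language. Swapping the root to the most common language (say lang="de") and
tagging only the exceptions gives the second option.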
For code-switching like Spanglish or Hinglish, the switch often happens
within a single sentence, so tagging every instance of a different language
is so tiresome that it is essentially never done. As for expecting it to be
done outside of tightly controlled circumstances: well, get used to
disappointment.
And that doesn't account for mixed words, as you point out. I've definitely
seen and heard "*downgeloadet*" in German, which might have originated
in English, but clearly shouldn't be tagged as English.
(http://www.duden.de/rechtschreibung/downloaden, with another example:
*"hast du das neue Update schon downgeloadet?"*)
That's why I think following up on Michael Everson's suggestion is the
better approach: to have a mechanism for tagging a document (or chunk) of code-switched