Caoimhin O Donnaile
caoimhin at smo.uhi.ac.uk
Thu Sep 10 17:46:52 CEST 2009
I told Kevin Scannell about the debate on creating a tag for
machine-translated text and here are the comments he sent back to me
(translated by a human from Irish Gaelic), in case they are of interest:
"Welsh was the first language (other than Irish) for which I
did a big crawl - working in conjunction with Andrew Hawke,
director of Geiriadur Prifysgol Cymru [the Welsh national
dictionary] in 2005. Even at that time, we saw lots of spam
which looked like it came from translation engines.
For one example see: http://andax.sourceforge.net/index.cy.html
The translations were so bad that I was able to write a small script to
filter the spam out of the corpus. Perhaps it would
be more difficult nowadays!
It is an interesting idea, creating a special 639 code for
text which has been machine translated, but I must admit that
I don't put my trust in any scheme which depends on people adding
metadata to documents. [...]"
I suppose the difference in this case is that the tag would be
added, not by humans but automatically by the machine translation