Machine Translation

"Martin J. Dürst" duerst at it.aoyama.ac.jp
Thu Sep 10 04:07:17 CEST 2009


Hello Peter,

On 2009/09/10 2:41, Peter Constable wrote:
> A generic tag ("machxlat"?) doesn't seem like a terrible idea.

Sorry, but I have difficulty understanding "doesn't seem ... 
terrible". Is that a positive or a negative judgement?

> But it's also not clear to me how it would be used: would it only be reported to users in some UI, or would other automated processes be used on tags containing this subtag?

As always, tags just give information. How this information is being 
used is up to the consumers.
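
A minimal sketch of one such consumer, for illustration only. It assumes 
the generic subtag were registered under the name "machxlat" (hypothetical; 
the name is only the one floated in this thread) and that documents arrive 
as (tag, text) pairs. The function names are made up for the example, and 
the check is a plain subtag match, not a full language tag parser.

```python
# Hypothetical subtag name taken from this thread; it is not a registered subtag.
MT_SUBTAG = "machxlat"


def is_machine_translated(tag: str) -> bool:
    """Return True if the language tag carries the hypothetical MT subtag."""
    # Language tags are case-insensitive and use "-" as the subtag separator.
    subtags = tag.lower().split("-")
    # Skip the primary language subtag; look only at the subtags that follow it.
    return MT_SUBTAG in subtags[1:]


def human_written(docs):
    """Keep only the (tag, text) pairs whose tag lacks the MT subtag."""
    return [(tag, text) for tag, text in docs if not is_machine_translated(tag)]


# Example: a corpus collector that drops anything tagged as machine-translated.
docs = [("cy", "text A"), ("cy-machxlat", "text B"), ("cy-GB", "text C")]
corpus = human_written(docs)
```

A search engine or a UI could apply the same test to a different end, 
e.g. ranking or labelling instead of excluding.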

> Is MT important to distinguish from native speakers with bad (non-conventional) spelling and grammar, or from 2nd-language speakers with bad spelling and grammar, or even from non-standard dialects (which may differ considerably from standard usage)?

It seems that the main reason for this proposal is to distinguish 
between ANY human-written text and machine-translated text. Spelling 
checkers are a lot easier to create than translation engines, and have 
much higher quality. There's definitely some content out on the Web with 
bad grammar, bad spelling, and so on, but it's not that bad, and to some 
extent, it can be just taken as part of human linguistic reality. 
Machine translation on the other hand has nothing to do with descriptive 
linguistics.

Regards,    Martin.

>
> Peter
>
> From: ietf-languages-bounces at alvestrand.no [mailto:ietf-languages-bounces at alvestrand.no] On Behalf Of Debbie Garside
> Sent: Wednesday, September 09, 2009 6:25 AM
> To: 'ietflang IETF Languages Discussion'
> Subject: Machine Translation
>
> Hi
>
> The following is part of a conversation I have been having with a couple of colleagues, and I was wondering if anyone had any ideas on whether a generic tag could be registered for machine-translated text. In the past we have steered away from generic tags (such as western).
>
> ****
> However, we are concerned that a lot of MT produced Welsh could appear
> on the web. Google's translation into Welsh isn't perfect by a long
> shot. Previous so-called attempts have been a lot worse but have been used :
>
>      http://www.flickr.com/photos/benbore/240597433/ (just one of many)
>
> and blogs MT'ed into Welsh outnumber those originally written in Welsh
> when searching in Google.
>
>      e.g. http://www.google.com/search?q=chyfieitha+dudalen&ie=utf-8&oe=utf-8
>
> This would not be great news. We hope with this development that some
> can be educated to use such a service responsibly :
>
>      http://murmur.bangor.ac.uk/?p=99
>
> However, at the very least, this could frustrate our (and others') work
> and efforts, e.g. collecting original Welsh texts from the web as a corpus.
>
> An idea we had, if this does not already exist for other languages
> (though languages supported by MT to date have been 'larger' and more
> robust), was whether ISO 639 could be used in the future to produce
> codes (or extensions) for tagging text/language as being from an MT
> system. Hopefully the provision of codes or meta data could facilitate
> MT providers to implement these so that such texts can be excluded in
> certain applications. (including search engines!)
>
> And further....
>
> ****
>
> However, in our case we would welcome a further distinction of MT for Welsh. Clearly there must be a distinction made between 'MT' from InterTrans and original/proper Welsh.
>
> But we might one day want to distinguish even between MT providers - cy-mt-intertrans, cy-mt-google and cy-mt-apertium. Some might be more reliable than the others.
>
> Your thoughts would be appreciated.
>
> Best regards
>
> Debbie
>
> Debbie Garside
>

-- 
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:duerst at it.aoyama.ac.jp

