COMPARING ONLINE TRANSLATORS.

Ethan Shen has done a research project comparing the three major free translation engines available online; here are his results to date:

This paper evaluates the relative quality of three popular online translation tools: Google Translate, Bing (Microsoft) Translator, and Yahoo Babelfish. The results published below are based on a 6-week survey open to the general internet population which allowed survey takers to choose any language, enter any free-form text, and vote on the best of all translation results side-by-side (www.gabble-on.com/compare-translators). The final data reveals that while Google Translate is widely preferred when translating long passages, Microsoft Bing Translator and Yahoo Babelfish often produce better translations for phrases below 140 characters. Also, in general Babelfish performs well in East Asian languages such as Chinese and Korean and Bing Translator performs well in Spanish, German, and Italian.

Below Figure 1, showing the comparisons in detail, come some interesting results like this:

The extent of Google’s lead varies dramatically from language to language. In some languages such as French, the strength of Google Translate’s engine is overwhelming. However, in several others like German, Italian, and Portuguese, Google holds only a very slim lead when compared to its biggest competitors….
One possible explanation is that large additional bodies of parallel English-French text are available from the government of Canada, whose official documents are translated into both languages.

Interesting stuff, and I’ll have to give Bing a try.

Comments

  1. The survey is buggy. Often two or more translations are identical, but you’re required to choose one. And when you don’t select one for “second best”, an error message says “Please select one worst translation engine”.

  2. Bathrobe says

    There are plenty of factors to consider here. For instance, those who use Babelfish or Bing may be “less serious” translators — those who don’t need heavy-duty translation software and just want to translate short passages for chatting, etc. This is likely to be accentuated by the fact that Babelfish, for instance, only allows you to paste 150 characters — otherwise you have to input a URL.
    As for the apparent superiority of Google for English to Chinese, I can only assume that this is because the others are so atrocious. As I’ve pointed out previously, Google isn’t anything to write home about as English-Chinese translation software.

  3. Plus the survey’s totally subjective apparently, so saying definitively that “Bing Translator and Yahoo Babelfish often produce better translations for phrases below 140 characters” is a bit of a leap. But it’s worth checking out the Bling.

  4. I’m not impressed by their methods section. If the identity of the tools wasn’t blinded, the study is largely useless. Though to be fair, if Microsoft can manage to gain a lead under unblinded conditions, they really must be doing a good job.

  5. It’s common knowledge by now that Google Translate is seriously impressive. Although, you’re warmly invited to check this out:
    http://smuggledwords.wordpress.com/2010/04/15/machine-translation-the-baton-of-italian-fan-and-the-ready-era/
    Now, I certainly didn’t approach the comparison as seriously; it’s not a survey. I just tested these three translators (and a few more) with an ambiguous sentence. Frankly, though, I have to say that Google Translate outperforms Bing Translator, and BabelFish just seems awful to me.
    More tests coming, anyway…

  6. Bathrobe says

    I don’t know how the Google Translate algorithm works exactly, but it seems to be based on a huge corpus of equivalent words and phrases that it can draw on to create natural-sounding translations. This is why Google Translate gathers such praise for breaking away from the old idea that computer translation first has to analyse the syntactic structure of the source, and then insert vocabulary items as necessary.
    My feeling is that Google Translate’s approach will only work well in the case of Western languages — languages where semantics, phraseology, and sentence structure (including word order) are relatively close to English. The relative similarity of sentences in most Western languages means that Google Translate largely gets it right, often stunningly right when compared with older translation approaches.
    But the problems with the new approach appear very clearly in “exotic” languages that don’t have such a close fit with Western European norms, languages like Chinese and Japanese. Google Translate’s famous ability to match expressions and phrases between languages still works, but the different sentence structures hit it for six. I think this is why Google Translate’s translations into Chinese are often so good at the level of the phrase and yet so horrendously poor at the level of the sentence. Anything beyond a simple SVO sentence, and Google Translate often mixes elements around in a bewildering manner. Elements that should be near the start of complex or compound sentences are quite literally thrown to the very end. ‘And’ conjunctions, which can join elements ranging from verbs, to nouns, to whole sentences in English, are just carried over into Chinese, with predictable results in terms of intelligibility. Sentences with figures and numbers are often so mashed around that they become completely unreliable. Correcting this requires an intense amount of time and effort.
    So ironically it is Google Translate’s revolutionary new approach that makes it so bad at translating to and from languages like Chinese and Japanese. The algorithm is running blind, pretty much oblivious to how sentences are structured, throwing nicely translated phrases into the mix in almost random fashion. Until Google Translate can figure out a way to parse sentences properly in the old unrevolutionary way, I suspect it will continue to do a poor job with languages outside the Western European mould.
