ChatGPT, Bing, Bard, or DeepL?

Karin Kaneko’s Japan Times article “ChatGPT, Bing, Bard and DeepL: Which one offers the best Japanese-to-English translation?” provides some detailed comparisons:

As a reporter in Japan working in both English and Japanese, I was keen to test these tools and see which might be best for my line of work, and whether they offer benefits for Japan Times readers. […] Specifically, we wanted to see which one of the AI-assisted tools best understands context in Japanese, a language in which sentences sometimes lack a subject, and accurately translates it into natural English.

Kanako Takahara, a senior bilingual Japan Times editor, compared Japanese-to-English translations from ChatGPT-4, Bing and Bard, along with DeepL, an AI-based machine translation service, using Japanese text in three different categories:

• Literature: The opening lines of “Snow Country,” authored by Nobel laureate Yasunari Kawabata
• Lyrics: Japan’s national anthem “Kimigayo”
• Speech: Japanese superstar Shohei Ohtani’s speech to his teammates just before the World Baseball Classic final against the United States in March

She scored on a scale of 1 to 5, with 5 being the highest based on accuracy, how natural the English was and whether the translation reflected the context. […]

Based on the above results, ChatGPT-4 scored the highest overall, closely followed by Bard and Bing. Translations varied for “Snow Country” with all of the tools aside from Bard unable to capture the subject of the story, suggesting that the translation of literature may not be among their strengths.

Note that AI chatbots may produce different responses on different devices or with different timings or various other factors such as the way you phrase your instruction, which means that if users were to translate the above text, the result may differ from what The Japan Times had.

While this factor also makes it difficult to pick which of the AI-assisted tools is the best, ChatGPT-4 easily offers the highest quality, according to Tom Gally, an expert in Japanese and English at the University of Tokyo who has been experimenting with the AI language models.

Gally said that what makes the large language models superior compared to previous tools is the ability to interact with humans. Therefore, the quality of answers that generative AI tools produce depends largely on the way we phrase the questions or instructions to the AI, which is called “prompt engineering.” […]

Generative AI tools are also able to understand the context of a text and appear to remember the content of the previous sentences, whereas previous language models could only translate sentence by sentence without remembering the previous one, which can change the meaning of the original text. […]

What was clear, though, was that AI chatbots’ translations were much better than those of DeepL — presumably because of their ability to capture the context. But what was also notable was that none of the texts were translated with 100% accuracy, which means humans would need to check and make necessary edits.

Visit the link for the sample translations (with originals) and thoughts on the future of human translators. Thanks, Bathrobe!


  1. For the Ohtani speech, Bard reads more idiomatic and therefore more comprehensible, even if it’s less precise, as they say.

    For Kimigayo, there are probably many translations out there for the models to crib from.

    All the translations for Snow Country have serious enough glitches to make the models untrustworthy (“scattered coldly”, “A young girl … slid down the glass window”, “before reaching the hue of the snow, it was swallowed up by the darkness”, “had a fur hat hanging from his ears”, etc.)

  2. From my tests of technical writing and business correspondence in Japanese to English and vice versa, ChatGPT wins.

    “humans would need to check and make necessary edits”

    However, as with everything else in ChatGPT, one has to read the results carefully, because unlike the other translation engines, ChatGPT has that well-known tendency to spew total nonsense in a blandly convincing manner.

    In my experience, errors range from using inappropriate words, which are easily corrected, to wandering off on a tangent that looks superficially very accurate but is really totally off the mark and would get you into trouble if you used it as-is.

  3. Well, ChatGPT-4 and Bing did best at looking up a human translation of the anthem. So yeah, I guess they could replace the journalist, who apparently didn’t bother to check if that was what they’d done.

    Human post-editing of machine translations is already a thing, and has been for a while, and is the reason why a lot of fantastic technical and commercial translators are being forced to change careers mid-life, and why literary translators, on the whole, are even more depressed than usual. Because post-editing is soul-killing, and of course it’s not going to save you real labor, if you’re someone who cares about quality translation and knows what that entails. The stuff that AI can do is the easy part, the preparation. It’s the first reading. But it will be a great excuse to pay people even less. And it will take away most of the non-literary jobs that allow people to make a living as translators and to develop the skills and intuition that are so important when approaching a literary text.

  4. By the way, I keep meaning to ask: why does Google no longer offer to translate hits in foreign languages? That was extremely convenient.

  5. cuchuflete says

    Google? Works as designed, just like “sin” for humans. They want you to use Chrome, and only Chrome, and add the Google translation utility.

    Of course that doesn’t help if the search results include more than one source language, or, for example, more than one variety of Spanish. Of course, if you don’t care that a melocotón has been inaptly rendered as durazno, or your ordenador has become a computadora…

  6. Bathrobe says


    Yes, I can sympathise with everything you said!

    I used to do real translation (Japanese-English) many years ago. Learnt lots of Japanese from it.

    Then, about a decade ago, I got involved in purely functional translation from English to Chinese (just for the content, not for the style). I used Google Translate, then post-edited it. I didn’t find it soul-destroying at all (I don’t mind doing that sort of thing), but it sure made me lazy! And I didn’t learn much Chinese in the process. I suspect that if I ever start translating again, I’ll end up using translation software with a bit of cleaning up…. It is kind of depressing that most of the grunt work can be done by machines. And that the apprenticeship for literary translation that grunt work involves will disappear.

    Re: Snow Country

    The first sentence of Snow Country is a celebrated one. I’ve read a pretty negative appraisal of the first sentence in Seidensticker’s translation (sorry, I can’t remember where, or by who, or in which language). From memory, Seidensticker’s translation goes something like “The train came out of the long tunnel into the snow country”. The problem is that it sounds like an observer from above looking down on a train coming out of a tunnel. That is NOT the sense conveyed by the Japanese (国境の長いトンネルを抜けると雪国であった。) The Japanese conveys the impression experienced by someone ON the train. “When the train (I was on) came out of the long tunnel, it was (=I was in) the snow country.” All of the translations struggle with this sentence:

    When the train emerged from the long tunnel at the border, it was a snowy country.

    Upon exiting the long tunnel at the national border, there was the snow country.

    After passing through a long tunnel at the border, we were in a snowy country.

    When we came out of the long tunnel and the snow country began,

    But only the first comes even close to committing Seidensticker’s error — possibly because, as Biscia suggests, it might have looked at existing translations. Otherwise, why did it put “the train” in the first sentence, which is not in the Japanese?

    (I will repeat here my disdain for Edward Seidensticker. His translations inevitably flattened the style of the original Japanese, showing a predilection for converting complex sentences into simpler, shorter, less complex sentence structures, where everything is a quiet monotonous drone. The best way to appreciate this is to read the four books in Mishima’s “Sea of Fertility” tetralogy, which were translated by three different translators/translation teams. The volume that Seidensticker translated has a jarringly different style from the other three books. He truly has his own style, which shines through whether he is translating The Tale of Genji, Kawabata, or Mishima.)

  7. If translation is not your job or your passion, GT simply is useful progress. In my job I quite frequently have to reuse material that was written in German or English in the other language, or, more rarely, put it into Russian. Before GT, I would have had to spend a lot of time on translating it myself (nobody will give you budget for handing it to a translator, except if it’s contracts or documentation needed for legal purposes). Now I can put it through GT and then edit it – something I mostly have to do anyway because the material has to be adapted for its new use. GT also helps if material is in a language which I’m not fluent in, like French – I don’t have to look up all the words I don’t know, but knowing the target language and the technical subject matter well, I’m positive that I can find any mistakes GT commits.

  8. Well, I suppose Biscia is right: they will pay less to human translators.

    @Bathrobe, I’m confused, because both your literal (?) translation and the first example begin the same way…

  9. they will pay less to human translators

    …have been paying less (or eliminating them). Right from the earliest days of Google Translate.

  10. Perhaps.
    I know someone who was translating novels for a well-known series for $300 a novel in the early 2000s in Russia. So the dynamics are complicated here :)

  11. John Cowan says

    GT wasn’t released until 2006, and at that point was probably useless for Russian.

  12. Bathrobe says

    @ drasvi

    I’m not sure of the reason for your confusion.

    Seidensticker’s translation was something like:

    The train came out of the long tunnel into the snow country

    It sounds like someone is looking down at the scene of a train coming out of a tunnel.

    When the train emerged from the long tunnel at the border, it was a snowy country.

    looks like it could even have been based on Seidensticker’s translation. Biscia suggested that ChatGPT-4 might be in the business of looking up human translations. Could that be what happened here?

    The Japanese original suggests not that an observer was hovering in the air looking down at the train, but that he was a passenger on the train. So this is not a distant picture of a train emerging from some tunnel. The scene presented is that of a passenger looking out the window.

  13. Bathrobe says

    The crucial parts are:

    1) the train is not the subject of the Japanese sentence, so the coming out of the tunnel isn’t necessarily focused on the experience of the train.

    2) 雪国であった — “it was the snow country”. The sentence doesn’t say the train came into the snow country. It says that, after coming through the tunnel (and the old provincial boundary line), it was the snow country, a Japanese turn of phrase that might be expressed in English as “we were in the snow country” or “we’d arrived in the snow country”.

  14. Is there any reason why “the train” could not just be “our train” instead?

  15. Bathrobe says

    @ Brett

    Of course not — or, better, “my train”. But that requires a certain subtlety of understanding and creativity in the translation system. And whether that would result in the desired “literary quality” required in translating a literary work is another question.

  16. Bathrobe, sorry: below you wrote that ChatGPT could only come up with its translation if it knows Seidensticker’s and other translations.
    But then ChatGPT’s translation is nearly identical to yours (and I thought your translation was literal).

    It sounds like someone is looking down at the scene of a train coming out of a tunnel.
    I also wanted to ask about “our train”.

  17. Bathrobe says

    below you wrote that ChatGPT could only come up with its translation if it knows Seidensticker’s and other translations

    I didn’t write that “ChatGPT could only come up with its translation if it knows Seidensticker’s and other translations”. I simply wondered aloud whether it MIGHT have looked at Seidensticker’s translation, as Biscia suggested.

    Actually, my literal translation was mistaken. It should not have been “When the train (I was on) came out of the long tunnel, it was (=I was in) the snow country.” It should have been “When (I/we/someone/something) came out of the long tunnel, it was (=I was in) the snow country.” My apologies.
