La Grande Rousse has an entry today on Webster’s Online Dictionary: The Rosetta Edition. This remarkable (if somewhat annoying) site scrapes up huge quantities of information about virtually any string of conjoined letters you can find on the internet (check out the list of items beginning with aa), calling them all “words” and offering definitions (often from the 1913 edition of Webster’s Unabridged), synonyms, crossword definitions, “commercial usages,” images, quotations, usage frequency (telling you how often the word is used as what part of speech, although “midwife” is supposedly used as a noun 100.00% of the time, which is clearly untrue), “Frequency of Internet Expressions,” “Modern Translations,” “Ancestral Language Translations,” “Bible Trace,” “Matched Bible Translations,” “Derivations & Misspellings,” rhymes, “Alternative Orthography” (hexadecimal, Leonardo da Vinci, ASL, semaphore, Braille, &c, even including Arthur Conan Doyle’s “dancing men”), “Bibliographic Items” (mostly media references), and who knows what all. Much of this stuff is cute but useless; what’s of primary interest to Languagehat, of course, is the translations, and I regret to say they are not to be depended on. You’d expect problems with a multivalent word like set or bow, so I tried whale, which seemed fairly straightforward, but here is the entirety of the Bulgarian entry:

ходя на лов за китове [hodya na lov za kitove], шибам [shibam] (beat, cut, drive, flog, lash, scourge, slash, swinge, switch), кит [kit] (mastic, paste), нещо огромно [neshto ogromno] (sockdolager), напердашвам [naperdashvam] (clobber, dress down, lace, lambaste, larrup, lather, paddle, pepper, skin, thrash), бия [biya] (bang, beat, chime, club, curry, feeze, go, hammer, hide, hit, kill, knoll, lace, lather, lay, lick, maul, palpitate, peal, pelt, pulsate, pulse, ram, ramrod, ring, rough up, shoot, strike, swingle, thrash, thresh, wallop, welt, whip, whop, zap).

There is exactly one useful translation here, kit, and there’s no way to tell that’s the one you want unless you know Bulgarian. The word for ‘shit’ in Danish is given as junk and the Dutch as shit; I don’t know either language, but I have grave doubts about both alleged translations. For Russian it gives der’mo, which is one possibility but hardly the only one—the basic equivalent for the noun is govno and for the verb srat’, neither of which seems to be known to this Webster’s.

Speaking of which, why “Webster’s”? As they say on their About page, “In no way (other than a common lexicographical heritage) is this project related to dictionaries bearing the trademark or name ‘Merriam-Webster’ (Merriam-Webster, Inc.)… Nor are we affiliated with other book publishers that have created printed or electronic dictionaries bearing the name of Webster.” They begin their explanation:

We were originally interested in honoring Samuel Johnson, but after Black Adder (played by Rowan Atkinson and written by Richard Curtis and Ben Elton) so brilliantly lampooned Dr. Johnson, we simply needed another name. Of course, the name of Johannes Gutenberg was already taken by the very worthwhile Project Gutenberg Electronic Public Library, and we did not want to cause any confusion. We were more than pleased to finally honor Noah Webster…

But eventually they get around to what I presume is the true reason:

Webster’s, often spelled Websters, has fallen into public use as a general word for “American English” or even “dictionary” when one is searching for a definition using Internet search engines. By naming the site and its URL with the term “Webster’s”, we stand a far greater chance of being found on the Internet, thus increasing the impact of this project. No apologies for this are given.

And none are needed, but I hope they improve the product.
Oh, by the way: the rhymes for shit are given as “backseat, beat, beet, cheat, cleat, compete, complete, conceit, concrete, deceit, defeat, delete, deplete, discreet, discrete, downbeat, eat, effete, elite, excrete, feat, feet, fleet, greet, heat, incomplete, indiscreet, leet, Marguerite, meat, meet, mete, mistreat, neat, offbeat, peat, petite, pleat, receipt, repeat, replete, retreat, seat, secrete, Skeet, sleet, Street, suite, sweet, teat, treat, tweet, unseat, wheat.” Something’s clearly gone wrong here. And the synonyms are “lo, lo and behold! O! heyday! halloo! what! indeed! really! surely! humph! hem! good lack, good heavens, good gracious! Ye gods! good Lord! good grief! Holy cow! My word! Holy shit!, gad so! welladay! dear me! only think! lackadaisy! my stars, my goodness! gracious goodness! goodness gracious! mercy on us! heavens and earth! God bless me! bless us, bless my heart! odzookens! O gemini! adzooks! hoity-toity! strong! Heaven save the mark, bless the mark! can such things be! zounds! ‘sdeath! what on earth, what in the world! who would have thought it!; (inexpectation); you don’t say so! You’re kidding!. No kidding? what do you say to that! nous verrons! how now! where am I?” All I can say is, gad so!


  1. Alas, I can’t speak for Bulgarian, but I had a look at the Spanish and Japanese translations for your last search, and they were both accurate. I also tried “linguist” and “ridiculous”, and got translations I know to be acceptable (although I can’t vouch for all of the Japanese translations; my command of the language isn’t that good). I wouldn’t be surprised if they simply had much better references available for those languages than Bulgarian, though.

  2. Well, yes, the bulk of the translations are acceptable (if not always directly on target); the point is that the rate of failure is high enough that you can’t depend on the site to give you the word you need. If you have a bookcase full of foreign-language dictionaries, as I do, you can check and verify… but then you don’t need the site, do you? I’m aware that it would be prohibitively expensive (and/or take forever) to have people go through the words they skim from their online lexica and weed out the ringers, but the result is that it’s not that helpful.

  3. lackadaisy!

  4. Well, “whale” is used colloquially to mean “beat”, so that explains most of the Bulgarian entry, does it not?
    I find it impossible to believe that you don’t already know that, but at the same time find it odd you didn’t allude to that in the discussion, so hey.

  5. Their translation of shit into Danish must be the result of some mistake. ‘Junk’ is slang for drugs!
    (And so is ‘shit’, by the way). A reasonable translation of shit into Danish: ‘Lort’.

  6. But is “whale” used colloquially to mean “beat” in Bulgarian? I think Languagehat gets a pass on that one.

  7. And “shit” in Dutch refers to certain cannabis products.

  8. Of course I know that. My point is that the likelihood of someone’s using an English-Bulgarian translation tool to look up “whale” in the sense of ‘whale on, beat’ is infinitesimal compared to the likelihood of their wanting to know the Bulgarian for ‘a whale,’ and the entry is useless for the latter. In fact, if you try a number of words you’ll find that a great many of the renderings are translations of some synonym of the desired word, often fairly uncommon. This is a problem of using automatic lexicon searching without human input.

  9. *kof* Hoity-toity! Lackaday! I have ystruck mine finger with a hammer.
    And this is useful how exactly?

  10. Err, isn’t that exactly the point of our host’s post?

  11. Good to see comments about a first draft of a site I just put up a few months ago (I am the editor of The debate over the translations is justified. I am racing to add as many as possible. I am doing a meta-analysis across translations, which has its flaws. We will be adding some 2.9 million more words and will sort the table outputs by most frequently encountered (less frequent translations will come last). As time goes on, the results will converge to better translations. Since the EU uses many “drug-related” translations of “shit” and other words, strange results appear, as I relied in part on the EU.
    Thanks for the thoughtful comments. Phil

  12. Ah, that’s good to know. Sorry to sound so snarky in my post, but I was reacting to an enthusiastic recommendation by La Grande Rousse, and felt a little let down. I look forward to further improvements; it’s certainly a promising tool.

