Andrew Van Dam of the Washington Post decided to investigate the question What are the most American and most British words? (archived). After a long thumb-sucking introduction (“And for columnists with more curiosity than sense, Google offers lists of millions of words, sorted by year, language and (sometimes) country of publication”) and a fairly tedious excursus on spelling (“colour,” “centre,” “behaviour”: “Much of it goes back to Webster”), he moves on to a list of “Most distinctive words in each dialect, based on how common they are in books published in each country in the 2000s” (top US words File, Schedule, Mail…; top UK words Aim, Inquiry, Catalog…), and then gets more interesting:
In search of deeper differences, we returned to Google Books’ true superpower: time. All our metrics show the two Englishes looked quite similar in the early 1800s but diverged as Webster worked his magic. The gap grew as English immigrants to the U.S. were replaced by other nationalities and the U.S. expanded farther and farther from the Atlantic, Murphy said.
The divergence halted around when World War II’s global mobilization and cooperation increased verbal cross-pollination between the two countries, she told us, “and then the explosion and export of U.S. popular culture and mass comms increased contact.”
You can see this in words such as forever, which used to be a distinctly American spelling of “for ever.” After a rapid rise in the U.K. in the 20th century, it’s just about equally popular in the two dialects. Ditto for payroll, driveway, passageway and viewpoint. Even locate and location, once derided as tasteless and improper Americanisms, have burrowed deep into both dialects.
Today, data shows British English and American English may more closely resemble each other vocabulary- and spelling-wise than they have at any other point in history. (We say “may” because Google Books has slowed its roll in recent years and the latest decade of data may be less representative.)
That really rattled our rib cages. Are the Englishes more similar than we thought? Are the U.S. and U.K., once divided by a common language, now reunited by one?
We mentioned our bafflement to our former colleague Daron Taylor, the mastermind who created the Department’s videos. Daron made a brilliant point: The differences between the two dialects are better heard than seen.
We’ve been looking at published books. And before a book publishes, editors pile in and polish it into a more standard English — one that feels increasingly intercontinental. We can’t be the only ones who sometimes don’t notice the news story we clicked on came from, say, the Guardian until we hit that (well-earned) plea for donations at the bottom.
And, as Murphy hinted to us, the bulk of books published over the past four centuries would strike most people as rather dry and workmanlike. Google Books indexes endless volumes on, say, weevil prevalence or the finer points of Windows 95. So of course book-derived data won’t surprise or delight us quite as often as actual human speech and action.
The problem? We couldn’t get the equivalent of Google Books for speech, at least not on short notice. The book database took decades and hundreds of millions of dollars to assemble. But we found a shortcut.
You see, we have the words from more than a million English-language television shows and movies, courtesy of OpenSubtitles and OPUS. We don’t know what show or country each word came from, but we can use subtitles to rate the out-loud-ness of any English word by comparing its popularity in movies to its popularity in books. […]
We used our new measure to focus our book-word analysis on those words that people say out loud with at least some regularity. And with every crank of the out-loud-ness dial, we watch the two dialects get less and less similar, and more and more hilarious.
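[The column does not give its formula for out-loud-ness. As a rough sketch only, assuming it is something like the log of a word's share of subtitle tokens divided by its share of book tokens, the measure and the "dial" might look like this in Python. The function names, the pseudo-count, and the toy counts are all invented for illustration and are not the Post's data or method.]

```python
import math

def out_loud_ness(word, subtitle_counts, book_counts, pseudo_count=1.0):
    """Log ratio of a word's share of subtitle tokens to its share of book tokens.

    The pseudo-count is an assumption: it keeps the ratio finite for words
    that are missing from one of the two sources.
    """
    sub_total = sum(subtitle_counts.values())
    book_total = sum(book_counts.values())
    sub_share = (subtitle_counts.get(word, 0) + pseudo_count) / sub_total
    book_share = (book_counts.get(word, 0) + pseudo_count) / book_total
    return math.log(sub_share / book_share)

def spoken_words(vocab, subtitle_counts, book_counts, threshold):
    """Keep only the words at or above the chosen setting of the 'dial'."""
    return [
        w for w in vocab
        if out_loud_ness(w, subtitle_counts, book_counts) >= threshold
    ]

# Invented toy counts, not real corpus figures.
subtitle_counts = {"bruh": 50, "the": 1000, "prevalence": 1}
book_counts = {"bruh": 1, "the": 1000, "prevalence": 50}
vocab = sorted(subtitle_counts)

print(spoken_words(vocab, subtitle_counts, book_counts, threshold=1.0))
```

[With these made-up counts, a word heard mostly in subtitles scores above zero, a bookish word scores below zero, and a word equally common in both scores zero, so raising the threshold leaves only the talkier vocabulary.]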
We start with relatively harmless terms. Footballers and whingeing stand out on one side, and statewide and nonfat stand out on the other. But it escalates quickly. British rises from manky and dodgy to shagging to knobhead and bruv. American goes from hoagie and doggone to homegirl and loogie to — at the slangiest echelons of the language — cornhole and bruh.
And those are just the ones they’ll let us print. The widest cross-pond gaps in slang seem to lie among the even-ruder versions of, say, knucklehead or pillock.
So, it sure seems to us that much of the apparent similarity between British and American English applies first and foremost to the dialects as written and published in books.
We blame good editing and a slow convergence in vocabularies and spelling standards. But we won’t know the culprit with any confidence until we round up more data on how folks actually speak.
Thanks, Eric!
ffs, just shell out a few hundred bucks for lists from the Davies project???
Would Jeff Bezos approve such a crazy expenditure?
/mild rant
working with subtitles is an interesting approach – though it raises some serious problems, given that many if not most of the most recent ones are LLM products, and not always close to what’s being said. and i question the idea of taking as typical speakers the narrow cohort of men who make up most tv writers’ rooms and get the most screenplays produced (especially for slang). but the framing and analysis? phew.
the slangiest echelons of the language — cornhole and bruh.
And those are just the ones they’ll let us print. …the even-ruder versions of, say, knucklehead or pillock.
um, slangiest? really? and respectably printable? really?
“bruh” is a variation of “bro” that was commonplace on at least one ivy-league campus thirty years ago. it’s now still informal, but hardly slang, if we understand that as having some relationship to novelty, disreputability, subculturalness, or young speakers. i’m not sure which version of “cornhole” is intended here, but i’m pretty sure beavis and butthead were making “cornholio” jokes in the mid-1990s, using an “anal sex” meaning that’d been around decades longer. okay, sure, slang, but more in the manner of “a stone fox” than, say, “nussy” – and what possible standard of non-rudeness is being appealed to here?*
.
* a slightly rhetorical question, because obviously it’s the one george carlin made famous, where the actual content doesn’t matter as long as you don’t use specific lexemes.
Ben Yagoda and Lynne Murphy blogged on a 2018 paper with very different results from “an online crowdsourcing study* involving over 220,000 people”. About awareness rather than frequency of usage.
*Is it still possible for reputable academics to do online crowdsourcing studies?
I don’t know how it affects the observation exactly, but I always see plenty of articles in The Guardian written by Americans.
‘Catalog’ (sic) is British? Really?
Was “rattled our rib cages” supposed to be a joke? I find it hard to take writing about demotic* usage seriously when it involves misuse and/or misunderstanding of standard idiomatic metaphorical expressions. Moreover, what are they talking about with cornhole? It’s a word that started with a very regional usage profile for a couple of older inoffensive meanings and which was long ago** adopted as slang for anal sex, seemingly because it includes “hole” and sounds funny. Is this writer unaware that most uses of cornhole (regardless of medium) are still references to the beanbag tossing game?
* Spellcheck software seems to have come full circle. In ninth grade, for one English assignment, our twenty-three-year-old teacher said we could not turn in our two-page writing assignments until we could show that the word processing software’s spell checker reported no misspelled words. I don’t remember the name of the program, but it was one of the first ones that tried to provide feedback about why you might have misspelled a word and what you actually meant. However, its dictionary was woefully deficient, including telling me at one point that I shouldn’t abbreviate “demonstration” by shortening it to “demon.”
Now, auto correction changes “demotic” when I type it to “demonic.” I pine for Screwtape’s simple typewriter.
** In contrast, I would be surprised if bunghole as commonplace slang for anything is older than Beavis and Butthead. Note that in the original context, the over-caffeinated Cornholio persona uses bunghole to mean both “anus” and “mouth,” as well as in more nonsensical ways.
@Brett
Definitely older than Beavis and Butthead, I heard someone described as “[NAME] B**tf**k [LAST NAME], the bunghole bandit” as early as 1978 in the US. I do not believe this was an original coinage.
Green takes it back to… wait for it… 1611. (A classy bit of verse from 1682: Mennis & Smith et al. ‘On a Fart’ “And what is working Ale I pray But Farting Barm which makes away At Bunghole, with Farting noise.”)
I’d have taken that one as literal – the beer is foaming away while brewing in the barrel, and makes farting noises when it comes out the hole where the bung is.
If there were no other evidence, sure, it could be purely literal, but as it is it’s clearly playing on both senses (cf. the 1611 Cotgrave quote: “A small and ouglie fish, or excrescence of the sea, resembling a man’s bung-hole, and called the red Nettle”).
On the relatively few occasions when I have done a search limited to either the “American” or “British” subcorpus of the google books corpus and then dug into the actual hits to look at context etc. I have found quite a lot of texts that seemed geographically miscoded. But maybe my experience is unrepresentative or they’ve gotten better. Or maybe you can have a reasonably high percentage of miscoding without that noise overcoming a signal as clear as color v. colour.
I think you’re right, JiE. That witty and droll poem, in Mennis et al.’s Wit and Drollery, is an enumeration of such comparisons.
Surely there is an allusion to the other meaning, but this is just one of a number of comparisons and the rest are punless.
Good point. I’m glad it’s not the earliest citation.
Okay, not entirely punless: “Music’s but a Fart that’s sent / from the Guts of an Instrument.” “Farts are as good as Land, for both / We hold in Tail, and let ’em both.” (I can tell that “hold in tail” is a pun on some legal term, but I don’t know it.)
OED s.v. tail:
E.g. 1796 “All estates given in tail..shall become fee simple estates to the issue of the first donee in tail” (J. Morse, American Universal Geography (new edition) vol. I. 463).
I expect J.W. Brewer will have something to say about that.
Entailment is one of those historical things you tend to learn about in your first-year property law class in law school and then get fuzzy about because it’s no longer actively a thing in the U.S., with the several states (each on its own timeline) having abolished it along with various other of the more feudal details of inherited English land law more than 200 years ago.* My firstborn just started law school (not necessarily on my advice …) and I do hope her property class next semester will still contain a reasonable number of no-longer-immediately-relevant antiquities like that.
*Indeed, I suspect that 1796 quote is describing the mechanism via which one of those abolition statutes was being phased in.
Jane and Elizabeth attempted to explain to her the nature of an entail.
No, they’re not the only ones who don’t mouse over a link to see what it is before they decide to actually click on it.
Why that is is beyond me.
Jen: ‘Catalog’ (sic) is British? Really?
That ranking was calculated after merging spellings, i.e. counting “catalog” and “catalogue” as the same word. Probably the list should’ve shown it as “catalog(ue)”.
My subjective feelings agree that “aim” (especially as in the aim of the project) and “inquiry” seem more British than American, but “catalog(ue)”? Huh.
Since they got the numbers from subtracting one country’s frequency from the other, the top-ranked words have to be high-frequency, which I think is why they’re so boring. (Even the columnist admits that his result is “even more boring”!) The survey that mollymooly mentioned is more fun; its top 10 asymmetrically recognized words were:
UK: tippex, biro, tombola, chipolata, dodgem, yob, gazump, abseil, naff, kerbside
US: manicotti, ziti, tilapia, garbanzo, kabob, kwanza, crawdad, hibachi, sandlot, acetaminophen
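(To illustrate the point about subtraction: a small Python sketch, with per-million frequencies I made up, comparing a ranking by absolute difference with a ranking by log ratio. The ratio ranking is my own suggestion of an alternative, not something the column or the survey used; the survey measured recognition, not frequency.)

```python
import math

# Invented per-million frequencies as (US, UK); for illustration only.
freqs = {
    "aim": (80.0, 150.0),
    "schedule": (120.0, 60.0),
    "gazump": (0.01, 0.5),
    "crawdad": (0.4, 0.01),
}

def by_difference(freqs):
    """Rank by absolute frequency gap; common words dominate."""
    return sorted(freqs, key=lambda w: abs(freqs[w][0] - freqs[w][1]), reverse=True)

def by_log_ratio(freqs):
    """Rank by relative gap; rare but lopsided words rise to the top."""
    return sorted(
        freqs,
        key=lambda w: abs(math.log(freqs[w][0] / freqs[w][1])),
        reverse=True,
    )

print(by_difference(freqs))  # ['aim', 'schedule', 'gazump', 'crawdad']
print(by_log_ratio(freqs))   # ['gazump', 'crawdad', 'schedule', 'aim']
```

(A real ratio ranking would need smoothing or a minimum-count cutoff, since rare words have noisy counts and a zero in either column breaks the division.)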
Failed spelling merger with curbside?
The thing at the side of the road is a “kerb” in Brit. So you park your lorry or van at the kerbside. No mergers were harmed in the creation of this word.
What do projects have in America? Are they just aimless?
The US sign “curb your dog” is confusing to Brits.