EXPLORING VIA NGRAM.

September 16, 2013 by languagehat 18 Comments

Elizabeth Weingarten has a piece in Future Tense (“a collaboration among Arizona State University, the New America Foundation, and Slate” that “explores the ways emerging technologies affect society, policy, and culture”) on Google’s Ngram Viewer; it doesn’t break any new ground, but it links to some good stuff and presents nice tidbits:

“There are hundreds of little mysteries that one can resolve with the Ngram Viewer,” says Erez Lieberman Aiden, a founding father of the Viewer and the field of Culturomics (which studies human culture and history through the lens of massive datasets) and fellow at the Harvard Society of Fellows. Take the mystery of donuts vs. doughnuts. When did the spelling change? Before the Ngram Viewer, “it would’ve taken a very long time to determine when that spelling transition took place,” Aiden explains. But according to the Viewer, the donut spelling starts to take off in the early 1950s, right around the time Dunkin’ Donuts opened its first store. Of course, it doesn’t prove that Dunkin’ Donuts alone changed the spelling—but it does add a compelling dimension to the story.

I do have to take issue with this: “The Viewer also helps corroborate larger, semantic debates—like, do words actually evolve in the Darwinian sense? … [Researchers] discovered that the verbs did undergo a kind of evolutionary process. ‘The less frequent the verb, the more rapidly it becomes irregular,’ Aiden explains.” Yeah, no. It is impossible for words, or anything non-biological, to evolve in the Darwinian sense. If you insist on using evolution as a metaphor, best to just slip it in there quietly and not try to pin it down as “Darwinian.” Because that just looks silly.

Comments

John Cowan says

September 17, 2013 at 12:01 am

It doesn’t account for the use of donut in BrE, where it is more standard than doughnut, though.
Tikitu says

September 17, 2013 at 2:16 am

Actually, Ruth Garrett Millikan has a theory of meaning built on taking this analogy extremely seriously. (I happen to think she’s on the right track, but of course YMMV.) As a logical structure Darwin’s theory requires “descent with modification” (a copying process that is not 100% accurate) plus some form of selection (non-random preferential survival of some copies over others over the long term); there are certainly ways of looking at how we use small pieces of language (words, quotations, idioms, perhaps even syntactic structures) that seem to fit.
Not that I would expect the article to have any of that in mind, but maybe you find the perspective interesting; I certainly do.
D-AW says

September 17, 2013 at 9:16 am

I like the idea of solving little mysteries. A couple I’ve taken whacks at in the last little while:
http://poetry-contingency.uwaterloo.ca/lets-you/
http://poetry-contingency.uwaterloo.ca/create-creative-creativity/
… but I’m skeptical of much of the big picture stuff (Darwinian language change being an egregious example – there are plenty of less obviously outrageous ones with currency).
marie-lucie says

September 17, 2013 at 9:50 am

“The less frequent the verb, the more rapidly it becomes irregular”
This does not make any sense at all. Irregular forms are typically very old ones which have remained in the language because they are extremely frequent, so speakers have them stored in memory and do not have to think about how to form them every time they use the word: an obvious example is the forms of the verb to be in most Indo-European languages, including English (be, been; am, are, is; was, were, plus dialectal forms). Less frequent forms tend to become regularized, not the opposite (eg OE had holp, later replaced by helped, among many such regularizations). Perhaps the author is thinking of some verbs where speakers may hesitate or re-form because there is a more common, more intuitive model: eg saying brang or brung instead of brought as the past form of bring, on the model of other Cing verbs (“C” = any consonant). This may still be considered “irregular”, but much less so than brought from the point of view of word recognition. When there is morphological change over time, regularization (conforming to a more common model) is the norm, not “irregularization” (a possible word, but not one I have run into before), unless very occasionally in a jocular context.
Ran says

September 17, 2013 at 10:44 am

@marie-lucie: But there are some examples of irregularization; consider sneaked vs. snuck and dived vs. dove. (Granted, these are the exception rather than the rule, but provided you have a large enough sample of words that underwent irregularization, it can still make sense to discuss which ones did so the fastest.)
MattF says

September 17, 2013 at 10:56 am

@m-l
I had the same thought. In fact, the greater stability of rarer linguistic forms seems to me to be a strong argument against an evolutionary model for language change.
John Cowan says

September 17, 2013 at 11:44 am

“The less frequent the verb, the more rapidly it becomes irregular”
This sounds like a bassackwards version of the truth, which is that the less frequent the verb, the more slowly it becomes regular. (Indeed, whole classes of words can become irregular at the stroke of a sound change.) Just a matter of misremembering a sentence with the negation in the wrong place. By the way, this is a fine example of how completely productive the “the … the …” sentence type is in contemporary Modern English; some grammarians seem to believe it only exists in frozen proverbs like “The more the merrier”.
As for evolution, to a biologist that is any change in allele frequencies or their expressions regardless of the cause. If more people merge Wales and whales, or use may for might, or philosophy in the sense of ‘policy’, that is unmetaphorically evolution. The question is, how much linguistic evolution is due to natural selection (which is Darwinism) and how much is due to random shifts in the cosmos (which is neutralism)? Nobody knows, but Darwinian explanations are certainly not ruled out.
Stu Clayton says

September 17, 2013 at 12:17 pm

MattF [replying to m-l]: I had the same thought. In fact, the greater stability of rarer linguistic forms seems to me to be a strong argument against an evolutionary model for language change.
I don’t see anything in marie-lucie’s comment that could be understood as a claim that “rarer linguistic forms are more stable”. She wrote:

Irregular forms are typically very old ones which have remained in the language because they are extremely frequent, so speakers have them stored in memory and do not have to think about how to form them every time they use the word …

It is the irregular verb forms that are “more stable”, not the rarer ones. “Irregular” does not mean “rare”, but rather “not conformant with regular patterns, defined as follows: [insert here a description of, say, English regular conjugations]”.
Apart from that, I’m not even sure what a “stable rare form” might be, linguistic or otherwise. The expression is a woolly ball of probabilistic and essentialist notions.
MattF says

September 17, 2013 at 5:52 pm

@Stu Clayton
That’s how I interpreted m-l’s comment “Less frequent forms tend to become regularized, not the opposite.” So, e.g., evolutionary drift makes small, isolated populations separate faster than larger populations, but rarer linguistic forms tend, in contrast, to converge to a ‘regular’ norm. But maybe this is stretching the analogy. In any case, my point is that it seems to be an analogy that doesn’t work very well.
marie-lucie says

September 17, 2013 at 7:42 pm

Ran: But there are some examples of irregularization; consider sneaked vs. snuck and dived vs. dove.
These are interesting examples, because what we see here is not so much “irregularization” but indeed a form of regularization, the creation of forms according to an existing pattern in a certain group of words, here verbs with a common phonological and phonotactic shape (one-syllable words of the shape (C)CVC). Concerning causation, the verbs in question are not so terribly frequent that they have an established, automatically used past form shared by all speakers, and more importantly from the point of view of form, their sound pattern causes speakers to classify them together with verbs with a similar sound pattern and to give them a form compatible with regularities present in verbs of this shape: the vowel-changing pattern seen in so-called ‘irregular verbs’ is still strong (especially in non-standard speech), even though the exact nature of the vowels involved is not quite predictable.
Dove instead of dived is obviously a type of regularization, away from the suffixing pattern but on the model of the existing vowel-changing pattern of the much more common verb drive:drove. Sneak is interesting because it could pattern with speak:spoke or break:broke, but the o vowel is less common than the u one (“short u”) in vowel-changing forms (see for instance the alternate participle of think in “Who’d have thunk it?”). If the verb freak (a recent word, or at least one recently become popular) sticks around for some length of time, freak could also evolve a vowel-changing past form replacing the current freaked.
John Cowan says

September 18, 2013 at 12:12 am

It’s not so much that irregular forms become regular, but that they are lost because they are not frequent enough for children to learn them thoroughly, after which regular equivalents take over. The rare preterite smote is in process of being lost from English, at which point the regular preterite smited replaces it, because people who don’t have smote stored in their memories apply a rule to construct the past tense of smite on the fly when they need it. (Of course, smite is rare to begin with, but it is still more common than smote.) So what we are seeing with these irregular forms is in fact founder effect, only it is the individual human child in linguistics who corresponds to the newly isolated group of organisms in biology.
marie-lucie says

September 18, 2013 at 2:25 pm

JC is right: it is not that “a verb becomes regular” on its own, but that speakers create a new form if they don’t know or have forgotten the old one. Small children do this (eg hitted instead of hit) even if they understand the traditional form used by their parents, but that form just does not sound right to them as its purpose (eg past tense formation) is not obvious. On the other hand, I think that the “irregularization” mentioned above (eg sneaked to snuck, actually a regularization based on the minority model) is more likely to come from teenagers and adults.
Some linguists would attribute all linguistic change to the imperfect transmission of language through generations, but I think that this position is too extreme. Both children and adults have a role, for different reasons and through different means.
juha says

September 19, 2013 at 2:27 am

verbs with a common phonological and phonotactic shape (one-syllable words of the shape (C)CVC).
One disyllabic verb with a similar structure comes to mind: arrive-arrove-arriven
John Cowan says

September 19, 2013 at 10:15 am

Juha: Arrive, being a loan word, was weak (regular) when it landed in English in the 13th century, as it is today. In the 14th and 15th centuries it was shortened to rive, and apparently picked up the inflections of the existing strong verb rive ‘pull, tear, tug, split’ (usually only the last of these today). That verb was itself borrowed from Norse, but at such an early date that it, er, arrived with its strong inflection intact, although the old preterite is now lost in favor of a weak form rived.
marie-lucie says

September 19, 2013 at 2:48 pm

arrive-arrove-arriven
Where are these forms found? in actual recorded instances, or just in some joker’s imagination?
It is true that it is disyllabic, but it could be taken for a derivative of rive (thanks JC), with the old a- prefix as in the verbs arise, alight and some others. In any case, these vowel-changing forms have not taken over from the suffixed forms.
juha says

September 20, 2013 at 1:15 am

Example usage of word arrove
The last Bushism?
John Cowan says

September 20, 2013 at 4:20 pm

Juha’s examples are surely independent creations. The OED lists the forms rove, arofe, aryven but gives no examples of the last two, only the first:
1387 J. Trevisa tr. R. Higden Polychron. Rolls Ser. VII. 87/1 Þe navy of Danes rove up at Sandwyche [Sandwicum appulit].
ktschwarz says

April 7, 2025 at 6:34 pm

I was just looking at Stephen Chrisomalis’s review of the “indie linguistics-themed horror film, Pontypool”, which cites one of the papers mentioned in the Slate piece. But Slate misrepresents it:

“The less frequent the verb, the more rapidly it becomes irregular,” Aiden explains.

marie-lucie noticed: “This does not make any sense at all.” She was right, of course; the source either misspoke (and the journalist wasn’t paying enough attention to go back and ask “Wait, didn’t you mean that the other way around?”), or was just misquoted. The journalist also got the source’s name wrong: it’s Lieberman Aiden, not Aiden.

What Lieberman Aiden actually said, in Quantifying the evolutionary dynamics of language (the 2007 paper mentioned), is: the less frequent the verb, the more rapidly it becomes *regular*. That paper was also summarized in a 2018 comment by John Cowan.

Another paper on “evolutionary forces” and verb regularization, discussed here in 2017, disputed this, saying that “In Modern English, we find that irregularization is as common as regularization”, but see that thread for criticism of their method.
