When Did Human Language Emerge?

Peter Dizikes writes for PhysOrg:

It is a deep question, from deep in our history: when did human language as we know it emerge? A new survey of genomic evidence suggests our unique language capacity was present at least 135,000 years ago. Subsequently, language might have entered social use 100,000 years ago.

Our species, Homo sapiens, is about 230,000 years old. Estimates of when language originated vary widely, based on different forms of evidence, from fossils to cultural artifacts. The authors of the new analysis took a different approach. They reasoned that since all human languages likely have a common origin—as the researchers strongly think—the key question is how far back in time regional groups began spreading around the world.

“The logic is very simple,” says Shigeru Miyagawa, an MIT professor and co-author of a new paper summarizing the results. “Every population branching across the globe has human language, and all languages are related.” Based on what the genomics data indicate about the geographic divergence of early human populations, he adds, “I think we can say with a fair amount of certainty that the first split occurred about 135,000 years ago, so human language capacity must have been present by then, or before.”

The paper, “Linguistic capacity was present in the Homo sapiens population 135 thousand years ago,” appears in Frontiers in Psychology.

All told, the data from these studies suggest an initial regional branching of humans about 135,000 years ago. That is, after the emergence of Homo sapiens, groups of people subsequently moved apart geographically, and some resulting genetic variations have developed, over time, among the different regional subpopulations. The amount of genetic variation shown in the studies allows researchers to estimate the point in time at which Homo sapiens was still one regionally undivided group. Miyagawa says the studies collectively provide increasingly converging evidence about when these geographic splits started taking place.

Anyone who has followed LH for any stretch of time will not be surprised that I have the gravest doubts about all this (the logic of MIT is not the Hat’s logic), and Bathrobe, who sent me the link, also feels uncomfortable about it, but I figured I’d put it out there and see what y’all have to say.

Comments

  1. They had me at “The logic is very simple.”

    This is not simple, even though they pretend to make it so in ca. 2000 words. It is not logical, because they ignore cross-lineage transfer: what if speech genes developed in only one post–135 kybp lineage, and were transferred later to others? What if some non-speaking late lineages existed but did not survive (possibly by the very disadvantage of having no language?)

    And, the links between language and symbolic behavior, as inferred from archaeology, are bold-italic-underlined *all* speculative, no matter how many papers have tried to squint hard so as to see some evidence for such.

    I propose another scenario: long before man could use language, he could wave his hands and arms in a convincing manner.

    (I personally think that language is much older than 135 ky, but I wouldn’t write a paper about it.)

  2. David Eddyshaw says

    Miyagawa has apparently decided that all languages are related because he believes in Chomskyan UG, and believes the Chomsky story that UG must have a genetic basis. This has absolutely nothing to do with actual scientific historical linguistics, and is a quite extraordinary petitio principii. As far as I can see, he has no expertise in comparative work at all.

    In fact, his premises do not actually entail that all human languages are even related, in the sense (the sole acceptable one) that the term is used in real comparative work. If humans are genetically equipped with UG, there is no reason in principle why several groups might not have developed language separately. Miyagawa presumably does not consider this possibility, because he takes for granted the Chomskyan origin story that the “language organ” arose all at once by a mutation in one lucky individual somewhere.

    It is conceivable that his arguments might still hold water without the daft statement that “all languages are related”, which (to be fair) he probably does not even intend to be interpreted in its normal linguistic sense.

    However, he does not seem to be an expert in genetics either. He does seem to be a genuine expert in Japanese linguistics (and all credit to him) and in the Chomskyan Major Arcana.

    And if he is assuming a single mutation as the “origin of language” then all this labour is pointless.

  3. Dude has surprisingly weak critical thinking skills.

  4. Language does not arise in a vacuum. In order for a new sign language to develop you need to have several deaf kids – otherwise you get home sign, which is not as versatile.

    The genes for language could have spread pretty far before one or more groups had enough language capable kids at the same time to start chattering to each other.

  5. David Eddyshaw says

    OK: I’ve read the paper.

    Boils down to: all known modern human groups have language, and all known languages have quite a bit in common typologically. (This is true, up to a point, as far as we know: at least, all known human languages share many features which are at most marginally present in other animal communication systems. This is emphatically not well expressed as “all human languages are related”, and Miyagawa should be more careful in his word choices.)

    Therefore, language probably goes back at least to the emergence of anatomically modern humans. There then follows a lot about “Khoisan” speakers as being the most genetically divergent groups of modern humans, and how far back this divergence should be projected.

    There are some potentially questionable assumptions involved in all this, but frankly it seems both plausible and – unsurprising. I can’t say I’ve ever gone for the once-popular idea that human language only arose about fifty thousand years ago.

    This is conceived, reasonably, as just a lower limit for the age of language. It has no bearing on whether, say, Neanderthals had language; that just comes back to all the old assumptions about language and symbolic culture going together. If anything, this paper would undermine any such facile linkage, because it seems to project the origin of modern-style language back to well before the “Upper Palaeolithic Revolution.”

    There’s an actual geneticist on board in the paper. No comparativists, but comparative linguistics is actually not relevant to what they’re trying to do anyway. Despite Miyagawa’s careless talk.

  6. I think that’s still not the argument. As I understand it, DNA analyses reviewed by the authors suggest that the first split of humanity into largely separate populations—between the ancestors of the Khoisan ethnic groups and the rest—occurred about 135,000 years ago. Both the Khoisan populations and the rest [slight edit there] have languages, so the genetic basis of language ability must have been present in humanity before the split (though possibly long after the emergence of anatomically modern humans). The authors ignore the possibility raised by Y that language ability might have appeared on the language-incapable side of the split by post-split genetic introgression from the language-capable side. (I have no idea how likely the DNA evidence makes that.)

    The authors then note that some archeological phenomena such as burial and various kinds of decorations become widespread around 100,000 years ago. They suggest that those phenomena might have spread due to an emergence of language around that time or a little earlier.

    Irrelevant prescriptivism: I agree with those who say that “genetic” has to do with genes, so studies that cover all the DNA of a species are DNA studies, not genetic studies.

  7. David Eddyshaw says

    Yes; it also confuses the issue that linguists say languages are “genetically related” to mean that they “descend” (metaphorically) from a common “ancestor” (also metaphorical.)

    You’d hope that the domains involved are so disjoint that confusion would not arise in practice, but experience teaches us otherwise. Alas.

    possibly long after the emergence of anatomically modern humans

    Yes, I was being careless in saying “anatomically modern humans” rather than something like “last common ancestors of all contemporary humans.” Though, personally, I find it very unlikely that modern-style language – more or less – doesn’t go at least as far back as anatomical modernity. And that makes their conclusions in the paper even less astonishing. (And even so, they still seem to me to involve some circular logic.)

  8. Trond Engen says

    When I saw the headline I thought it would be about this paper:

    Yoko Tajima et al.: A humanized NOVA1 splicing factor alters mouse vocal communications, Nature Communications 16, 2025:

    Abstract
    NOVA1, a neuronal RNA-binding protein expressed in the central nervous system, is essential for survival in mice and normal development in humans. A single amino acid change (I197V) in NOVA1’s second RNA binding domain is unique to modern humans. To study its physiological effects, we generated mice carrying the human-specific I197V variant (Nova1hu/hu) and analyzed the molecular and behavioral consequences. While the I197V substitution had minimal impact on NOVA1’s RNA binding capacity, it led to specific effects on alternative splicing, and CLIP revealed multiple binding peaks in mouse brain transcripts involved in vocalization. These molecular findings were associated with behavioral differences in vocalization patterns in Nova1hu/hu mice as pups and adults. Our findings suggest that this human-specific NOVA1 substitution may have been part of an ancient evolutionary selective sweep in a common ancestral population of Homo sapiens, possibly contributing to the development of spoken language through differential RNA regulation during brain development.

    It’s no revolution, just another small piece in the puzzle. Not much time to digest, so I’ll just throw in a couple of paragraphs more:

    Introduction
    Fossil records indicate that modern humans (Homo sapiens) emerged 200,000–300,000 years ago as the predominant species from a common ancestral population. Humans differ significantly from their closest living relatives, the great apes, particularly in their ability to communicate through complex learned vocal communication, a necessary component of spoken language. This complexity is driven by some anatomical adaptions of the vocal tract and intricate neural networks linking various brain regions. However, the genetic basis underlying these specialized human traits remains to be fully identified.

    The closest evolutionary relatives of modern humans are two extinct lineages: Neanderthals and Denisovans. Genome sequencing from fossilized remains of these archaic humans has identified distinct genetic differences between them and modern humans, which may be relevant to recent human evolution. Additionally, the availability of extensive human genome data over the past few decades, initially focused on European populations, has significantly expanded the scope of evolutionary studies.

    […]

    In this work, we used gene-editing to substitute the NOVA1 isoleucine (I) isoform present in most mammals and archaic hominids (Neanderthals and Denisovans) with the human-specific valine (V) variant at position 197 in mice. Comparison of these humanized NOVA1 mice (Nova1hu/hu) with wild-type mice carrying the ancestral Nova1 gene (Nova1wt/wt) revealed specific transcriptomic and behavior differences related to vocalization. Taken together, the unique role of NOVA1 in neurons, its association with human disease, and evidence that the human-specific amino acid 197 variant confers vocalization changes in humanized mice suggest a role for NOVA1 in the evolution of human-specific language.

    […]

    Discussion
    In line with studies of genetic variants that have played a role in the evolution of modern humans, we investigated the biological effect of a single amino acid substitution, I197V in NOVA1, which is unique to modern humans. By analyzing Nova1hu/hu mice carrying this allele, we identified molecular changes in alternative splicing in the brain, including brain regions associated with vocal behavior, and identified changes in vocalization patterns in pups and adult mice. These findings suggest that during human evolution, the I197V substitution in NOVA1 protein may have contributed to the development of neural systems involved in more complex vocal communication.

    […]

    In summary, we analyzed a single amino acid unique to modern humans in the RNA binding protein NOVA1 and examined its biological effects in vivo by introducing this amino acid in mice. NOVA1 is highly intolerant to changes in amino acid sequences during evolution with the exception of this single amino acid change in humans. We propose that this change was part of an evolutionary sweep associated with specific changes in the neuronal transcriptome and vocal communication.

  9. David Eddyshaw says

    Chomsky hisownself seems to have seized on the supposed Upper Palaeolithic Revolution as supporting his own monogene fantasy about the origin of recursion/Merge/UG/whatever.

    I wonder to what extent the far-from-obvious linkage between “behavioural modernity” and language has actually been imported into the discussion either by actual Chomskyites, or by the (sadly not few) archaeologists and geneticists who are under the impression that Chomsky’s views are actually the universally accepted wisdom on all matters linguistic?

  10. Who is to say the chimpanzee line didn’t also have speech, until giving up on it when they realized the conversation had grown stilted and they didn’t have that much to say to each other.

  11. A new survey of genomic evidence suggests our unique language capacity was present at least 135,000 years ago. Subsequently, language might have entered social use 100,000 years ago.

    I’m not following why language “social use” wouldn’t have started immediately there was language capacity. (At this time-depth, by ‘immediately’ I mean within a handful of generations.) Or if not vocal-tract language, symbolic communication through hand-waving, facial expressions, …, the whole concept of pointing ‘at’/‘to’ something.

    … how far back in time regional groups began spreading around the world.

    I’d’a thought (social) language use would be a precondition for _population_ dispersal. One individual wandering off into the bush doesn’t make a “group”/doesn’t make a viable vector for “spreading”. If you’re going to take a whole (extended) family/tribe, you need to plan/agree what to take and where to go/send scouts in advance.

    I’m wondering if this “social use” demarcation is a hallucination from Dizikes? I can’t see in the paper any such demarcation, especially:

    We wish to note the specific role that language may have played in organizing, and hence systematizing, modern behavior. Our proposal is similar to earlier suggestions by Henshilwood and others, but is based on a concrete and verifiable date of approximately 135 kya as the lower boundary for the presence of language. As the most complex communication tool yet devised in nature, it had a direct and enormous impact on all facets of human life. Language, with its complex system of mental representations and rules for combining them, is able to create new ways to connect existing symbols and predict new ways of behavior.

    I’ve extended the quote there to demonstrate the Chomsky-inspired bollocks. ‘mental representations’ surely precede language. ‘combining them’ happens organically, because the world is organic. I’d be pretty sure gorillas and dolphins/whales survive by complex representations/combinations.

    The genomic basis for 135kya is informative. The surrounding speculation is arm-waving metaphysics.

    (Sabine Hossenfelder recently put out a video suggesting the main purpose for funding academic research is to employ academics. She was chiefly talking about particle-colliders, but mutatis mutandis …)

  12. Extra! Extra! Read all about it! Fox reports grapes sour anyway!

  13. David Eddyshaw says

    the main purpose for funding academic research is to employ academics

    POSIWID!

    Chomsky-inspired bollocks

    Yeah, that was my immediate thought. Naturally, people with Language would spend tens of thousands of years Just Thinking with it before it occurred to anyone that it might help with communication. What could be more plausible than that?

    [Chomsky concocted this as an epicycle to “explain” away one of the many frankly impossible features of his language-mutation scenario, as I imagine all Hatters know.]

    Before we had AI, children, we had to pollute the noosphere manually!

    Language does not arise in a vacuum

    Now that is the absolute key observation. Language, as everybody once knew when linguistics was a branch of anthropology (rightly), is inextricably entwined with culture.

    Just as the sole interesting fact about LLMs is that they prove that many things we thought needed intelligence can in fact be carried out by stochastic parrots, the key discovery of Chomskyism is that you can get a remarkably long way in grammar without taking actual human relationships into account. And just as AI has no actual intelligence at all, so Chomskyanism can never be the basis of an adequate account of Language.

  14. To be fair, while Chomsky is referenced several times, it is either in a neutral context (abrupt vs. gradual emergence of human language), or they are arguing against his argument for a younger date.

    Not quite speaking of which, I checked G. Scholar for 2024 and 2025 papers referencing Greenberg’s LIA. None of them are about genetics or archaeology. Most seem to be substantial linguistic papers (presumably saying that “Greenberg included […] within his […] sub-phylum, but no one believes that anymore.”) A few are about automatic searches for long-distance connections. It seems like LIA may have finally obsolesced.

  15. David Eddyshaw says

    On, comrades, to The Languages of Africa!

    Magna est veritas, et praevalebit!

    (In fairness, LOA is less wrong than LIA.)

    I agree that the paper, though retailing Chomskyan tropes as truisms in an irritating way, is not actually logically dependent on specifically ANCish doctrines.

    The linguistic side of it is really typology: it wouldn’t be possible to project Language back to the ancestors of modern humans prior to our current (minimal) state of genetic diversity, if contemporary Language itself did not display some sort of unity.

    However, the more actual typologists discover about even contemporary language, the more it begins to look as if the only real thing that unites all human languages over against other animal communication systems is that humans are constrained by human physical capacities.

    So the grand conclusion of the paper boils down to humans speak Human, and Human is spoken by humans.

  16. Stu Clayton says

    I propose another scenario: long before man could use language, he could wave his hands and arms in a convincing manner.

    The article demonstrates that hand-waving is still an essential part of serious discourse.

    It’s not trivial. A limp wrist conveys so much information. And its absence is even more promising.

  17. Now I remember — one of the recent LIA references is in this paper by Johanna Nichols. She attempts to reconstruct typology far past where comparative linguistics would go. She probably knows this, but Here There Be Dragons.

  18. It’s hard to believe a paper that short can be that awkward. They never bother to define language. Dizikes felt the need and provided them with a (vague) definition in his review, but it’s not in the paper.

    Their stirring final conclusion is a sentence of ridiculousness— “we have pinpointed approximately 135kya as the moment at which some linguistic capacity must have been present in the human population.”

    Approximately ain’t a pinpoint, boys. And it isn’t “the moment” either since you left open the possibility of an earlier date. So it’s “a moment”. And “some language capacity”? WTF? Chimps have “some language capacity”. You’ve basically concluded nothing at all.

  19. Stu Clayton says

    Limp-wristed and yet bold as brass. I think “campy” is a fair description.

  20. “we have pinpointed approximately 135kya …”

    IIRC, the best the genomics can measure is the number of generations. There’s then an averaging/guessing game of mapping generations to years-per-generation to time-displacement. Was the (average) age a woman gave birth consistent over that span of time? Wouldn’t it be more honest to give a figure in terms of generations, with an acknowledged probabilistic mapping to years?

    a paper that short can be that awkward.

    Easily explained: they did no original research at all. This is a highly derivative literature review.

    This is the sort of statistical leger-de-main that gets genomics a bad name. So all too easy for you-know-who in Another Place to rail against “speculative, statistically derived results”. And although I of course defend to the death @DM’s right to object, I’m keeping my head down on that one.

  21. On the “vocal tract” paper – that would allow a scenario where language in the sense of using sounds to convey meaning had already arisen and having a more versatile vocal tract would be of social advantage, which would further the spread of that mutation.

  22. David Eddyshaw says

    She attempts to reconstruct typology far past where comparative linguistics would go.

    Yet again. It seems to be her signature tune now (a pity, given that she long since showed that she can do excellent proper linguistics. Couldn’t well-wishers stage some sort of intervention to rescue her?)

    There’s so much wrong with that paper it’s hard to know where to start. Though I notice lexical tone is one of her typological features that she thinks can substitute for actual comparative work. In Africa, the distribution of lexical tone happily crosses over not only real language family divisions, but even Greenberg’s. Et sic de similibus.

    When her methodology can actually be checked against known family relationships, it doesn’t work. There is consequently no reason to trust it in cases like these where the only checks are hypothesised “Pleistocene time windows” for the peopling of the Americas.

  23. David Marjanović says

    the main purpose for funding academic research is to employ academics

    Academics live off funding for academic research, except the lucky few who got tenure or another kind of permanent position. 😐

    Approximately ain’t a pinpoint, boys. And it isn’t “the moment” either since you left open the possibility of an earlier date. So it’s “a moment”. And “some language capacity”? WTF? Chimps have “some language capacity”. You’ve basically concluded nothing at all.

    It’s just bad writing. They concluded that 135 ka ago, give or take a thousand or three, is a terminus ante quem – it’s the last possible date when “some linguistic capacity” must have been present in the human population. I’d say “some linguistic capacity” is narrower than “some language capacity” – they still should have come out and simply said “language”, but probably a reviewer didn’t let them.

    Wouldn’t it be more honest to give a figure in terms of generations, with an acknowledged probabilistic mapping to years?

    Yes, but a Conclusions section should be aimed at a wider audience.

    It seems to be her signature tune now

    She’s painted herself into a corner. She argued against the Moscow School’s “North Caucasian” family by arguing that the Comparative Method simply cannot go that far because it has some silly time limit like 10,000 years. So when she wanted to go further (without arriving at “North Caucasian”, which must be wrong because it comes from the Moscow School…), she had to try to convince herself that some other method could go there.

    In Africa, the distribution of lexical tone

    Not just there. Latvian is a proper (register-)tone language: the phonemic tones are not limited to the stressed syllable, and stress is not phonemic.

  24. David Eddyshaw says

    Yeah, examples abound. Athabaskan, Sino-Tibetan … Scandinavian …

    It’s nothing short of astonishing that anyone could even consider typology a safe guide to genetic relatedness …

    Yoruba is quite certainly “genetically” related to Swahili. I can’t say that I’ve ever really been struck by their typological similarity.

    Goemai (largely monosyllabic words, almost entirely isolating in its morphosyntax) is related to Amharic (somewhat less monosyllabic and isolating.)

    Welsh is really quite closely related to Lithuanian … (much shallower time-depth than Kusaal and Zulu, anyhow. Or Goemai and Amharic.)

    Quechua is very reminiscent of Turkish typologically (if you ask me, which nobody did.)

    I think the idea is that if you stir together enough typological features and do some statistics, all the nonsense will sublime and you will be left with Truth.

  25. David Marjanović says

    Scandinavian …

    That can be defined away as “merely” pitch accent, like Central Franconian, Lithuanian and western South Slavic. (Or Shanghainese.) But Latvian can’t.

    I think the idea is that if you stir together enough typological features and do some statistics, all the nonsense will sublime and you will be left with Truth.

    That might even work if you can find enough typological features that are independent from each other and not too easily borrowed. I don’t think anybody has yet.

  26. the only real thing that unites all human languages over against other animal communication systems is that humans are constrained by human physical capacities.

    If I had to pick one, it would be: all human languages, and (for what little it’s worth) no known animal language, provide a way to negate a statement.

  27. David Marjanović: There’s recent suggestions that South Slavic is not even a thing? (as in Slovenian is not more closely related to FYLOSC than it is to, say, Czech?). Slovenian being a separate branch of Slavic. And the phonological common innovations Bulgarian has with FYLOSC are areal features which occurred dys-synchronically.

  28. Trond Engen says

    David E.: However, the more actual typologists discover about even contemporary language, the more it begins to look as if the only real thing that unites all human languages over against other animal communication systems is that humans are constrained by human physical capacities.

    I didn’t plan to pick up on this, hyperbole and all, but now that we do discuss it: Surely there’s something — or else you could drop a human into a community of communicating animals, and the human would become able to translate it into Human, and you could drop another human, and the two would agree on the translation.

    Apart from that, I agree. The more we learn of animal behavior, the more of what we thought was uniquely human will turn out to be shared with other species. Even language. When we crack that, we’ll also know what sets human language apart.

  29. David Marjanović says

    There’s recent suggestions that South Slavic is not even a thing?

    Not even recent; I was just being sloppy. I don’t know of any South Slavic shared innovations either.

  30. all human languages, and (for what little it’s worth) no known animal language, provide a way to negate a statement.

    Used as a plot device (one of about a million) in The Hundred-Year-Old Man Who Climbed Out the Window and Disappeared.

  31. AntC: I think “social use” was just bad writing, and Dizikes just meant use of language as distinct from an unused capacity.

    I don’t agree that language is necessary for population dispersal. Many other species are capable of long-distance dispersal, some quite dramatically, such as some invasive species.

  32. Naturally, people with Language would spend tens of thousands of years Just Thinking with it before it occurred to anyone that it might help with communication. What could be more plausible than that?

    It sounds plausible to me. Isn’t the problem with the origin of language that you can only acquire a language from someone who speaks or signs it? So how does it bootstrap itself?

    C Baker mentioned home signing, which I didn’t know about. But I’m wondering whether that only happens because the deaf child’s caregivers interpret the initial signs as language and respond to them, and are then in a position to interpret linguistically structured combinations of signs and give the child the desired result.

    For comparison, I imagine that all that’s needed for writing is language and manual dexterity, and maybe the invention of drawing, yet writing wasn’t invented for at least tens of thousands of years after the preconditions existed. And then, I think, it was only in certain kinds of cultures, when certain needs arose.

    Is Chomsky’s or whoever’s picture that people were thinking in something like language before they realized that they could use it to communicate? I’d have imagined that there was no language till it was used for communication.

    Maybe there’s evidence from deaf children or some such source that people can learn to use language without contact with any language users, but if not, I don’t see how this could be settled without the Really Forbidden Experiment of raising a large group of children without exposure to language, and if necessary, keeping their descendants isolated from language too.

  33. I don’t agree that language is necessary for population dispersal.

    I’m talking about social species that need ‘infrastructure’/support mechanisms for a viable population. Does any other species than humans have young that are entirely unable to fend for themselves for months/years after birth?

    Other species don’t have such stringent demands; but I suspect have means of sophisticated communication. The language is an ‘enhanced feature’ for what’s already happening. Dolphins/orca moving through Wellington harbour this week.

  34. i don’t think there’s any reason to distinguish home sign from any other kind of lect, whether visual or sonic – it’s just a lect with a very small population of users, most of whom usually have at least one other lect that they use with other people. like any other such lect – which will usually be a deliberately-created one (family-based conlangs come to mind) – it’ll likely owe a lot of its structure to those overlapping lects, but that may change if it survives for more than one cohort of speakers.

    but most importantly, they’re created out of use, in their specific social context, using whatever gestural/expressive materials are available, to enable whatever kinds of communication their users collectively think are necessary or pleasurable. their usual limits are about the decisions the users have made about what those forms of communication are, generally affected by the fantasy that deafness means stupidity. but those are arbitrary, not structural; if a family of mathematicians with a live-in cook developed it, a home sign could have extensive resources for communicating about set theory, but might lack any vocabulary for the kitchen.

    martha’s vineyard sign is probably the classic example of how a local sign, presumably with one or more “home sign” origin points, can become a long-lasting lect extending well beyond its geographic and social origin point, without the kind of institutional context and pidginization/creolization process that gave rise to nicaraguan sign.
