Son of Yamnaya.

In this 700+-comment thread, which seems to have become a dumping-ground for all DNA-related commentary, Dmitry Pruss said mildly but convincingly:

An ob gripe, I don’t think that it’s the best idea to discuss “everything DNA” in this, already oversize, thread…

So I’m hereby opening this as a continuation. If you have thoughts about genomic components and Denisovan signatures, this is the place for them!

Comments

  1. Trond Engen says

    I did recently link to a paper on Mongolia and the, eh, genetogenesis of the Xiongnu and the Mongolians in They Perished like Avars, where we have discussed much Post-Indo-European Steppe stuff. It didn’t attract any follow-up comments, so feel free to move it here.

  2. David Eddyshaw says

    Will we be seeing “Bride of Yamnaya” in due course?

  3. And possibly eventually Second Cousin Twice Removed of Yamnaya.

  4. Canonically, “Bride of Yamnaya” should have come first. Then “Son of Yamnaya”, then “Ghost of Yamnaya”, then “Yamnaya meets Dravidian”, then “House of Yamnaya”. The final(?) one should probably be a duo of comedians meet Yamnaya — I’d suggest “Nyland and Goropius”, but there is a surfeit of them to choose from.

    /Hammertime

  5. David Eddyshaw says

    Yamnaya in the KONGO. (Perhaps too controversial for these politically correct times …)

  6. Oh, the wind that blew through the whiskers on the flea in the hair on the tail of
    the dog of the daughter of the wife of the Dayak has just come to town….

  7. A new paper by Ioannidis et al., Native American gene flow into Polynesia predating Easter Island settlement, is the most careful approach I have seen toward demonstrating early Polynesian-American contact using genetics. The paper finds an American genetic signature in Eastern Polynesian populations. What distinguishes this paper from earlier such studies is that it clearly separates the purported American signal from a European one; that it dates both plausibly; and that it clearly distinguishes different coastal American populations, and ties the source of the Polynesian signal specifically to a population in Colombia.

  8. Interesting!

  9. Trond Engen says

    Y: Ioannidis et al 2020

    We briefly discussed it here back in July. I still haven’t read the full text.

    Dmitry (in Mother of Yamnaya): (Huang et al 2020).

    I love it. This seems to take historical genetics to a whole new level, using the sheer power of numbers to shake out genetic commonalities that can be traced back to a common ancestor. The multi-ethno-linguistic matrixes are essentially the comparative method on genomes, but used to identify the oldest common elements rather than to reconstruct a complete ancestral genome.

  10. Trond Engen says

    A few seconds late to edit I meant to add a few random observations:

    They identify a “Northeast Asian Cluster”, which must be more or less identical with what Jeong et al dub ‘Ancient North Asians’ in the paper on the genetic history of Mongolia.

    They identify a gene flow from “European” into “Inland South Asian” (likely including the group speaking Proto-Sino-Tibetan) at ~5800 kA. I wonder where that came from.

    Note the predictive force. They posit a yet unsampled group in a specific location and with a specific genetic signature as the linguistic ancestors of Kra-Dai.

  11. Thanks, Trond. I somehow missed that discussion (and what followed it, which was very interesting, too.)

  12. Trond Engen says

    (This connects to several discussions. Most immediately me (July 29, 2020 at 7:25 am) in They perished like Avars:

    I got the Yu paper (thanks!) and just finished reading it. Not much time to digest, but my takeout is that it’s a complementation to what we already knew about Eastern Siberia. On the ancient and basal level, it fills out the picture of the North Asian population that is ancestral to Non-Arctic Native Americans. Additionally it starts to untangle the movements and admixtures of the Late Neolithic and Early Bronze Age that eventually would lead to the formation of the ethnic and linguistic groups we know today. The plague is part of that. Intriguingly it’s found in two individuals without Steppe ancestry. They were from the same site, but one of them had migrated in his early childhood. The date and the strain of Yersinia pestis are practically identical to those of a Corded Ware individual from the Baltic. This fits well with the population crisis in Scandinavia before the arrival of the Bell Beakers.)

    A new Siberian archaeo-genetics paper from Dmitry on Facebook:

    Kılınç et al “Human population dynamics and Yersinia pestis in ancient northeast Asia” Sci. Adv. 2021; 7 : eabc4587.

    Their conclusions corroborate and expand on earlier studies:

    Conclusions
    Northeast Asia, particularly the Baikal adjacent area and the entire Russian Far East, presents a complex demographic picture with hitherto unknown genetic shifts since the post-LGM. The Trans-Baikal area displays few genetic turnovers with an extended period of genetic continuity over a period of c.6000 years. This unique demographical pattern throughout the Holocene stands in sharp contrast to the recurrent gene flow events of Cis-Baikal and Yakutia. We document that the human group that was represented by Khaiyrgas-1 must have dispersed to Yakutia after the LGM. This group was genetically distinct from the first inhabitants of the Siberia who settled the area before the LGM. The genetic legacy of this group is visible among human groups in the area ~6000 years later. Our data fit well with Belkachi groups as having key position in the ancestry of Paleo-Inuits who launched the second wave of gene flow into the Americas c.5000 years ago. We also document the presence of the most northeastern occurrence of ancient Y. pestis in the less populated Yakutia region and in the highly connected Cis-Baikal area. The bacterium may well have had consequences in shaping human population dynamics in both regions, visible in the reduction in the effective population size and the genetic diversity levels ~4400 years ago. Consistent with the finding of the same bacterium in the Lake Baikal region during the Bronze Age ([Yu et al 2020]), this finding suggests that a plague pandemic in this part of northeast Asia could be a hypothesis worth exploring with more data. Our results demonstrate a complex demography in northeast Asia from the Late Upper Paleolithic up until the Medieval era in which Siberian populations expanded interacting with each other and with populations from distant geographical areas.

    Apart from that, the paper is unusually hard to digest, but the supplementary text is more readable. I guess they had to edit it down to the available number of pages. Anyway…

    Expanding on Yu et al this study makes it very likely that there was a regional plague pandemic at around the year 2500 BCE. In my view the region could well turn out to be the entire continent, but there’s not enough data to tell yet. A few centuries later the bacterium is found at the Kolyma river in the farthest end of Yakutia. I’d think that to spread in a sparsely populated region like that, this early incarnation of the plague must have been both a slow killer and independent of rats.

    Something close to the source population of Paleo-Eskimos (and hence also of Na-Dene?) is found with the sampling of two 7th millennium BP individuals from the region of Yakutsk in Yakutia, associated with the Belkachi culture and its immediate predecessor, the Syalakh culture. This is in line with earlier hypotheses based on material culture. The two seem to be close to a mid-9th millennium BP individual from east of Lake Baikal, also in line with hypotheses from material culture. All three are said to show genetic affinity with modern Chukotko-Kamchatkans — as is Saqqaq.

    Two 5th millennium BP individuals from the Lena Basin and three 4th millennium BP individuals from the Kolyma River further northeast form a distinct group, apparently descended from Syalakh/Belkachi with additional admixture from southeast. An interesting outlier was left unmentioned in the main text but shown on the maps and diagrams, where it’s intriguingly grouped with the Yakutian individuals. This is a mid-5th millennium BP individual from south of Krasnoyarsk who seems to fit perfectly within the contemporary population in the Lena Basin. Is this the first Yeniseian? This individual also shows evidence of a recent genetic bottleneck, in line with the plague hypothesis.

    The Syalakh/Belkachi cultures are also thought to be ancestral to the Bronze Age Ymyyakhtakh culture that spread almost explosively along the Arctic coast in the late 2nd millennium BCE. The Kolyma individuals are late enough that they could be part of this movement, but I can’t find anything on their cultural affinity.

  13. Trond Engen says

    I’m on record suggesting a Dene-Yeniseian homeland (providing there is such a thing as Dene-Yeniseian) on the Arctic Coast. I’ll note that a mobile riverine culture on the Lena could easily spill over into the Yenisei Basin via the Angara or Tunguska rivers (or vice versa). It’s the distance from there to Alaska that disturbs me. The ancestors of the Paleo-Eskimos (and/or Na-Dene) would have migrated from the Lena Basin long before those of the Syalakh-Belkachi descendants found on the Kolyma River. For the Yeniseian branch to have been brought from the Lena to the Yenisei in the 5th millennium BP, we’d have to suppose that the stay-behind groups on the Lena were Pre-Proto-Yeniseian for a long time, even as new East Asian groups moved into the area and were integrated in its genetic profile. If so, also the movers north should be (Para-)Pre-Proto-Yeniseians. Maybe these coastal Leniseians were yukagrified from the west.

  14. It is a mainstream view in Russia that Yukaghir languages came into the region with the Bronze Age Ymyyakhtakh culture in the late 2nd millennium BC.

    From the Baikal region, but their original homeland was further west, closer to the Urals.

  15. Trond Engen says

    Since this is the Great Eurasian Plague thread, I’ll link to

    Julian Susat et al: A 5,000-year-old hunter-gatherer already plagued by Yersinia pestis Cell Reports, 2021

    Summary
    A 5,000-year-old Yersinia pestis genome (RV 2039) is reconstructed from a hunter-fisher-gatherer (5300–5050 cal BP) buried at Riņņukalns, Latvia. RV 2039 is the first in a series of ancient strains that evolved shortly after the split of Y. pestis from its antecessor Y. pseudotuberculosis ∼7,000 years ago. The genomic and phylogenetic characteristics of RV 2039 are consistent with the hypothesis that this very early Y. pestis form was most likely less transmissible and maybe even less virulent than later strains. Our data do not support the scenario of a prehistoric pneumonic plague pandemic, as suggested previously for the Neolithic decline. The geographical and temporal distribution of the few prehistoric Y. pestis cases reported so far is more in agreement with single zoonotic events.

    (Link from Dmitry, as usual)

    The oldest and most basal strain of Y. pestis yet has been discovered in a 5300-5050 cal. BP hunter-gatherer from northern Latvia. Needless to say, the last line of the summary is controversial. But since every instance of the plague is a zoonosis, the controversy is really about whether the plague is spreading as a pandemic among rodent parasites on human society. It’s when Y. pestis becomes pandemic (or endemic) among rodents, and the bacterium gains the ability to infect humans easily, that the zoonosis in humans becomes pandemic by extension.

    Here’s a thought-provoking paragraph from the discussion:

    Modern Y. pestis can be transmitted from animals (e.g., rodents) to humans (Demeure et al., 2019). It is possible that hunter-gatherers, who frequently killed rodents for food or personal decoration, contracted Y. pestis or its antecessor Y. pseudotuberculosis directly from animals. Interestingly, at the Riņņukalns site, beaver (Castor fiber) was the most frequently recorded species among the archaeozoological finds excavated by Sievers (Rütimeyer, 1877). Beavers are a common carrier of Y. pseudotuberculosis, which directly precedes our early Y. pestis strain (Gaydos et al., 2009). Despite this interesting observation, it remains unknown to what degree hunter-gatherers may have played a role in the zoonotic emergence, early evolution, or spread of Y. pestis.

    The question of how a rodent disease could trigger a human pandemic in a sparsely populated region of Eurasia is important, and I’m trying to get my head around it upthread. So maybe the authors are right and the first instances were isolated infections. Maybe, even, that the plague never was pandemic among hunter-gatherers. Still, I think hunter-gatherers with early Y. pestis are significant. As increasingly deadly and/or infectious strains of Yersinia spread among wild rodents, hunter-gatherers (beaver hunters?) could have adapted (by non-fatal exposure or genetic selection) before it infected the rats of the agriculturalists, and they would then be in a position to fill the void after the first plague.

  16. @Trond Engen: It seems to me that the terminology with early plague strains is a bit tricky, being based, in part, on what appears to have been an erroneous assumption about how the modern plague strain developed (and thus where to place the dividing line between Yersinia pseudotuberculosis and Y. pestis). As I understand it, it is conventional to call all bacterial lineages with the pMT1 and pPCP1 plasmids Y. pestis. However, it is now known from very early genomes (and according to that Cell Reports paper, the Rinnukalns genetic data confirms this) that pMT1 was actually acquired without the gene for the key virulence factor ymt (Yersinia murine toxin). Only later was the gene for ymt, which makes it much easier for the bacteria to thrive inside the flea vectors, added to the plasmid, meaning that assimilation of the plasmid itself was not one of the primary enablers in the development of pestis-level human infectivity. Nomenclature will presumably get even trickier if fossil genomes with only one of the pMT1 or pPCP1 plasmids are found (which has not, to my knowledge, been observed thus far).

  17. Trond Engen says

    @Brett: Thanks. I couldn’t have written that, but I agree.

  18. Dmitry Pruss on Facebook linked to these interesting papers:

    The origins and spread of domestic horses from the Western Eurasian steppes:

    Our results reject the commonly held association between horseback riding and the massive expansion of Yamnaya steppe pastoralists into Europe ~3,000 BCE driving the spread of Indo-European languages. This contrasts with the situation in Asia where Indo-Iranian languages, chariots and horses spread together, following the early second millennium BCE Sintashta culture.

    Dairying enabled Early Bronze Age Yamnaya steppe expansions:

    Our results point to a potential epicentre for horse domestication in the Pontic–Caspian steppe by the third millennium bc, and offer strong support for the notion that the novel exploitation of secondary animal products was a key driver of the expansions of Eurasian steppe pastoralists by the Early Bronze Age.

  19. horse domestication … the novel exploitation of secondary animal products

    Is transport an “animal product”? Or is primarily horse dung meant?

  20. Aren’t dairy products the reference there — meat being the only use considered primary?

  21. Yes, if one bothers to click through, one finds (bolding added):

    Here we draw on proteomic analysis of dental calculus from individuals from the western Eurasian steppe to demonstrate a major transition in dairying at the start of the Bronze Age. The rapid onset of ubiquitous dairying at a point in time when steppe populations are known to have begun dispersing offers critical insight into a key catalyst of steppe mobility. The identification of horse milk proteins also indicates horse domestication by the Early Bronze Age, which provides support for its role in steppe dispersals.

    Of course, the reference to “dairying” in the title might have been a clue.

  22. Adding a reference to “Horsing around” in the title would have been yet another welcome clue.

  23. Perhaps “dairying” was misread as “draying”, a nonce synonym for “drayage” . . .

  24. It’s the same paper jack morava linked to in another thread recently. I haven’t had time to read it, but I immediately like it. And I don’t know if it rejects the horse hypothesis as much as nuances and complements it.

  25. @Trond Engen,

    I started trying to take this seriously when I saw

    https://en.wikipedia.org/wiki/Trundholm_sun_chariot

  26. David Marjanović says

    Open-access paper on how Japan was settled in three stages.

  27. David Marjanović says

    Five open-access papers and their abstracts:

    Ancient Mitochondrial Genomes Reveal Extensive Genetic Influence of the Steppe Pastoralists in Western Xinjiang

    The population prehistory of Xinjiang has been a hot topic among geneticists, linguists, and archaeologists. Current ancient DNA studies in Xinjiang exclusively suggest an admixture model for the populations in Xinjiang since the early Bronze Age. However, almost all of these studies focused on the northern and eastern parts of Xinjiang; the prehistoric demographic processes that occurred in western Xinjiang have been seldomly reported. By analyzing complete mitochondrial sequences from the Xiabandi (XBD) cemetery (3,500–3,300 BP), the up-to-date earliest cemetery excavated in western Xinjiang, we show that all the XBD mitochondrial sequences fall within two different West Eurasian mitochondrial DNA (mtDNA) pools, indicating that the migrants into western Xinjiang from west Eurasians were a consequence of the early expansion of the middle and late Bronze Age steppe pastoralists (Steppe_MLBA), admixed with the indigenous populations from Central Asia. Our study provides genetic links for an early existence of the Indo-Iranian language in southwestern Xinjiang and suggests that the existence of Andronovo culture in western Xinjiang involved not only the dispersal of ideas but also population movement.

    Genomic Insight Into the Population Admixture History of Tungusic-Speaking Manchu People in Northeast China

    Manchu is the third-largest ethnic minority in China and has the largest population size among the Tungusic-speaking groups. However, the genetic origin and admixture history of the Manchu people are far from clear due to the sparse sampling and a limited number of markers genotyped. Here, we provided the first batch of genome-wide data of genotyping approximately 700,000 single-nucleotide polymorphisms (SNPs) in 93 Manchu individuals collected from northeast China. We merged the newly generated data with data of publicly available modern and ancient East Asians to comprehensively characterize the genetic diversity and fine-scale population structure, as well as explore the genetic origin and admixture history of northern Chinese Manchus. We applied both descriptive methods of ADMIXTURE, fineSTRUCTURE, FST, TreeMix, identity by descent (IBD), principal component analysis (PCA), and qualitative f-statistics (f3, f4, qpAdm, and qpWave). We found that Liaoning Manchus have a close genetic relationship and significant admixture signal with northern Han Chinese, which is in line with the cluster patterns in the haplotype-based results. Additionally, the qpAdm-based admixture models showed that modern Manchu people were formed as major ancestry related to Yellow River farmers and minor ancestry linked to ancient populations from Amur River Basin, or others. In summary, the northeastern Chinese Manchu people in Liaoning were an exception to the coherent genetic structure of Tungusic-speaking populations, probably due to the large-scale population migrations and genetic admixtures in the past few hundred years.

    Genomic Insight Into the Population Structure and Admixture History of Tai-Kadai-Speaking Sui People in Southwest China

    Sui people, which belong to the Tai-Kadai-speaking family, remain poorly characterized due to a lack of genome-wide data. To infer the fine-scale population genetic structure and putative genetic sources of the Sui people, we genotyped 498,655 genome-wide single-nucleotide polymorphisms (SNPs) using SNP arrays in 68 Sui individuals from seven indigenous populations in Guizhou province and Guangxi Zhuang Autonomous Region in Southwest China and co-analyzed with available East Asians via a series of population genetic methods including principal component analysis (PCA), ADMIXTURE, pairwise Fst genetic distance, f-statistics, qpWave, and qpAdm. Our results revealed that Guangxi and Guizhou Sui people showed a strong genetic affinity with populations from southern China and Southeast Asia, especially Tai-Kadai- and Hmong-Mien-speaking populations as well as ancient Iron Age Taiwan Hanben, Gongguan individuals supporting the hypothesis that Sui people came from southern China originally. The indigenous Tai-Kadai-related ancestry (represented by Li), Northern East Asian-related ancestry, and Hmong-Mien-related lineage contributed to the formation processes of the Sui people. We identified the genetic substructure within Sui groups: Guizhou Sui people were relatively homogeneous and possessed similar genetic profiles with neighboring Tai-Kadai-related populations, such as Maonan. While Sui people in Yizhou and Huanjiang of Guangxi might receive unique, additional gene flow from Hmong-Mien-speaking populations and Northern East Asians, respectively, after the divergence within other Sui populations. Sui people could be modeled as the admixture of ancient Yellow River Basin farmer-related ancestry (36.2–54.7%) and ancient coastal Southeast Asian-related ancestry (45.3–63.8%). 
    We also identified the potential positive selection signals related to the disease susceptibility in Sui people via integrated haplotype score (iHS) and number of segregating sites by length (nSL) scores. These genomic findings provided new insights into the demographic history of Tai-Kadai-speaking Sui people and their interaction with neighboring populations in Southern China.

    Peopling History of the Tibetan Plateau and Multiple Waves of Admixture of Tibetans Inferred From Both Ancient and Modern Genome-Wide Data

    Archeologically attested human occupation on the Tibetan Plateau (TP) can be traced back to 160 thousand years ago (kya) via the archaic Xiahe people and 30∼40 kya via the Nwya Devu anatomically modern human. However, the history of the Tibetan populations and their migration inferred from the ancient and modern DNA remains unclear. Here, we performed the first ancient and modern genomic meta-analysis among 3,017 Paleolithic to present-day Eastern Eurasian genomes (2,444 modern individuals from 183 populations and 573 ancient individuals). We identified a close genetic connection between the ancient-modern highland Tibetans and lowland island/coastal Neolithic Northern East Asians (NEA). This observed genetic affinity reflected the primary ancestry of high-altitude Tibeto-Burman speakers originated from the Neolithic farming populations in the Yellow River Basin. The identified pattern was consistent with the proposed common north-China origin hypothesis of the Sino-Tibetan languages and dispersal patterns of the northern millet farmers. We also observed the genetic differentiation between the highlanders and lowland NEAs. The former harbored more deeply diverged Hoabinhian/Onge-related ancestry and the latter possessed more Neolithic southern East Asian (SEA) or Siberian-related ancestry. Our reconstructed qpAdm and qpGraph models suggested the co-existence of Paleolithic and Neolithic ancestries in the Neolithic to modern East Asian highlanders. Additionally, we found that Tibetans from Ü-Tsang/Ando/Kham regions showed a strong population stratification consistent with their cultural background and geographic terrain. Ü-Tsang Tibetans possessed a stronger Chokhopani-affinity, Ando Tibetans had more Western Eurasian related ancestry and Kham Tibetans harbored greater Neolithic southern EA ancestry. Generally, ancient and modern genomes documented multiple waves of human migrations in the TP’s past. 
    The first layer of local hunter-gatherers mixed with incoming millet farmers and arose the Chokhopani-associated Proto-Tibetan-Burman highlanders, which further respectively mixed with additional genetic contributors from the western Eurasian Steppe, Yellow River and Yangtze River and finally gave rise to the modern Ando, Ü-Tsang and Kham Tibetans.

    The Opportunities and Challenges of Integrating Population Histories Into Genetic Studies for Diverse Populations: A Motivating Example From Native Hawaiians

    There is a well-recognized need to include diverse populations in genetic studies, but several obstacles continue to be prohibitive, including (but are not limited to) the difficulty of recruiting individuals from diverse populations in large numbers and the lack of representation in available genomic references. These obstacles notwithstanding, studying multiple diverse populations would provide informative, population-specific insights. Using Native Hawaiians as an example of an understudied population with a unique evolutionary history, I will argue that by developing key genomic resources and integrating evolutionary thinking into genetic epidemiology, we will have the opportunity to efficiently advance our knowledge of the genetic risk factors, ameliorate health disparity, and improve healthcare in this underserved population.

  28. Thanks for those!

  29. David Marjanović says

    I haven’t had time to read them myself yet.

  30. Trond Engen says

    Thanks! Will read.

  31. Trond Engen says

    I’m frantically trying to get up to date with all these papers.

    The origins and spread of domestic horses from the Western Eurasian steppes has shown up as open access in Nature.

    (Quoting a bit more)

    Abstract
    Domestication of horses fundamentally transformed long-range mobility and warfare. However, modern domesticated breeds do not descend from the earliest domestic horse lineage associated with archaeological evidence of bridling, milking and corralling at Botai, Central Asia around 3500 BC. Other longstanding candidate regions for horse domestication, such as Iberia and Anatolia, have also recently been challenged. Thus, the genetic, geographic and temporal origins of modern domestic horses have remained unknown. Here we pinpoint the Western Eurasian steppes, especially the lower Volga-Don region, as the homeland of modern domestic horses. Furthermore, we map the population changes accompanying domestication from 273 ancient horse genomes. This reveals that modern domestic horses ultimately replaced almost all other local populations as they expanded rapidly across Eurasia from about 2000 BC, synchronously with equestrian material culture, including Sintashta spoke-wheeled chariots. We find that equestrianism involved strong selection for critical locomotor and behavioural adaptations at the GSDMC and ZFPM1 genes. Our results reject the commonly held association between horseback riding and the massive expansion of Yamnaya steppe pastoralists into Europe around 3000 BC driving the spread of Indo-European languages. This contrasts with the scenario in Asia where Indo-Iranian languages, chariots and horses spread together, following the early second millennium BC Sintashta culture.

    […]

    Discussion
    This work resolves longstanding debates about the origins and spread of domestic horses. Whereas horses living in the Western Eurasia steppes in the late fourth and early third millennia BC were the ancestors of DOM2 horses, there is no evidence that they facilitated the expansion of the human genetic steppe ancestry into Europe as previously hypothesized. Instead of horse-mounted warfare, declining populations during the European late Neolithic may thus have opened up an opportunity for a westward expansion of steppe pastoralists. Yamnaya horses at Repin and Turganik carried more DOM2 genetic affinity than presumably wild horses from hunter-gatherer sites of the sixth millennium BC (NEO-NCAS, from approximately 5500–5200 BC), which may suggest early horse management and herding practices. Regardless, Yamnaya pastoralism did not spread horses far outside their native range, similar to the Botai horse domestication, which remained a localized practice within a sedentary settlement system. The globalization stage started later, when DOM2 horses dispersed outside their core region, first reaching Anatolia, the lower Danube, Bohemia and Central Asia by approximately 2200 to 2000 BC, then Western Europe and Mongolia soon afterwards, ultimately replacing all local populations by around 1500 to 1000 BC. This process first involved horseback riding, as spoke-wheeled chariots represent later technological innovations, emerging around 2000 to 1800 BC in the Trans-Ural Sintashta culture. The weaponry, warriors and fortified settlements associated with this culture may have arisen in response to increased aridity and competition for critical grazing lands, intensifying territoriality and hierarchy. This may have provided the basis for the conquests over the subsequent centuries that resulted in an almost complete human and horse genetic turnover in Central Asian steppes. 
    The expansion to the Carpathian basin, and possibly Anatolia and the Levant, involved a different scenario in which specialized horse trainers and chariot builders spread with the horse trade and riding. In both cases, horses with reduced back pathologies and enhanced docility would have facilitated Bronze Age elite long-distance trade demands and become a highly valued commodity and status symbol, resulting in rapid diaspora. We, however, acknowledge substantial spatiotemporal variability and evidential bias towards elite activities, so we do not discount additional, harder to evidence, factors in equine dispersal.

    Our results also have important implications for mechanisms underpinning two major language dispersals. The expansion of the Indo-European language family from the Western Eurasia steppes has traditionally been associated with mounted pastoralism, with the CWC serving as a major stepping stone in Europe. However, while there is overwhelming lexical evidence for horse domestication, horse-drawn chariots and derived mythologies in the Indo-Iranian branch of the Indo-European family, the linguistic indications of horse-keeping practices at the deeper Proto-Indo-European level are in fact ambiguous (Supplementary Discussion) . The limited presence of horses in CWC assemblages and the local genetic makeup of CWC specimens reject scenarios in which horses were the primary driving force behind the initial spread of Indo-European languages in Europe. By contrast, DOM2 dispersal in Asia during the early-to-mid second millennium BC was concurrent with the spread of chariotry and Indo-Iranian languages, whose earliest speakers are linked to populations that directly preceded the Sintashta culture. We thus conclude that the new package of chariotry and improved breed of horses, including chestnut coat colouration documented both linguistically (Supplementary Discussion) and genetically (Extended Data Fig. 8), transformed Eurasian Bronze Age societies globally within a few centuries after about 2000 BC. The adoption of this new institution, whether for warfare, prestige or both, probably varied between decentralized chiefdoms in Europe and urbanized states in Western Asia. The results thus open up new research avenues into the historical developments of these different societal trajectories.

    That is, the rapid spread of Steppe ancestry in Europe wasn’t carried on horseback. It wasn’t even bringing horses along. Horses bred for riding and chariotry came a millennium later, and we may ask what upheavals that caused.

  32. David Marjanović says

    Oddly, the paper on the genetic history of the Tibetan plateau does not cite this one by a reshuffling of the same authors, published half a year earlier in the same journal.

  33. Trond: the contrast between the Indo-Iranian speaking zone and the European zone (=Indo-European minus Anatolian, Tocharian and Armenian) is reminiscent of another such contrast: whereas in Europe there is a layer of place-names that seems to go back to a nearly undifferentiated Indo-European (=Krahe’s “Old European”), nothing like this has been found in the Indo-Iranian zone.

    So, if I may speculate wildly here: perhaps in Europe the original bearers of Steppe ancestry (whatever the original cause(s) of their expansion across Europe may have been) were the first wave of Indo-European speakers and were the ones responsible for “Old European” place-names, with a later wave of Indo-Europeanization in Europe taking place as a result of the domestication of the horse (wholly replacing, partly replacing, or simply heavily influencing, the older Indo-European varieties? All possible. Would it be possible to associate the later Indo-Europeanizing wave with some linguistic innovation(s)/isogloss(es) within Europe? The RUKI-rule, perhaps?). Because in the Indo-Iranian zone the spread of Indo-European was due to the domestication of the horse, Indo-Iranian spread across an area which, unlike Europe, had not been previously Indo-Europeanized.

    Hm. Thinking out loud here…this scenario fits with the linguistic facts well for another reason: Indo-Iranian shows far more non-Indo-European influence (in vocabulary) than any of the European branches of Indo-European, which seems odd since we have much older records for Indo-Iranian than for European Indo-European (All other things being equal, we would expect that the languages attested earlier would show less outside influence).

    BUT…if Indo-Iranian did indeed spread at the expense of non-Indo-European languages, and (later, domestic horse-using) European Indo-European spread over a substrate of (earlier) Indo-European varieties (which may well have formed a dialect continuum with, and been mutually intelligible with, the expanding varieties), well, that would explain the contrast, wouldn’t it?

    Okay, enough with the soliloquy, thoughts, anyone?

  34. I like it, but I await what better-informed folks will have to say.

  35. Two potentially important considerations are

    1) Male-dominated, Steppe ancestry-rich population turnovers in Europe didn’t end with the initial spread of Yamnaya across Dnieper and up the Danube. Rather, they just started. A replacement event every 200-300 years all the way to the end of the III millennium BCE. As documented recently e.g. in
    https://www.science.org/doi/10.1126/sciadv.abi6941

    2) Military technologies spread somewhat independently of the main population moves. Take Sintashta. Their principal genetic ancestors were the Fatyanovo who didn’t have horses and didn’t live on the Steppe in the first place. They were grazing sheep and raising pigs on the riverside meadows of the forests. Fatyanovo did form an early offshoot East of Volga since they needed copper ores lacking in their core lands. It overlaps in time with Turganik and its proto-domestic horses, but Turganik is assigned to a very different Samara culture. Even though it’s geographically close to Sintashta which soon flourished. But Sintashta is known to have been something of a melting pot, with the DNA of some of Sintashta remains being quite dissimilar from its main population. So I believe that Sintashta got its horses from one of the Steppe populations which contributed less, genetically, to its population. A sort of a culturally important minority. The horse technology could have spread West in the same way, as a cultural import or a contribution of a minority group.

  36. David Marjanović says

    Leiden press release, pretty long and readable. I still haven’t read the real thing, but I’ll try to do that right away.

    in Europe there is a layer of place-names that seems to go back to a nearly undifferentiated Indo-European (=Krahe’s “Old European”)

    There may not actually be any evidence for this. Check out Piotr’s conference presentation “Against Old European: Why we need to be more specific” from 2012.

    More than one wave of IE language spread in western & central Europe (not to mention the Apennine and Balkan peninsulas with the whole Crotonian business, or the Italic loanwords in Proto-Slavic) is a given, e.g. because the Bell Beaker expansion was simply too early to be Proto-Celtic or anything like that. But those probably all came from within that region; e.g., there’s no reason, AFAIK, to assume the origin of Celtic was any farther east than Bavaria.

    1) Male-dominated, Steppe ancestry-rich population turnovers in Europe didn’t end with the initial spread of Yamnaya across Dnieper and up the Danube. Rather, they just started. A replacement event every 200-300 years all the way to the end of the III millennium BCE. As documented recently e.g. in

    That’s a fascinating open-access article! Abstract:

    Europe’s prehistory oversaw dynamic and complex interactions of diverse societies, hitherto unexplored at detailed regional scales. Studying 271 human genomes dated ~4900 to 1600 BCE from the European heartland, Bohemia, we reveal unprecedented genetic changes and social processes. Major migrations preceded the arrival of “steppe” ancestry, and at ~2800 BCE, three genetically and culturally differentiated groups coexisted. Corded Ware appeared by 2900 BCE, were initially genetically diverse, did not derive all steppe ancestry from known Yamnaya, and assimilated females of diverse backgrounds. Both Corded Ware and Bell Beaker groups underwent dynamic changes, involving sharp reductions and complete replacements of Y-chromosomal diversity at ~2600 and ~2400 BCE, respectively, the latter accompanied by increased Neolithic-like ancestry. The Bronze Age saw new social organization emerge amid a ≥40% population turnover.

  37. Trond Engen says

    Fascinating it is. This opens so many doors it’s hard to know where to go. For now. Soon this fine-grained approach will be applied everywhere, and we’ll have migration history on a level I couldn’t imagine.

    Obviously, the whole process of indoeuropeanization just became very chaotic. I was surprised, but I realize I shouldn’t be. A culture established through full replacement of the male line won’t stop having male lines replaced just like that. The period from the late 4th to the early 2nd millennium BCE was one long migration era, and migrations went in all directions. Funnel Beaker and Globular Amphora were intrusive to Bohemia from the north. Then came Corded Ware from the northeast, probably (but not necessarily) introducing Indo-European, Bell Beaker from the west, and then Únětice from the northeast again.

    So why did it calm down for a while in the Bronze Age? Or is the Bronze Age just as restless, and we just haven’t got the data yet?

  38. David Marjanović says

    The genomic origins of the Bronze Age Tarim Basin mummies in open access. Abstract:

    The identity of the earliest inhabitants of Xinjiang, in the heart of Inner Asia, and the languages that they spoke have long been debated and remain contentious. Here we present genomic data from 5 individuals dating to around 3000–2800 BC from the Dzungarian Basin and 13 individuals dating to around 2100–1700 BC from the Tarim Basin, representing the earliest yet discovered human remains from North and South Xinjiang, respectively. We find that the Early Bronze Age Dzungarian individuals exhibit a predominantly Afanasievo ancestry with an additional local contribution, and the Early–Middle Bronze Age Tarim individuals contain only a local ancestry. The Tarim individuals from the site of Xiaohe further exhibit strong evidence of milk proteins in their dental calculus, indicating a reliance on dairy pastoralism at the site since its founding. Our results do not support previous hypotheses for the origin of the Tarim mummies, who were argued to be Proto-Tocharian-speaking pastoralists descended from the Afanasievo or to have originated among the Bactria–Margiana Archaeological Complex or Inner Asian Mountain Corridor cultures. Instead, although Tocharian may have been plausibly introduced to the Dzungarian Basin by Afanasievo migrants during the Early Bronze Age, we find that the earliest Tarim Basin cultures appear to have arisen from a genetically isolated local population that adopted neighbouring pastoralist and agriculturalist practices, which allowed them to settle and thrive along the shifting riverine oases of the Taklamakan Desert.

    Later on:

    In contrast to the EBA Dzungarian individuals, the EMBA individuals from the eastern Tarim sites of Xiaohe and Gumugou (Tarim_EMBA1) form a tight cluster close to pre-Bronze Age central steppe and Siberian individuals who share a high level of ancient North Eurasian (ANE) ancestry (for example, Botai_CA).

    […]

    Outgroup f3 statistics supports a tight genetic link between the Dzungarian and Tarim groups (Extended Data Fig. 2A). Nevertheless, both of the Dzungarian groups are significantly different from the Tarim groups, showing excess affinity with various western Eurasian populations and sharing fewer alleles with ANE-related groups (Extended Data Fig. 2b, c). To understand this mixed genetic profile, we used qpAdm to explore admixture models of the Dzungarian groups with Tarim_EMBA1 or a terminal Pleistocene individual (AG3) from the Siberian site of Afontova Gora, as a source (Supplementary Data 1D). AG3 is a distal representative of the ANE ancestry and shows a high affinity with Tarim_EMBA1. Although the Tarim_EMBA1 individuals lived a millennium later than the Dzungarian groups, they are more genetically distant from the Afanasievo than the Dzungarian groups, suggesting that they have a higher proportion of local autochthonous ancestry. Here we define autochthonous to signify a genetic profile that has been present in a region for millennia, rather than being associated with more recently arrived groups.

    […]

    The Tarim_EMBA1 and Tarim_EMBA2 groups, although geographically separated by over 600 km of desert, form a homogeneous population that had undergone a substantial population bottleneck, as suggested by their high genetic affinity without close kinship, as well as by the limited diversity in their uniparental haplogroups (Figs. 1 and 2, Extended Data Fig. 4, Extended Data Table 1, Supplementary Data 1B and Supplementary Text 4). Using qpAdm, we modelled the Tarim Basin individuals as a mixture of two ancient autochthonous Asian genetic groups: the ANE, represented by an Upper Palaeolithic individual from the Afontova Gora site in the upper Yenisei River region of Siberia (AG3) (about 72%), and ancient Northeast Asians, represented by Baikal_EBA (about 28%) (Supplementary Data 1E and Fig. 3a). Tarim_EMBA2 from Beifang can also be modelled as a mixture of Tarim_EMBA1 (about 89%) and Baikal_EBA (about 11%). For both Tarim groups, admixture models unanimously fail when using the Afanasievo or IAMC/BMAC groups as a western Eurasian source (Supplementary Data 1E), thus rejecting a western Eurasian genetic contribution from nearby groups with herding and/or farming economies. We estimate a deep formation date for the Tarim_EMBA1 genetic profile, consistent with an absence of western Eurasian EBA admixture, placing the origin of this gene pool at 183 generations before the sampled Tarim Basin individuals, or 9,157 ± 986 years ago when assuming an average generation time of 29 years (Fig. 3b). Considering these findings together, the genetic profile of the Tarim Basin individuals indicates that the earliest individuals of the Xiaohe horizon belong to an ancient and isolated autochthonous Asian gene pool. This autochthonous ANE-related gene pool is likely to have formed the genetic substratum of the pre-pastoralist ANE-related populations of Central Asia and southern Siberia (Fig. 3c, Extended Data Fig. 2 and Supplementary Text 5).
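
    A quick sanity check of the arithmetic in the passage above, as a rough sketch rather than anything the authors did. The generation count, generation time, reported date and the qpAdm percentages are taken from the quoted text. The sample age is an assumption made here: the midpoint of the quoted 2100–1700 BC range, counted back from AD 1950. Chaining the two qpAdm models is likewise only naive arithmetic on the "about" percentages, not a fitted model.

```python
# Figures taken from the quoted passage.
GENERATIONS = 183
GENERATION_YEARS = 29
REPORTED_YEARS_AGO = 9157
REPORTED_ERROR_YEARS = 986

# ASSUMPTION (not stated in the quote): the sampled individuals are dated to
# the midpoint of 2100-1700 BC, and "years ago" is counted from AD 1950.
SAMPLE_AGE_YEARS = (2100 + 1700) // 2 + 1950  # 1900 BC -> 3850 years


def formation_years_ago(generations, generation_years, sample_age_years):
    """Years before the reference date at which the gene pool formed."""
    return generations * generation_years + sample_age_years


def compose(mixture, profiles):
    """Expand a mixture of sources into the components of those sources.

    mixture:  source name -> weight
    profiles: source name -> {component name -> fraction}
    """
    result = {}
    for source, weight in mixture.items():
        for component, fraction in profiles[source].items():
            result[component] = result.get(component, 0.0) + weight * fraction
    return result


years_ago = formation_years_ago(GENERATIONS, GENERATION_YEARS, SAMPLE_AGE_YEARS)
error_generations = REPORTED_ERROR_YEARS / GENERATION_YEARS

# qpAdm proportions as quoted ("about 72%", "about 28%", "about 89%", "about 11%").
profiles = {
    "Tarim_EMBA1": {"AG3": 0.72, "Baikal_EBA": 0.28},
    "Baikal_EBA": {"Baikal_EBA": 1.0},
}
tarim_emba2 = compose({"Tarim_EMBA1": 0.89, "Baikal_EBA": 0.11}, profiles)

print(years_ago)            # 183 * 29 + 3850
print(error_generations)    # the quoted error expressed in generations
print({name: round(share, 4) for name, share in tarim_emba2.items()})
```

    Under that assumed sample age, the 183 generations reproduce the quoted 9,157 years, and the ± 986 years corresponds to 34 generations. Substituting the first model into the second would put Tarim_EMBA2 at roughly 64% AG3-related and 36% Baikal_EBA-related ancestry.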

    […]

    Although the harsh environment of the Tarim Basin may have served as a strong barrier to gene flow into the region, it was not a barrier to the flow of ideas or technologies, as foreign innovations, such as dairy pastoralism and wheat and millet agriculture, came to form the basis of the Bronze Age Tarim economies. Woollen fabrics, horns and bones of cattle, sheep and goats, livestock manure, and milk and kefir-like dairy products have been recovered from the upper layers of the Xiaohe and Gumugou cemeteries, as have wheat and millet seeds and bundles of Ephedra twigs. Famously, many of the mummies dating to 1650–1450 BC were even buried with lumps of cheese. However, until now it has not been clear whether this pastoralist lifestyle also characterized the earliest layers at Xiaohe.

    Yup, they were already living off cow, sheep and goat milk products (despite not making any lactase). Ephedra = Mormon tea = soma/haoma as already used in the BMAC.

    The Tarim mummies’ so-called Western physical features are probably due to their connection to the Pleistocene ANE gene pool, and their extreme genetic isolation differs from the EBA Dzungarian, IAMC and Chemurchek populations, who experienced substantial genetic interactions with the nearby populations mirroring their cultural links, pointing towards a role of extreme environments as a barrier to human migration.

    In contrast to their marked genetic isolation, however, the populations of the Xiaohe horizon were culturally cosmopolitan, incorporating diverse economic elements and technologies with far-flung origins. They made cheese from ruminant milk using a kefir-like fermentation, perhaps learned from descendants of the Afanasievo, and they cultivated wheat, barley and millet, crops that were originally domesticated in the Near East and northern China and which were introduced into Xinjiang no earlier than 3500 BC, probably via their IAMC neighbours. [IAMC = Inner Asian Mountain Corridor, basically the Tian Shan.] They buried their dead with Ephedra twigs in a style reminiscent of the BMAC oasis cultures of Central Asia, and they also developed distinctive cultural elements not found among other cultures in Xinjiang or elsewhere, such as boat-shaped wooden coffins covered with cattle hides and marked by timber poles or oars, as well as an apparent preference for woven baskets over pottery. Considering these findings together, it appears that the tightknit population that founded the Xiaohe horizon were well aware of different technologies and cultures outside the Tarim Basin and that they developed their unique culture in response to the extreme challenges of the Taklamakan Desert and its lush and fertile riverine oases.

    This study illuminates in detail the origins of the Bronze Age human populations in the Dzungarian and Tarim basins of Xinjiang. Notably, our results support no hypothesis involving substantial human migration from steppe or mountain agropastoralists for the origin of the Bronze Age Tarim mummies, but rather we find that the Tarim mummies represent a culturally cosmopolitan but genetically isolated autochthonous population. This finding is consistent with earlier arguments that the IAMC served as a geographic corridor and vector for regional cultural interaction that connected disparate populations from the fourth to the second millennium BC. While the arrival and admixture of Afanasievo populations in the Dzungarian Basin of northern Xinjiang around 3000 BC may have plausibly introduced Indo-European languages to the region, the material culture and genetic profile of the Tarim mummies from around 2100 BC onwards call into question simplistic assumptions about the link between genetics, culture and language and leave unanswered the question of whether the Bronze Age Tarim populations spoke a form of proto-Tocharian. Future archaeological and palaeogenomic research on subsequent Tarim Basin populations—and most importantly, studies of the sites and periods where first millennium ad Tocharian texts have been recovered—are necessary to understand the later population history of the Tarim Basin. Finally, the palaeogenomic characterization of the Tarim mummies has unexpectedly revealed one of the few known Holocene-era genetic descendant populations of the once widespread Pleistocene ANE ancestry profile. The Tarim mummy genomes thus provide a critical reference point for genetically modelling Holocene-era populations and reconstructing the population history of Asia.

  39. Amazing stuff — early history is getting progressively rewritten.

  40. David Marjanović says

    We’ve very quickly come a long way from nos ancêtres les Gaulois (or, farther east, from pre-Celtic “Illyrians” who “died out” as my sister was still taught). I envy the generation of current kindergarten kids the prehistory chapters they’ll have in their history schoolbooks.

  41. As they will envy the kids in a few decades…

    Even with all the new data pouring in, I fear that there’s no sign yet of a coherent picture crystallizing. It seems like every new discovery scrambles the history afresh. I fear that we’re far off indeed from being able to codify much at the level of children’s schoolbooks.

  42. But that’s good! It’s better to have a complicated, unclear picture than a simple-minded fairy tale.

  43. Of course, but it’s nice to be at a point where we know that we at least have an approximation to what’s going on. It feels like we are not even to the “let’s start filling in the details” stage.

  44. If we taught kids in the kindergarten the real prehistory, I imagine it would look like this.

    Our great-great-great…..grandpa came from the east and raped our great-great-great…..grandma and thus our great and glorious nation was born.

    But not yet! Because another two centuries passed and our great-great-great…..grandpa came from the east and raped our great-great-great…..grandma and thus our great and glorious nation was born again.

    And so on for five thousand years.

    That’s literally what these papers amount to.

  45. David Eddyshaw says

    Alternatively, great-great-great .. grandma may just have preferred men with chariots to pedestrians.

  46. You are absolutely right.

    That’s exactly what would be written in the version for kindergarten kids.

  47. David Marjanović says

    Even with all the new data pouring in, I fear that there’s no sign yet of a coherent picture crystallizing. It seems like every new discovery scrambles the history afresh.

    The impression I’m getting is that we’ve been having a coherent picture for a few years now. Every new discovery complicates it considerably, but generally in ways that make more sense than the simplicity we had before.

  48. Ah, so this thread is where it is more appropriate to discuss these sorts of discoveries. I posted about the Tarim discovery in the Tocharian thread but it wasn’t really noticed there
    https://languagehat.com/tocharian/#comment-4311399

    preferred men with chariots

    sometimes it’s the opposite disconnect between the conventionally accepted history narrative and what the genes say (the rape was always posited but something different must have happened in reality). For example, one of the genes I spent decades with is BRCA1, a scourge of early-age breast and ovarian cancers. The Ashkenazi Jewish population has two common mutations in it, one with venerable, likely Middle Eastern roots, another much younger in origin and shared by all peoples of Baltic / Eastern European origins, from Denmark to Greece. One can follow the history of the mutation by checking what else in its neighborhood is passed along from parent to child, and building a phylogenetic tree of these patterns. The root of the tree is buried in the post-Roman Great Migrations age somewhere in the Denmark / South Sweden area. The Ashkenazi sub-branch is relatively young, and grows from a larger Polish / Kresy branch, splitting off in the XVII century, give or take a century or two.

    It’s been known for over 15 years, and people immediately jumped to an interpretation: this genetic scourge must be a lasting legacy of the Chmielnicki rapes. Except the more recent studies show that the massive infusion of East Slavic DNA into the Ashkenazi Jewish population was female-mediated, and took place largely in the 1500s as small founder groups of the Ashkenazim moved to Mazovia and on to the Lithuanian Grand Duchy. Most parsimoniously, it’s the pioneering migrants taking local wives, in a pattern repeated by so many peoples in so many places and circumstances across the globe. Here goes the rape story…

  49. I imagine that in those days a Jewish man could marry a gentile woman without her ritualistically converting, and it would be overlooked, and a few generations later none would be the wiser.

  50. Dmitry Pruss says

    It is safer to assume that conversions did occur, but the process might have differed in some important details from today’s extremely high bar. The rabbinical authorities, including some even in Israel, already consider mitochondrial DNA tests when making halachic decisions. Their favorite mtDNA haplotypes are the most common among the Ashkenazi Jews but are ultimately of Western Mediterranean origins, rather than from the Middle East. So these “authentically Jewish” mtDNA lineages are undoubtedly of a female convert origin too, only from an earlier era. By extension, the additional Eastern Europe-specific mtDNA types ought to be considered convert too. Since they’ve been added to the population DNA more recently, they can’t be used to tell Jewish vs. gentile maternal lines apart, but it’s the only difference.

    Very few Ashkenazi maternal lines can be traced by DNA back to the Middle East. Only the paternal lines lead there with few exceptions.

  51. Could those W. Mediterranean lineages be Roman/Italian?

  52. embroidering on @Y:
    /hops on hobby horse/

    i don’t think it’s a matter of “overlooking” – i think it’s part of a very pervasive pattern!

    to my eye, data like this helps strengthen the argument that jewish communities that emerged in earlier periods did so in quite similar ways to the more recently-established ones that we know a great deal about (from the abayudaya back to the beta israel). which is to say, primarily through non-institutional affiliation, in many cases with the presence of a small number of jewish people (mostly men) from other places, but not necessarily any (as with the abayudaya). those communities have then had varying degrees of harmonization of some parts of their practice to neighboring or influential older communities, often in many waves over time.

    we know (from daniel boyarin’s research, among others) that the different gender systems and dynamics of gendered power in jewish communities made them desirable points of affiliation for women, in particular, in the period when rabbinic jewishness was emerging. that seems likely to have also been true in the later stages of the christianization of eastern europe (which came especially late to the baltic, of course) – though i’d guess that as in the red sea basin and elsewhere, the chance to at least partially opt out of the structuring conflicts between empires and actively proselytizing religions was a bigger factor.

    this process of ethnogenesis through affiliation is also, i’d say, the easiest way to explain why jewish communities not only look like their neighbors but generally* speak the same languages, or ones that have only recently diverged, with minimal substrates, and why linguistic and minhag/nusakh boundaries line up so precisely. if there were in fact an ancient-greek-style colonial process, you’d expect to see a patchwork of reappearing languages (or at least substrates) and minhogim based on where the founding populations came from. and we don’t, except in some exceptional cases – either communities that remained quite small (the sefardi community in madras; the baghdadi community that began in surat; &c) or were established through mass displacements or migrations (the sefardi diaspora; the yiddish emigration wave) rather than settlement by small core groups.

    .
    * yiddish is the most confusing exception, though the 19thC evidence of 18thC communities speaking something slavic is suggestive about the date that it became the dominant jewish language in eastern europe. and, more importantly, we have very little evidence of what anyone but rabbis and other wealthy men who dealt with the state authorities spoke. and those guys were/are a highly endogamous elite class – one that does seem to have been established by migrants and is often explicitly contrasted with the majority of yiddish jews. the other exceptions (judezmo/ladino being the main one) all have pretty clear explanations.

  53. Could those W. Mediterranean lineages be Roman/Italian?

    Possibly. There is no 100% certain answer, because ancient DNA from the region isn’t sampled well enough, and because the Ashkenazi founder effects may have enriched some previously rare lineages just by chance, potentially reducing the odds of finding similar mtDNAs in the ancient samples.

    A classic paper tries to stay vague
    https://www.nature.com/articles/ncomms3543

    On the up-to-date trees combining literature and user-submitted DNAs, it looks like the roots of the lineages date back to Roman times, not later than II c., but without any ancient DNAs in the comparisons. One example:
    https://yfull.com/mtree/K1a1b1a/

  54. @rozele interesting, 1700s and 1800s Slavic usage evidence, can you please elaborate? We discussed 1600s about a year ago here, it sounds like a very exaggerated interpretation of a quote of a prescriptivist rabbi…

    https://languagehat.com/lost-yiddish-words/#comment-4049403

    There is talk about even earlier Canaanitic but it seems to be smoke without a fire too

  55. David Marjanović says

    Oh, yes, I overlooked the Tocharian thread.

    we know (from daniel boyarin’s research, among others) that the different gender systems and dynamics of gendered power in jewish communities made them desirable points of affiliation for women, in particular, in the period when rabbinic jewishness was emerging.

    Intriguing. Could you elaborate?

  56. rozele, I join Dmitry Pruss in asking for more details. How is it possible that lower-class Jews in the Russian Empire spoke Yiddish in the 19th century, but a Slavic language before that? This makes no sense to me, which is not the same as being wrong, of course.

  57. i’ll see how much i can post tonight, and get to anything that’s left later!

    first, @DM:

    all of this adapted from daniel boyarin [i had thought Border Lines was mainly what i’m remembering, but i looked at my copy and it seems to be something else – i’ll see if i can figure it out]…

    rabbinism emerges during the centuries on either side of christian year 0 – first as a countercurrent to temple-centered judean ritual practice, and then as its successor – through the emergence of the studyhouse/synagogue as an institution, the development of the texts that now get grouped as the talmud, and the decreasing power of the hereditary temple priesthood and monarchy. all of this largely* under roman hegemony, which is to say within a society whose gendered structures (which become christianized to form the core of the contemporary euro/colonial gender system) leave barely any space for women to have a public role, much less economic or institutional power.

    rabbinism develops a whole other gender system [this is boyarin’s Unheroic Conduct and Carnal Israel], which is equally binary and committed to privileging men, but very different in content. one of those differences is an openness to women’s active participation in economic and institutional life. if i remember right, this shows up in the early synagogue epigraphy, as well as being quite clear from the polemical writing by writers affiliated with both of the gradually disentangling christian and (rabbinist) jewish movements.

    during this same period, there’s a wave of affiliation with what we can start to see as jewishness, partly enabled by the way that rabbinism makes it exist as something other than “what judeans do”. there’s enough of it that the rabbis make a ritual for it, and formally decide (though arguments last into the early modern period**) that by affiliating, a person becomes a descendant of abraham just like any other jew [this is s.j.d. cohen]. that wave is, to my eye, what makes something we can properly call jewishness out of the constellation of judean emigrant communities.

    it’s not much of a stretch to see these two processes as connected. if the question is “why become jewish?”, the answer has to be the things distinguishing jews from other segments of roman society. and one of the big ones is their gender system, and the greater space of possibility it offered women.

    that analysis is reinforced, i think, by the way that the same dynamic played into the expansion of early christianity through women’s affiliation [this i think is well-established]. i won’t even glance at the whole question of how to even distinguish that from the expansion of jewishness in the same period – the point is that they’re largely one and the same. the way paul and his successors combined attacks on women’s influence with an insistence on the non-jewishness of christianity shows quite clearly that they understood the two to be related.

    * i know less about what’s up in the persian imperial zone, the other main center of rabbinic jewishness’ emergence. i’m not gonna open that can of worms, except to say that i think it’s arguable that the rabbis’ ideological decision to present themselves as the heirs of the jerusalem temple put the emphasis on the roman sphere even as the mesopotamian rabbis became authoritative beyond the persian empire.

    ** relevantly, once they exist, rhineland and then eastern european rabbis are very consistently on the ‘pro-convert’ side in their decisions, in contrast to iberian and north african rabbis.

  58. @DP, @DO:

    this is a whole kettle of worms, and nobody has any decent answers. what’s clear is that the conventional explanations for (as of the 19thC) both the presence in eastern europe of the world’s biggest jewish population and its use of a germanic language don’t hold up. there were no mass migrations to the east; weinreich’s rhineland hypothesis doesn’t hold up to scrutiny; western yiddish probably isn’t monogenetic (though the whole family of yiddishes probably is)*; and so on and so forth.

    and, even worse, the jewishness of the khazars is both historically unclear and probably irrelevant!

    but: the piece of the puzzle i was referring to is this:

    R. Isaac Ber Levinsohn (1901, 33–34 note 2; from Hebrew), who lived from 1788 to 1860, was also of the opinion that Russian Jews, in the first instance, spoke Russian:

    “and our elderly told us that the Jews, a number of generations before us, only spoke the language of this Russia in these districts [Volhynia, Podolia, Kiev, and the other districts], and that the Yiddish that we now speak had not been disseminated among all the Jews who lived in these districts, and they also told me, on behalf of […] (Czacki) […] that some hundreds of years ago, the Jews in these districts said their prayers in Polish and not in the holy language as they are used to do nowadays, and this as proof that their language was Polish or Russian”

    this is jits van straten in his tendentious The Origin of Ashkenazi Jewry, whose analysis i wouldn’t rely on, but who i’ve got no reason to distrust on his sources (he’s no paul wexler). levinsohn was an anti-yiddishist enlightener, so we probably have this tidbit thanks to his excitement about finding evidence of eastern european jews using something he could consider a “real language”. but there’s no reason for us to think (as van straten seems to) that the language his informants were talking about was what we’d call russian. if “this russian” is a decent translation of levinsohn’s hebrew, it seems much more likely to be referring to whatever the local slavic vernacular(s) were at the time than to any state language.

    but what we can glean is that at the edge of living mid-19thC memory, the jews of a core region of eastern europe were speaking something slavic rather than yiddish, and that some time before that they had also prayed in a slavic language**.

    this harmonizes pretty well with what alexander beider writes: “The Yiddish literature of Eastern Europe known to us dates from the 16th century only and comes from Poland. No Yiddish publication from the territories of modern Ukraine, Belarus, or Lithuania is known even for the 17th century.” [Contested Origins of Eastern European Jewry: Clues from History, Linguistics, and Onomastics – in Avotaynu 33:2, 2017]

    and even for poland, the yiddish material is quite sparse into the 17thC. one of the earliest eastern yiddish texts – and possibly the earliest printed one by well over a century!*** – is a 1613 refue / remedy book, seyfer derekh ets ha-khayim, that ewa geller says shows that its writer was a fluent polish speaker, while “taytsh, on the other hand, seems to have been an acquired language for the author”.

    i’ll leave it at that, with no pretense of being able to make it all make sense!

    .
    * i’m trusting alexis manaster ramer on this, partly because he actually talks about methodology in ways that make sense.

    ** the time period here is uncertain; oral testimony of “some hundreds of years” could mean anything past living memory. and what exactly “said their prayers” refers to here is an interesting question. it could be anything from synagogue liturgy – which seems quite unlikely, though not impossible given how much everyone seems to agree that there weren’t many rabbis in eastern europe – to some or all of the many genres that were later mainly performed in yiddish.

    *** i’m not gonna go looking further, but geller’s article cites another refue-bukh from 1790 as the previous “first literary document of Eastern-Yiddish”.

  59. David Marjanović says

    Interesting indeed, thanks!

    western yiddish probably isn’t monogenetic (though the whole family of yiddishes probably is)

    I don’t get this part, though: how can the whole family be monogenetic when its parts aren’t?

  60. Interesting indeed, thanks!

    Seconded!

    this is jits van straten in his tendentious The Origin of Ashkenazi Jewry

    I copyedited that book! (He thanks me nicely in the acks.) I too found it tendentious and wondered how much of it to take seriously, but he’s a great guy (and paid me promptly).

  61. PlasticPaddy says

    @rozele, dm, dp, do
    What I am missing here is what different authors mean by “said their prayers”, “spoke the language”, etc. These are multilingual communities with exogamy and transnational or even transcontinental networks involving both Jews and non-Jews. So a significant proportion of people might have been code-switching and what language they “spoke” or “said their prayers” in might depend on the individual and the context. So could Yiddish be a sort of lingua franca adopted over time and in parallel with the rise of nations in Europe?

  62. So could Yiddish be a sort of lingua franca adopted over time and in parallel with the rise of nations in Europe?

    That’s what it’s sounding like, and it’s an exciting idea (to me, anyway).

  63. Since Ashkenazi Jews are of West German origin (there can be no doubt about it), it means that they originally spoke some other West German dialect (not Yiddish).

    They lost it within a few generations after migrating to the Grand Duchy of Lithuania in the 15th century, switched presumably to some version of Old Ruthenian and then reacquired another West German dialect, which became Yiddish sometime in the 18th century.

    Or alternatively (and perhaps more plausibly) this is what happened to some groups of Lithuanian Jews, not all of them.

    So some isolated groups of Jews who lost Yiddish, spoke Ruthenian and even prayed in Slavic were later reintegrated back to the Yiddish-speaking majority.

    Anyway, sounds very interesting.

  64. January First-of-May says

    I sadly don’t recall the details – I heard about it two or three years ago – but supposedly one Eastern European… either language or highly divergent dialect… is attested from a single word list (and/or a few sentences), written in a notebook found by a Soviet teenager in the 1970s, which was then reportedly thrown away by his mother, but fortunately some parts of the text had been copied.

    There’s a lot we don’t (and, often, can’t) really know about the linguistic landscape of Eastern Europe even as recently as the 19th century, never mind the 18th. So much just either didn’t happen to be written down, or was lost later.
    I guess in a lot of cases there’s always the hope for some accidental further discovery…

  65. @DM:

    quickly as i run to work:

    manaster ramer argues that all yiddishes do share a common origin, but that the major/earliest historical splitting is not between the varieties that have been clustered (on reasonable linguistic grounds) as “western yiddish” and “eastern yiddish” – roughly west and east of the oder – but between varieties east and west of the elbe (or a line west of it). so the eastern dialects of “western yiddish” are more closely related to the dialects of “eastern yiddish” than they are to the western dialects of “western yiddish”.

    here’s the paper.

    this matters more than it may seem because the big cultural and community-identity division in central & eastern european jewry runs along the elbe, and very much not the oder. proverbially, minhag ashkenaz ends, and minhag poyln begins, at the Dammtor in hamburg. one of the big implicit arguments for the idea that eastern european jews can be considered a subset of german jewry – i.e. as “ashkenazim”, despite having a distinct minhag and nusakh from the rhineland-centered minhag ashkenaz – is the idea that even if yiddish isn’t a pure dialect continuum, its western branch includes a sizeable section of poyln along with ashkenaz. if manaster ramer’s right, that simply doesn’t hold up.

    and that speaks to SFReader’s proposals, too. pace SFReader, there’s metric tons of doubt about the rhenish hypothesis, and always has been. dovid katz, alexander beider, and others have pointed out the shakiness in the – very few – concrete arguments weinreich made to back up his assertions, and i don’t know that anyone has actually tried to make a well-grounded case for it since (as opposed to simply using uncle max’s name as an amulet). but ultimately, how much that matters depends on what we’re trying to explain. west-of-the-elbe yiddish – the yiddish of minhag ashkenaz – may indeed have a rhenish center point. but that doesn’t say much of anything meaningful about the history of east-of-the-elbe yiddish – the yiddish of minhag poyln – which is, after all, the only piece that’s confusing! west of the elbe, it makes sense for jewish communities to speak a germanic language; east of the elbe, not so much, since (again pace SFReader) there’s no evidence for mass migration east.

    (the question of when and why eastern european jews started to be talked about as a subset of german jews (as “ashkenazim”) is its own whole can of worms – all i’ll say here is that my bronx-born grandfather’s line on his jewish identity was “whaddya mean, ashkenazi? i’m a galitsianer!”)

  66. rozele, thanks for the detailed explanation, and good for your grandfather for being a proud galitzaner (usually not the most envied Jewish subsubidentity, but I remember another proud galitzaner in the movie Munich denouncing a “yekke potz”). Even if the gradient hypothesis (from the Rhine eastward) is correct, there is no reason for religious and language boundaries to coincide.

  67. The spread of Yiddish among Central and Eastern European Jews does seem oddly similar to the nearly contemporaneous spread of Ladino (A.K.A Judeo-Spanish, A.K.A. Judezmo) among Jews of the Mediterranean world and the Balkans: and I do recall one monograph on a variety of Ladino which made a strong case that Ladino had spread, in hellenophone Europe, at the expense of whatever variety/varieties of Greek Jews of the region had originally spoken. The discussion upthread on Yiddish having possibly spread, among Jews, at the expense of Slavic in Russia and Poland reminded me of this, and more broadly makes me suspect that the dynamics of Ladino and Yiddish language spread may have more than a few points in common.

  68. Well, in the case of the Greek Jews and the spread of Ladino, there’s a clear case of swamping: the number of Jews in present-day Greece and surrounding areas immediately prior to the 1492 expulsion was tiny compared to the exiled newcomers.

    On the other hand, rozele says there was no large-scale eastward migration of Yiddish speakers.

  69. Trond Engen says

    Dmitry: Ah, so this thread is where it is more appropriate to discuss these sorts of discoveries. I posted about the Tarim discovery in the Tocharian thread but it wasn’t really noticed there

    I’ve very much noticed it, but I haven’t had time to dive into it. Will do.

    Since these were ANE people related to the Botai, here’s a speculation on Botai and Mongolia in what for a while was the horse thread. The rest of the discussion needs updating in light of the new dating of riding.

    (LanguageHat: Where everything is discussed but never under the right headline.)

  70. I think there are genetic studies which prove West German origin of all Ashkenazi Jews.

    It’s also perfectly compatible with the idea of no mass migration eastwards – all the millions of Eastern European Jews are descendants of a very small original founder population, numbering in the low hundreds in the 14th century.

    Kind of like Quebec, really.

    IIRC, all 9 or 10 million Franco-Canadians and Franco-Americans are descendants of about 800 young French women sent by the king to Canada between 1663 and 1673.
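
    A rough sanity check on that arithmetic, as a minimal sketch rather than a demographic model: it takes the 800 founders and the 10 million descendants from the comment at face value, and assumes a span of about 350 years (that span, and the function names, are illustrative choices, not figures from any study).

```python
import math


def doublings_needed(founders: float, descendants: float) -> float:
    """Number of population doublings needed to grow from founders to descendants."""
    return math.log2(descendants / founders)


def implied_doubling_time(founders: float, descendants: float, years: float) -> float:
    """Average doubling time (in years) implied by that growth over the given span."""
    return years / doublings_needed(founders, descendants)


# Figures quoted in the comment; YEARS is an assumed round number.
FOUNDERS = 800
DESCENDANTS = 10_000_000
YEARS = 350

if __name__ == "__main__":
    d = doublings_needed(FOUNDERS, DESCENDANTS)
    t = implied_doubling_time(FOUNDERS, DESCENDANTS, YEARS)
    print(f"doublings needed: {d:.1f}")
    print(f"implied doubling time: {t:.1f} years")
```

    Under those assumptions the growth needs somewhat under 14 doublings, or one doubling roughly every generation, so a tiny founder group and no later mass migration are at least arithmetically compatible. The sketch ignores later immigration and admixture entirely.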

  71. Lars Mathiesen says

    Headlines? What headlines?

  72. David Marjanović says

    I’ve read it all now. The paper is from 1997. Manaster Ramer made three claims:

    1) the most conspicuous division in Yiddish is not between “Western” and “Eastern” (of the Oder, very roughly), but between “Westerly” and “Easterly” (of the Elbe or even farther west); Western is therefore paraphyletic with respect to Eastern.
    2) There’s little evidence on whether Westerly Yiddish is monophyletic, though he prefers to think so for the moment and calls for further research.
    3) Yiddish as a whole is monophyletic, as shown by “a very large number” of mostly lexicosemantic innovations with regard to German and Hebrew/Aramaic, notably including both a few Romance and a few Slavic words.

    Much more recently, Beider has agreed with 1) (unfortunately, as Manaster Ramer pointed out in a recent paper, keeping the terms “Western” & “Eastern” and applying them to Westerly & Easterly), but not with 3) or the stronger form of 2): Easterly is not descended from Westerly but has a separate origin (and I think he also offers the possibility that Westerly itself might be polyphyletic, but I haven’t read his book), and the innovations found all across Yiddish must have spread between the mutually intelligible Yiddishes later. I think we have to assume that for the Slavic ones anyway. In particular, it seems to me that khotsh(e) “although” has a specifically Polish form, not Czech.

    On 3), Manaster Ramer presented what he called a small sample “for lack of space (indeed, it seems to call for a monographic treatment).” Has anyone made one? He himself seems not to have – I can’t find any on his very, very long academia.edu page*. Of the three “in press” papers he promised, I can only find one, Yiddish origins: the Austro-Bavarian problem by him and Meyer Wolf (also from 1997). Not having read Beider’s book I don’t know what Beider says about any of the examples.

    But I can say a few things about both of Manaster Ramer’s papers anyway. 🙂

    In sections 3 and 4 of the first paper, Manaster Ramer talks about the megamerger of MHG ei, öu, ou** as /aː/, which has often been taken as the defining phonological feature of Western Yiddish. In section 4, he quotes Beranek (1961 apparently) as saying that it spread across Western Yiddish later, and that it became popular not only because it’s a simplification but also because it was present in “the language of” Frankfurt, the Sudetes and Austria. It’s not present in Austria. MHG ou is indeed /a/ (length is no longer phonemic), but ei is /a/ only in the east (e.g. Vienna) and along the western railway, and the (long unrounded) öu has merged not with the old ei, but with the new one, i.e. MHG î – as in Swiss and Alsatian Yiddish interestingly.

    “Beranek (1965:8, 10) reports scattered traces of relic forms with /eː/ for E4 [MHG ei] and similar traces of relicforms with /oː/ for O4 [MHG ou] in Yiddish dialects of the Rhineland, an area which in general, like all of Western Yiddish, has /a:/ in both cases.” Those are the Low and Central German outcomes.

    The report of [ç] in Swiss Yiddish is fascinating, because Swiss German lacks [ç] and even [x] entirely; it famously has all [χ] all the time, no allophony, like Eastern (or apparently all of Easterly) Yiddish.

    The report, in the same paragraph, of an actual separate /ç/ that does not depend on the preceding vowel in unspecified Westerly Yiddish varieties (“as is well known”; apparently up and down the Rhine) is downright mind-blowing.

    At the end of section 4 is the claim that there’s no “/ç/” in Bavarian. There isn’t, but Manaster Ramer doesn’t seem to use brackets anywhere in the section, and in footnote 13 he says more or less the same dialects “have no ich-Laut”; [ç] very much exists (in Bavarian other than Tyrolean, and in Swabian but not the rest of Alemannic) as the usual allophone (spectrum of allophones really).

    Near the end of section 5 there’s a paragraph on different reflexes of “2”. These could be complicated by the fact that German used to distinguish all three genders by different forms of that word – into the 17th or 18th century in written works (zween, zwo, zwei), and apparently still in a few dialects in Bavaria.

    Section 8 (iii) presents “a large set of additional characteristically Yiddish vocabulary whose etymologies and/or particular semantic, morphological, or phonological developments are specifically Yiddish”, starting with shmeykhlen “smile”. I wonder if that’s a mixture of schmeicheln “flatter” and śmiech- “smile”… Also, horkhen for “hear” isn’t as surprising as its membership in this list makes it seem. Bavarian generally has taken the contrast of sehen “see” (involuntarily) and schauen “look” (deliberately), consolidated it to the point that zusehen “watch” is considered a contradiction in terms and replaced by zuschauen, and extrapolated it to involuntary hören vs. deliberate horchen and zuhorchen. For shabeyse-nakht(s) I’m not sure what the unexpected innovation is supposed to be: “night” for “evening”? My grandma’s way of saying “in the evening” is literally “onto the night”.

    Footnote 40 talks about “the lengthening of vowels in certain stressed final syllables (i.e., usually monosyllables)” as a “possible Proto-Yiddish phonological development”. Guess what? All of High German has lengthened the vowels of monosyllabic words that end in less than two consonants.

    The other paper is off to an odd start: “the Austro-Bavarian dialects, which once covered not only Bavaria and Austria but also parts of the Czech lands and Hungary”? Hungary in the pre-1921 sense, yes, but barely since then.

    Section 2 repeats the gaffe about öu & ei.

    Section 4 teaches me about double diminutives in Bavarian. This is not sarcasm; if it’s in Schmeller’s grammar from 1821, it’s probably real and has merely been lost widely. It also looks like it explains a few things about the weird behavior of -/l̩/ and -/ɐl/ in my dialect (basically, words mostly take one or the other without rhyme or reason, and -/ɐl/ triggers phonological phenomena that only make phonetic sense for -/l̩/). – I didn’t know about the Westphalian island with ink for euch either, nor about 17th-century grammars of Yiddish (both mentioned in section 6), nor about the -s plurals in East Central German (section 8).

    Interesting that vowel length in Courland Yiddish was apparently lost very late (section 5), but not surprising: in Latvian, vowel length is srs bzns.

    The statement (still in section 5) that vowel length is not phonemic in Austro-Bavarian is accurate (except for a few South Bavarian dialects, where it remains phonemic when a consonant cluster follows – long consonants don’t count). But the description of the phonetic situation is really muddled, and not just because slashes are used instead of brackets throughout. The key to understanding is the fact that consonant length has been phonemic ever since Proto-Germanic at the very least, and that the affricates, even though they don’t participate in the inherited length contrast, are by default long in postvocalic position both by their historical origin (from long plosives) and by current phonetics.
    I’ll stick with the noggin example from the paper.
    It starts with More or Less Classical Latin cuppa [kʊpːa]. Either get that into Germanic very early (before Proto-Northwest Germanic; Varus could have done it), or wait for Early Romance to give it to early West Germanic***. Either way, once the Empire is over and the High German consonant shift dawns, it has become *[kʰopːʰ]. It shifts straightforwardly to *[kxopːf] (I’m omitting the tie bars for simplicity).
    Immediately thereafter, OHG passes a total ban on word-final long consonants. Affricates are treated as word-final as a unit, but the long part is the stop, not the fricative – so the stop is shortened. Result: *[kxopf] in the nominative singular.
    By whatever analogy, this word acquires an *-i plural with umlaut. Before OHG is even over, in the last third of the 11th century, the short rounded front vowels are already unrounded**** in Bavarian as I just learned, so, by the time MHG begins in the middle of the following century, the plural is *[kxepːfə].
    Then, the intervocalic three-way contrast of /d t tː/ – all voiceless, all unaspirated – breaks down in Central Bavarian just as MHG officially begins (and South Bavarian implements the “NHG diphthongization”). As a byproduct, all word-final postvocalic fortis consonants are shifted to their lenis counterparts if they have any. Affricates are again treated as word-final as a unit, but only the stop part is shifted, because the fricatives never participated in the fortis/lenis contrast.***** Result: *[kxob̥f] vs. unchanged *[kxepːfə].
    Toward the official end of MHG in the mid-14th century, apocope hits. (Or, perhaps more likely, a general loss of /ə/ in most environments.) At this point, the ban on word-final long consonants is lifted, and we get a phonemic contrast for word-final affricates: singular *[kxob̥f], plural *[kxepːf].******
    Finally, /kːx/ is deaffricated in Central Bavarian, and at some point the vowel lengthening of monosyllabic words ending in less than two consonant moras is introduced from Switzerland. /bf/ counts as one, /pːf/ as two or more, so the modern forms result: [koˑb̥f], [kepːf].
    And then Anatoly Liberman comes in, notices the vowel length but not the consonant length, and goes mad from the incomplete revelation, gibbering about “nominative lengthening” as an utterly eldritch morphological process. Apparently the same happened to Manaster Ramer & Wolf’s source, Жирмунский (1956), and maybe even to the mighty Kranzmayer. But I digress. 🙂
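
    The chronology above can be tabulated as a small sketch. The forms are copied from the steps as written (tie bars omitted, as there); the stage labels, the helper names and the crude test for a "word-final length contrast" are my own illustrative assumptions, not anything from the papers under discussion.

```python
LONG = "ː"  # IPA length mark; the half-long mark "ˑ" is a different character

# (stage label, singular, plural), taken from the steps described above.
STAGES = [
    ("start of MHG", "kxopf", "kxepːfə"),
    ("Central Bavarian lenition", "kxob̥f", "kxepːfə"),
    ("apocope", "kxob̥f", "kxepːf"),
    ("deaffrication and lengthening", "koˑb̥f", "kepːf"),
]


def word_final_length_contrast(singular: str, plural: str) -> bool:
    """Crude check: the plural no longer ends in schwa, and exactly one
    of the two forms contains a long consonant."""
    if plural.endswith("ə"):
        return False
    return (LONG in singular) != (LONG in plural)


def first_contrast(stages):
    """Label of the first stage at which singular and plural differ only
    by consonant length at the end of the word, or None."""
    for label, singular, plural in stages:
        if word_final_length_contrast(singular, plural):
            return label
    return None


if __name__ == "__main__":
    for label, singular, plural in STAGES:
        flag = word_final_length_contrast(singular, plural)
        print(f"{label:32} {singular:10} {plural:10} contrast={flag}")
    print("first contrast at:", first_contrast(STAGES))
```

    On this tabulation the contrast first appears at the apocope stage, which is the point the description makes: the length difference sits in the consonant, and the vowel lengthening in the singular only follows from it.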

    In section 7, about “word”-final “devoicing” and its absence, it is stated that “Yiddish generalized the devoiced variant in many nouns ending in -nt”. The examples given, hunt & hant, don’t illustrate this, because they had nt in OHG before final fortition showed up in MHG, coming quite regularly from *-nd as the English cognates – hound, hand – show. Either the Central Bavarian postvocalic lenition process also operated after nasals (it did not after /r/; coda /l/ no longer existed), or Bavarian participated in the weird and never mentioned shift of nt, lt to nd, ld that happened in the ancestry of Standard German, which looks like straightforward voicing except there’s still no voice in Bavarian. So, to find out whether “Yiddish generalized the devoiced variant”, we need to find words that had nd in OHG, from *nþ. But Jugend “youth” and Tugend “virtue” have exactly the kinds of abstract meanings that I’d expect Yiddish to lexify from Hebrew… did it? – Anyway, I’m still not sure whether the MHG final fortition ever covered all of Bavarian. The forms cited from Kranzmayer (1956) as showing fortition clearly do so, but they’re equally clearly South Bavarian, and that group has a wide east-west spread…

    The conclusion of the second paper seems fully compatible with Beider’s scenario: Easterly Yiddish starting as a Bohemian koiné composed in about equal parts of Bavarian and East Central German, Westerly Yiddish without any direct Bavarian contribution.

    * So many puns and variations on Wörter und Sachen
    ** Those are the 19th-century normalizations anyway; öu is probably spelled ou throughout the manuscripts or nearly so, and must at some point have been entirely front & rounded, so öü would have been a better choice of normalization.
    *** Actually, that way I could explain why the word is masculine: after WGmc had lost the m. nom. sg. *-z, *-a was suddenly a masculine ending. There’s that famous comb with kaba on it that shows the *-a- did not drop out first.
    **** What a fascinating text (transcribed in full in the article). Check out megi, leski, wirdiglihen, modern Standard möge, lösche, würdig-. I think the long ones & diphthongs are all still rounded, though; they seem to be spelled uo, o, oi, iu and maybe u.
    ***** If you don’t like that hypothesis, I can offer others, but this seems the simplest.
    ****** This becomes a productive pattern, e.g. Tisch – Tische [tɪˑʃ] – [tɪʃː], Fisch – Fische [fɪˑʃ] – [fɪʃː] have a specially created short /ʃ/ from what was still a consonant cluster /sk/ in OHG. Schuh – Schuhe, both [ʃʊɐ̯xː] with long /xː/, must have joined this pattern before leaving it again; otherwise both forms would have had to keep their etymologically short /x/. Note that extra consonants don’t get in the way, e.g. Strumpf – Strümpfe [ʃtʀʊmb̥f] – [ʃtʀɪmpːf].

  73. thanks so much for that deep read, DM!

    i don’t have the chops to even start evaluating the (quite friendly) disagreements between beider and manaster ramer on the technical merits (having read most of beider’s book, and a lot of what manaster ramer’s put on academia.edu). on a gut level, i lean away from beider on purely aesthetic grounds: he seems to both want a clear and fairly unitary answer and to think he’s found one, which makes me doubly skeptical.

    leyzer burko’s review of beider’s book may be useful for folks who want to keep on going down this rabbithole; it mostly deals with the bohemian and east franconian parts of the argument (with some criticism of beider’s depiction of bohemian as a fairly unified dialect).

    o, and off the top of my head i don’t know a cognate for Tugend, but standard yiddish does take ‘youth’ from the germanic side: יוגנט / yugnt (i don’t know how old the word is in the language, though); the hebrew/aramaic-origin words in that semantic zone that come to mind are all more specific (בתולה / bsule ‘maiden, virgin’ and such).

  74. David Marjanović says

    Oh, that article! It featured prominently in this thread a few years ago. 🙂 I commented – and conclude that Standard German and Easterly Yiddish may very well continue different registers of late medieval Prague German that went in different directions (geographic and otherwise).

  75. That’s an interesting thread. Dmitry Pruss said:

    In places where Jews were in regular contact with a German-speaking majority, their Yiddish naturally picked up features of the local characteristic

    Not just German. Occasionally, Slavic as well? Alexander Beider explained that the phonetics of the Litvak Yiddish are lifted from Polish dialects of Mazovia (where the Jews were especially few in numbers when they moved, or were expelled to, Lithuania in the 1500s). Since the DNA tells us that the early founders disproportionately took local wives, then the hypothesis may be that the outsize influence of the Mazovian Polish may have been mediated by intermarriage…

  76. >Oh that thread…

    More proof that you guys have forgotten more than I’ll ever know about this stuff, despite my interest.

    It is funny how many threads reach the “Oh, yeah, we did discuss that” moment. I think largely a sign of the range and depth of what is said, rather than oncoming senility. The latter, or at least a senescent inability to remember as large a volume of data, may be an issue for me though.

  77. David Eddyshaw says

    All topics are discussed somewhere by LH. However, despite JC’s best efforts, some of the conclusions cannot now be retrieved in polynomial time, and are inaccessible to all currently known search methodologies. Nevertheless, we believe that they exist.

  78. Credo quia inaccessibile.

  79. We are all monkeys with typewriters here.

  80. Auto-correct vastly reduces the number of monkeys you have to put in front of keyboards to have a shot at producing Shakespeare.

  81. Dmitry Pruss says

    Kind of like Quebec, really.

    IIRC, all 9 or 10 million Franco-Canadians and Franco-Americans are descendants of about 800 young French women sent by the king to Canada between 1663 and 1673.

    Yes, only more severe in terms of impact on genetic diversity. There are other parallels, like hereditary inequality of the reproductive success further reducing the effective bottleneck size (in Quebec it was linked to the size and location of the land holdings which critically affected reproductive success, while in Eastern Europe, it is also partly an economic-niche effect but also augmented by restrictions on who can marry / reside / have a business in town, and on the social class-selective military draft). Or like continuing migration of the clergy / the rabbis long after the initial migrant waves settled.

    Inland Finland is the Western world’s other example of severe historical founder effects.

    (Yes, I realized that we discussed Bohemia and Mazovia and the migrate-and-send-sons-further-afield modus operandi of the Litvak Jews before, but people already found it before I could even get to it)

    The specific problem with the Ashkenazi bottleneck effect is the historians’ tradition of ascribing it to the pogroms and community annihilation of the Middle Ages, crusades and plagues. It is a superficially plausible and kind of attractive hypothesis but it doesn’t survive the detailed testing. So the ethnogenesis of the Ashkenazim becomes somewhat less unique … also primarily driven by migrations in search of opportunity, like in so many other peoples, and less so by the persecution of a kind uniquely experienced by the Jews.

  82. One question I have – one of the points of Tim Snyder’s Bloodlands is that the systematic murder of Jews in the Holocaust happened in a particular region. And that in some ways, it was a consequence of the Holodomor, which had terrorized many of the same people, destabilized the same areas, and drawn its blood-price from some of the same populations.

    And even prior to this, at least in the US, Jewish immigration was heavily weighted towards those who lived in Germany and Western Europe.

    So the Jewish populations of today may not reflect the Jewish populations of the 19th century very well. Would studies of the genetics of Ashkenazi Jews that didn’t attempt to oversample people whose heritage was in Eastern Europe be prone to overstate the bottleneck?

    Also, in another reference to ugly points of history, the Quebecois bottleneck is also overstated, since Franco-Canadian heritage includes native women not counted in that figure.

  83. Ryan: And even prior to this, at least in the US, Jewish immigration was heavily weighted towards those who lived in Germany and Western Europe.

    I don’t think so. My recollection of the genetic situation is that most American Jewish ancestry comes from central and eastern Europe. American Jewish culture is definitely more based on eastern practices than western. The very trop used for reading the Torah is different between American synagogues and those in western Europe today; American Jews use the eastern European version.

    Moreover, while the Ukraine in the 1930s and 1940s was certainly a horrific place in multiple ways, calling the Shoah a consequence of the Holodomor is pretty absurd. It makes no sense ideologically or geographically (Poland having been the center of the Nazi extermination campaign and also the country that lost the largest number of Jews).

  84. Dmitry Pruss says

    at least in the US, Jewish immigration was heavily weighted towards those who lived in Germany and Western Europe

    Depends on what era you are talking about. Before the 1860s, it was generally true. After the Polish uprisings, immigration from Poland and Lithuania increased, Catholic but also Jewish. And starting from the late 1880s, a rather conscious effort of the Czarist government to reduce their Jewish population by promoting emigration created a near-deluge of Russian Empire Jews.

    But the “Jews who lived in Germany and Western Europe” in mid-XIX century were to a great extent of Polish extraction, a consequence of the Partitions which made masses of Polish Jews subjects of Prussia and Austro-Hungary. The Prussian partitioned area of Poland generated particularly strong migration currents. And from the areas further East, Eastern Belorussia Jews moved to Belgium and the Netherlands and Courland Jews, to Germany, supplanting the local communities.

    It’s true that the Holocaust and migrations left us with virtually no original regional Ashkenazi populations to study. There remain nearly-endogamous cultural groups to this day, but there is a well-known gap of misunderstanding between geneticists on the one hand, and linguists, historians and anthropologists on the other, and so the geneticists to this day sample the Ashkenazim by the modern country of origin or by continent, rather than by a more granular cultural and dialectal affiliation. One possible exception is the South African Jewish community which still closely corresponds to the Kovno Governorate Litvaks of yore, due to an accident of steamer line cartel policies of the XIX-early XX centuries.

  85. Sorry, Brett. One lesson for me is never to make asides when talking about the Holocaust, or I’m certain to sound fatuous. To the point of offensiveness. “To some degree a consequence” wasn’t central to my point, nor was it what I really meant. Of course Hitler, the Nazi Party and the willingness of too many Germans and others to commit genocide take overwhelming responsibility for the Holocaust.

    I was trying to allude to Tim Snyder’s point in Bloodlands, if I remember the argument properly, that both the Holodomor and the Soviet invasion after the Stalin Ribbentrop pact destroyed social and governmental structures before the Nazi onslaught, which in other places would diminish the ability of the Nazis to achieve total control. That in the rest of conquered Europe, the genocide had to contend with, not so much sympathy for Jews, though in some places that helped at the margins, but punctiliousness about what Germans could do to Jewish citizens of countries that remained sovereign, even if only nominally. Jews and others in the Bloodlands had effectively become stateless, and he argues that had a significant effect on whether large numbers of people survived. That was what I was trying to say with “to some degree a consequence.” That the brutalization of the region by the Holodomor had some impact on how many people were killed later.

    Naturally, I can’t find my copy of Bloodlands right now, and I imagine someone here knows the book and/or the facts better, and can explain how I’m still mangling it. But perhaps this does a bit more justice to Snyder than I did previously.

    But my main point in mentioning the Holodomor wasn’t that it had any impact on the Holocaust, but simply that many Ukrainian Jews were victims of the famine even before the genocide, and that this compounded the focused effect of the Holocaust in destroying particular Jewish communities.

  86. Ryan, most Ukrainian Jews lived in cities and towns, not in the villages. The Great Break and the famine decimated mainly villages. Far be it from me to praise the Soviet system, but I should note that in the USSR a good number of Jews were evacuated during the war and many men fought the Nazis, not just suffered from them (obviously, many suffered). It is also worth remembering that the occupied territories were under scorched-earth warfare on all sides.

    I am not sure how much trust one should put in the Wikipedia figures, but if they are true, Western European Jews fared much less badly (numerically speaking) than Eastern European, independent of the country. This speaks mainly to the relative restraint that Germans exercised in the West. The USSR, in its 1939 borders, did relatively less badly than other Eastern European countries, perhaps mostly because a lot of Jews lived in territories that were never occupied and some were evacuated.

  87. The USSR Jews who ended up in Nazi hands (either because they couldn’t flee, or because they couldn’t flee fast enough and were overrun by the German advance, or because they were taken prisoner) fared much worse than Polish or Western Jews. The rule of thumb is, everyone in this unfortunate category was dead, typically in 1941. A few women survived by white-passing, a few able-bodied men joined the guerrilla partisans, but these options were for a tiny fraction of one percent. There were no concentration camps out East, just bullets in the nearby ditches or quarries. In contrast, the mass extermination in Poland and Western Europe gained steam only by 1943, and some people survived in the camps, in ghettos, and in deportation queues. A minority, but not an infinitesimally tiny minority like in the USSR. There were more survivors overall in the USSR because people were called up or evacuated in an organized fashion or just fled ahead of the German advance. Also because a part of the USSR was in the Romanian occupation zone, which was about as bad as Poland, meaning that a small percentage survived. Unlike under the German occupation to the North.

    Snyder’s Bloodlands tried to bring more light to the Nazi atrocities outside of the better-known concentration camps, and it’s a commendable goal in itself, but the actual goal of the book is broader. It is to expose both Hitler’s and Stalin’s crimes and to elevate the idea of their moral (immoral) equivalence. This, in itself, might also be a worthy goal, but it really comes out weird. It begins to look like a regurgitated Nazi propaganda point that Stalin’s / Kaganovich’s / “alleged Jewish-Communist” crimes against Poland and Ukraine justified, or paved the way to, the Holocaust. If the book juxtaposes both regimes’ crimes in such a way that in the eyes of the readers, it blames the victims, then it’s a really bad way to narrate the history.

  88. X, I didn’t see Bloodlands as lending itself to a theory of blaming the victims, and I’m sorry if my initial cursory statement gave the impression I was doing so.

    This article doesn’t give the impression that the famine left Ukrainian Jews untouched. Nearly 1,000 Jewish dead on the streets of Kyiv in one month, numbers of dead and starving in various stetls, more than 1000,000 Jews who had been encouraged to become (well, suppressed into becoming) agricultural workers and suffered the same fate as their neighbors. https://www.academia.edu/43629946/The_Holodomor_and_Jews_in_Kyiv_and_Ukraine_An_Introduction_and_Observations_on_a_Neglected_Topic

    “Your mother died from starvation… Her last wish was that you, our only son, say Kadish for her.”

    It’s hard to see how it could be otherwise, with most Jews living in small towns dependent on the produce of the nearby collectives whose output was being sent elsewhere.

    The scale of death was different in towns, more so in cities, but I suspect the idea that Jews didn’t die in large numbers in the famine is largely a product of the anti-Semitism some adopted as a way to try to make sense of the calamity that rained down on the peasantry.

  89. Jews who had been encouraged to become (well, suppressed into becoming) agricultural workers

    Jewish agricultural settlements in the Ukraine are not Soviet invention, but an old Tsarist project dating back to early 19th century.

    It is often described as a failure, and it probably was, if we consider that the majority of Jews in Tsarist Russia remained town dwellers instead of becoming prosperous peasants as the Tsarist government wanted.

    But on the other hand – several hundred thousand Jewish farmers in the Ukraine on the eve of the Revolution ought to count for something.

    PS. I particularly liked an interesting linguistic detail about this experiment. The Tsarist officials understood that the Jews had no knowledge of farming, so several German farming families were allocated to each Jewish agricultural settlement to teach Jews how to farm. As the bureaucrats noted, the Jews and Germans understand each other since they speak “the same language”.

  90. Kuznetsov supposedly wrote that the Holodomor made the population of Kyiv so inured to death and suffering, that it dulled their reaction to the Babi Yar massacre.

  91. LH – it seems like Google no longer indexes this site? Or perhaps does it too infrequently? “Recently commented posts” page was out of commission and I tried finding this discussion as
    https://www.google.com/search?q=site%3Alanguagehat.com+beider+&biw=1366&bih=625&tbs=qdr%3Aw&sxsrf=AOaemvLQEXqvB08t39zmXDQ0ABr4Adx5Ug%3A1636653917282&ei=XVuNYYe4ELqyqtsP67ap8A8&oq=site%3Alanguagehat.com+beider+&gs_lcp=Cgdnd3Mtd2l6EANKBAhBGAFQ_RJYnh1gpCNoBHAAeACAAWOIAcQFkgEBOJgBAKABAcABAQ&sclient=gws-wiz&ved=0ahUKEwjH2u_68pD0AhU6mWoFHWtbCv4Q4dUDCA8&uact=5

    but no such luck.

    @Ryan – Holodomor (as opposed to the more general famine and starvation in the USSR in the same timeframe) is specifically understood as genocide against the Ukrainian people by enemies who are officially ethnically blank Soviet, but in the Nazi propaganda, there was no such ethnic ambiguity about the alleged masterminds of the genocide.

    Both Czarist and Stalinist Russia were of two minds about the Jewish agricultural activities. There were numerous rounds of eviction of the rural Jews. The earliest known to me was the mass eviction which uprooted my ancestors and killed many of their fellow Jews in the dead of the winter of 1825 near Vitebsk; the biggest known was a part of the “Temporary Statute” of 1883 (which, despite its name, never expired until 1917), which made another relative of mine sell his apple orchard in Podolia and, after a few destitute years, emigrate; and the latest one, which also affected a branch of the Vitebsk area family, happened right before the Revolution, and affected the Jewish veterans of the Russo-Japanese war of 1904-1905 who were granted residency privileges after coming home, only to be stripped of the right to live in the villages 10 years later. In Soviet times, collective / communal farming was promoted, but the better-off Jewish farmers were literally run off the land.

    The agricultural colonies of Kherson and Tavria Governorates of Southern Ukraine (both Jewish and German) are probably the best known because after generations of misery, they achieved lasting success and their population grew in numbers and influence. But most Jewish villagers were not Kherson colonists (in Podolia and on the North-Eastern fringes of the Pale, a huge percentage of the Jews lived in the villages, and in Polesye, they were colonists on the infertile government lands, although with a few exceptions, they didn’t till the land, but worked as blacksmiths, horse-dealers, fishermen, loggers, ash-burners, cart drivers, orchard-keepers etc., all of them being occupations where the head of the household could make a living without much reliance on the women and children in the household … because in the traditional family, the women weren’t expected to work outside at all, and the boys needed to study long hours).

  92. LH – it seems like Google no longer indexes this site?

    No, it’s keeping up the good work. It’s true it doesn’t find this thread if you search on “beider,” but it’s the second hit for “Yamnaya.”

  93. all of them being occupations where the head of the household could make a living without much reliance on the women and children in the household … because in the traditional family, the women weren’t expected to work outside at all, and the boys needed to study long hours

    How are these occupations less reliant on the family than farming? Surely some farming could be adapted to be a one-man job, albeit less productive?

  94. In general, the farther east they were, the more brutally the Nazis behaved. Some of the reasons for this seem to have been largely just contingent, but there was also a significant ideological component lying behind these geographical differences. Hitler proclaimed that the people in the East were Untermenschen. Slavs were worthy only of being enslaved, but the more Asiatic groups even further east were considered even lower. In contrast, however much the Nazis may have despised the French and considered them decadent, they never denied that France had a distinguished and culturally meaningful history.

    Of course, no ethnic or religious group, except maybe the Romany (the most authentically Aryan group in Europe, actually!), was considered as vile as the Jews, but the level of brutality of Nazi actions against the Jews in a particular region was often influenced by how brutal the Nazi regime was there more generally. The broader milieu affected the level of background barbarity to which both the local Jewish and non-Jewish populations were exposed. In largely democratic western Europe, where Jews were typically better integrated into the national culture and identity, the occupation regimes were generally less violent and more accepting of the existing local communities and power structures. There was at least a minimal degree of respect for the occupied peoples in places like France, Denmark, or the Netherlands and a concern that the non-Jewish populations of those countries would object more strongly to actions taken against their Jewish countrymen and countrywomen. In contrast, in the Soviet Union, the invading Germans viewed the existing people as fundamentally corrupt: politically, socially, and racially (with Jewish influence being perceived as practically omnipresent). So the Germans had few qualms about conducting mass executions in relatively public fashion, using Einsatzgruppen to machine-gun the local Jews and other undesirables, rather than the more secretive death camp system that took over most of the killing in 1942.

  95. Dmitry Pruss says

    some farming could be adapted to be a one-man job, albeit less productive?

    Never been a subsistence farmer in Belarus myself, but I imagine that with the poorly fertile soils and colder, wetter climates, it’s already “less productive”, and without literally throwing all people at work dawn to dusk when the weather allows making hay or harvesting, one risks certain death in winter. The government policies explicitly discouraged peasants’ children from attending schools because the boys’ labor was desperately required in the fields.

    Interestingly, Jewish farmers weren’t officially put in the specific sub-class of peasants, but instead, in the broadly equivalent subclass of “earth-tillers” (земледельцы rather than крестьяне), even though they didn’t plow the soil. The post-1883 “Temporary Statute” eviction paperwork is a golden trove of documentation on Jewish families and their economy, because the papers were filed with the police departments and ended up preserved much better than vital or tax records (everything related to persecution was so much better preserved in Russia!). So generally one can learn the occupations of all families, both those grandfathered under the statute and the evictees. Up in the very rural Nevel district, for example, nobody farmed. My relatives there, the Konsons, were village blacksmiths, burly light-haired people who picked equally strongly built brides, and who looked genuinely Slavic despite being, as the name suggests, real Kohanim. Some rural residents of Nevel held not-so-rural occupations, like cobblers or small traders, but most just occupied the genuinely rural but non-plowing economic niches. Another branch of rural Neveler relatives, the Neplokhs, were generally into anything about horses. Drove carts, traded horses. One police file investigated a Neplokh who improperly rented land from a Christian villager for a horse-pasture. Another police file investigated a suddenly-missing Neplokh who was never found, but was rumored to be on trial in the neighboring county for horse thievery. Not the classic “poor huddled masses” of the shtetls.

  96. Neplokh

    Not bad!

  97. Dmitry Pruss says

    Goodenough is a classic English surname

  98. Nekrasov of Neyelovo, Neurozhayka tozh follows a different approach.

  99. >Not the classic “poor huddled masses” of the shtetls.

    The following is an understanding I drew from The Golden Age Shtetl: A New History of Jewish Life in East Europe, by Yohanan Petrovsky-Shtern.

    Stetls were chartered trading sites, literally belonging to nobility or the state, protected by law, by monopoly profit, and by networking effects. Traditionally, peasants couldn’t break into a business like horse-trading, practically or legally. One could decide to adopt a town-based occupation without making the whole family work because much of the competition was locked out.

    >For example, the 1740 agreement between a magnate and the Zaslav (Iziaslav) Jewish communal elders outlined the key functions of the Jews, who would organize five annual fairs during Eastern Orthodox and Catholic holidays such as Spas (Savior) and St. Martin in the old part of the town, and another four, also on Christian holidays such as St. Peter and St. Virgin Mary, in the new part of the town, and still three more brief fairs in the town’s central square.

    This was both a right and a privilege. The community had to fulfill the terms or lose out. And they paid taxes, unlike peasants. But no one else could conduct trade in the stetl, and typically, that meant no one else within easy travel. A peasant from a village around Zaslav basically had to buy his tack, his horseshoes and his liquor from the stetl, or newly, the “mestechko”.

    Petrovsky-Shtern goes on to say that the “marriage of convenience between Jews and magnates was in reality not an equal partnership and sometimes took the form of humiliation, exploitation and abuse.” I’ll bet.

    But the privilege, the exclusion of peasants from such trade and industry, is certainly the answer to why the occupations of townsmen could give a life of leisure.

    He argues that the system began to break down after partition, because the new Russian bureaucrats didn’t feel the same tenderness and obligation towards the szlachta landlords.

  100. Dmitry Pruss says

    Petrovsky-Shtern captures a snapshot of the shtetle history between the end of the early period, when the Jewish business was largely limited to financing and customs collection, and the dissolution of the Jewish communal self-rule. His title has the words “Golden Age” for a reason. Those towns (some on government land, but many on magnate-owned lands) with their cozy hereditary market stall arrangements were quite different from the hamlets of the “huddled masses” of Lady Liberty, separated in time by two or more generations. Many opportunities disappeared in the interim. First, Eastward migration stopped. Then, migration to the South-East reached near-saturation as well. Finally, migration to the smaller hamlets which didn’t have a shtetle status was permanently closed by the “Temporary Statute”, locking most of the community in the same old and increasingly overcrowded places. The end of community self-rule opened up all opportunities for competition. Other traditional lines of business were decimated with the end of Polish magnate landownership, and with the construction of the railroads which undermined the traditional commodity-transportation and export businesses. And finally the system of skilled-artisan guilds was abolished too. In the end, if something was still guaranteed for life, it was the economic and legal uncertainty.

  101. We discussed Petrovsky-Shtern’s book in 2014.

  102. In August of 2014, to be specific. And I remember where I was when I was reading the book, at my in-laws’ dacha, which we always visited in August at the time. It must have been 2014, obviously prompted by the mention here.

    >migration to the South-East reached near-saturation as well. Finally, migration to the smaller hamlets which didn’t have a shtetle status was permanently closed.

    Dmitry, sure. But saturation is such an interesting word. Another way of describing saturation is that the monopolies were so lucrative that they had so many kids that there simply weren’t enough towns for them in all of Slavdom! Increasingly, it was hard to find a spot that ensured the wife and kids didn’t have to work, as you point out.

    At the time, at least some of my ancestors were killing Native Americans to make way for their kids. Or moving into territories recently cleansed. I’m not pointing fingers in any way that doesn’t ultimately lead to Algren’s captain, whose “finger of guilt, pointed so sternly for so long across the query-room blotter, had grown bored with it all at last and turned, capriciously, to touch the fibers of the dark gray muscle behind the captain’s light gray eyes.”

    But the myth-making must be as jarring to folks descended from the peasantry as a cowboy movie for those of a different heritage or a different sensibility.

  103. Dmitry Pruss says

    I am not sure what my decidedly non-monopolistic but sometimes cutthroat businesslike ancestors did to draw a parallel with the killers of the Native Americans or maybe those who enslaved and raped the Africans. Actually I AM sure that there are all sorts of different degrees of descendants’ guilt, but being a descendant of a severely persecuted & dehumanized minority goes a long way to alleviate guilty feelings of the sort. So I strongly suggest that you stop measuring your alleged ancestor myths against mine.

    But just to dispose of the idea of the old shtetle Jews being the same as hereditary monopoly merchants. Yes, the private towns’ community self-rule books repeatedly describe which family shall trade in what goods at which street. But it is a bias of ascertainment to conclude that this is what the community was doing. You see, the self-rule council (the Kahal) was an intermediary between the landowner and the taxpayers, and accordingly, it regulated the matters of commercial real estate. Because commerce meant taxes and real estate meant the town’s owner’s property. The Kahal also regulated public bathhouses and houses of worship, for exactly the same reason. It did not regulate any trade outside of the town owner’s properties, or employment of artisans and laborers. Most of the moneyed class’s income was from outside of the city street, typically in contract management (factorship), tax collection (otkup) and sales of agricultural commodities. One of my ancestral families, the Lapitskys, were among the richest merchants in such privately owned towns, Gorki & Romanovo, but their businesses weren’t shop fronts. It was the otkup collection & shipments of local and Ukrainian grain up the Dnieper, on horsecarts across the divide, and down the Dvina to the export markets in Riga. The patriarch of the family boasted of riding a carriage with 8 horses because the town’s owner had 6. A whole group of top merchants was involved in such activities covering the wide area, and in no way restricted by the Kahal. Like in nearby Shklov a group of merchants specialized in importing diamonds and jewelry & wholesaling it in big cities. It wasn’t a local storefront business, and therefore it wasn’t regulated in any special way. Then a bit lower rung of the merchant class was actually running local retail through the Kahal-controlled storefronts.
    A lot more Jews, as you can find from the Census (Revision) records, were designated skilled artisans of the guilds (цеховые) – bakers and brewers, cobblers and tailors, coppersmiths and blacksmiths and so on. Lots and lots of them compared to the small merchant class. But most of the town’s Jews were even lower on the social scale and belonged to the “plain” townsfolk classes. Laborers and porters, house painters and glass installers, cart drivers and bath workers (and many peddlers and traders in this same category) (and most rabbis and melameds and spectors belonged to the plain townsmen category as well) … none of these occupations had guilds, but they far outnumbered both the guild members and the merchants even early in the XIX c. (Before 1795, the censuses are in Polish and therefore much harder for me to read, but I read plenty of pre-Napoleonic ones). For Gorki there is actually a 1770s Census surviving in a 1784 Polish copy and annotated in Russian, and from it, you can see that a private town wasn’t this imaginable neat formation of shopfronts even before the “Golden Age” book. It’s just the Kahal specifically regulated the storefronts and that’s why we know so much about them.

    But in most of the occupations, the employment opportunities didn’t grow as fast as the population, so the sons had to diversify into new opportunities or move to the frontier. Nothing an American wouldn’t understand, I suppose.

  104. Basically, the Jewish population of the “Pale of Settlement” was the demographic equivalent of the Israeli Haredim today, who are currently at about 12.6% of Israel’s population, but double every 16 years, with probably the highest population growth rate in the world.

    This has to account for a lot of what happened there later.

    I wonder if anyone compared Ukraine-1919 with Rwanda-1994.

  105. Employment opportunities didn’t grow for peasants either. But they were locked out of competing for town jobs and barred from moving, at least on their own decision. They ate less and had fewer kids. The urban class in the region was, as you point out, communal. Though it was not literally hereditary, and people competed within the class, the communal aspect was enforced by law and social norms. While higher level businesses like shipment of agricultural surpluses were less regulated, the peasants still had to sell their surplus to the stetl. The export firms grew out of the strongest of the firms benefitting from protection in the stetl.

    I only made the post after repeated mentions of how oppressive it was to be in a shtetl. I do believe that. But not nearly so oppressive as to be in a village. That’s a truth that shouldn’t be erased.

    I make no comments about your direct ancestors as individuals. I mentioned my “ancestors” by way of saying there is oppression behind many stories. I was saying “people have done worse.” It seems strange to get anti-American insults precisely because I raised the fact that there are such horrors in American history. I also meant it broadly, my ancestors as a class. I know of no relative who fought on a frontier, nor even moved in very quickly afterwards. None of them lived anywhere that slavery was legal in the last 210 years, nor in areas where slavery was widespread in the last 280, where the trail goes dead. Arguably, they benefited indirectly from slavery elsewhere.

    The timeline of abolition across the US is not that different from the timeline of the emancipation of the serfs from Poland to the Russian empire.

    Those of my direct ancestors that weren’t European in the early 1700’s were all Pennsylvanians, to our knowledge. William Penn bought the land they moved into, and relations with natives were still good when they appear in the genealogies. Some moved into eastern Ohio two decades after the Battle of Fallen Timbers in western Ohio. That’s as close to a frontier as anyone ever got.

    Still, they benefitted indirectly from the later, pressured sale of Native lands in Pennsylvania, and from conquest elsewhere, just as others did from the conquest of Tatar and Muslim lands, as you seem to acknowledge. Opportunities South-east that were eventually saturated.

    As you know, the DNA stuff is littered with references to an inexplicable demographic miracle that needs some explanation. But it seems explicable.

    The stetl was an institution that oppressed the peasantry and advanced the townsmen, even as it served the interests of a higher class that routinely punched downward at the stetl, and was entangled with a lower, agricultural class that sometimes bit savagely upward. That doesn’t taint your ancestors any more than mine are tainted. Maybe less.

    I disagree pretty strongly that it “accounts for a lot of what happened later,” as was just posted above now that I’m about to post my comment. I don’t agree with the premise, much of it is at a remove of a century or more, and I find it an unfortunate way to refer to genocide.

    But the fact that later horrors occurred doesn’t make the truth of the stetl’s place in an oppressive society unspeakable. If anything, attempting to elide that history has fed the demon.

  106. Ryan, judging by the numbers there are about 45 million African Americans in the US and about 3.5 to 5 million Native Americans. This is very different from the situation in 1600. How can we explain this demographic miracle? Maybe African slaves and their descendants under the protection of European emigrants took all the opportunities for development from Native people? I hope no one would take this sort of argument seriously.

    Jews didn’t establish serfdom in Western Ukraine and there was plenty of oppression in Eastern Ukraine and in Great Russia where there were no Jews at all. Jews also didn’t decide that the extractive economy is the best way to organize life in Western Ukraine. And somehow people who denied Ukrainians the possibility of development are the Jews. This is completely and entirely absurd. In Eastern Ukraine and in Great Russia merchant classes were Ukrainian and Russian. Did it make oppression of the peasants any better?

  107. Arguably, they benefited indirectly from slavery elsewhere.

    Well, I absolutely benefit from labour of people who sewed my clothes elsewhere. (but they are not angry at me, because they do not see me).

  108. Is it possible to discuss it without distributing guilt/blame/responsibility?
    “Saturation” is a neutral term.
    “Serfs were oppressed too” is a valid point.
    The claims made above are fine, but their moral interpretations…

  109. PlasticPaddy says

    @drasvi
    It is hard to avoid tendentiousness in “X prepared the ground for Y” or “Group A was privileged in comparison with Group B” statements, because they lead to associations and linkages not justified by the facts. For example, “The provisions of the Treaty of Versailles prepared the ground for the rise of Fascism and associated authoritarian movements.” Some people could and did argue that the Versailles settlement inevitably led to, and provided justification for, various actions of Fascist agitators and governments. But I do not believe such an argument provides anything like a full explanation, only “window-dressing”. The other point is that inter-ethnic rivalry/conflict is not a dialectical process, leading to a synthesis😊. It is more like a fire that flares unpredictably, but never burns out.

  110. It seems strange to get anti-American insults precisely because I raised the fact that there are such horrors in American history.

    There were no anti-American insults that I could see; Dmitry Pruss said “I am not sure what my decidedly non-monopolistic but sometimes cutthroat businesslike ancestors did to draw a parallel with the killers of the Native Americans or maybe those who enslaved and raped the Africans,” and that was my reaction as well, even though I’m neither Jewish nor of Eastern European descent. It seems an unfortunate and unhelpful comparison.

  111. >And somehow people who denied Ukrainians the possibility of development are the Jews.

    I’m not trying to say that. I’m trying to make a smaller point, that a class or group that is deeded a legally protected, significant economic role denied to a much larger portion of that society is a privileged group. Not the most privileged group, nor the most responsible. I tried to make those points, but maybe not strongly enough. I did mention a number of times the ways the system bit back.

    I certainly don’t balk at saying, as D.O. asks me, that the merchant class beyond the Pale benefitted from the system that oppressed the serfs in the same way. No, that didn’t make the oppression of peasants any better.

    I don’t think any of those things undermine the point I was making, which I continue to think is fair, and gets lost sometimes. But again, it’s a small point, in a whole world of oppression and privilege. Perhaps to belabor it is to make it loom larger, when it really belongs against the backdrop of a thousand other groups and classes that have benefitted historically from systems of oppression, so I’ll drop it now.

  112. Dmitry Pruss says

    no anti-American insults

    Thanks LH, I was starting to get worried that I sounded offensive when I only wanted to sound perplexed.

    even though I’m neither Jewish nor of Eastern European descent.

    I suspect that most people of Western descent also have ancestors from the members of Medieval and post-Medieval trade corporations and guilds, which all regulated and restricted who can do business where and how, and typically passed it from father to son. Sometimes this self organization of businessmen and artisans achieved wonders of economic and technology development, but perhaps just as often, the micromanaging regulation stifled innovation and caused monopoly stagnation… but that’s simply how they thought the economic activity must self organize. I imagine that they both took some lessons from, and simultaneously needed to counterbalance, the feudal power, and that’s how trade corporations became what they were. Out East, these forms might have lingered into obsolescence a bit longer, but there isn’t really anything narrowly specific to the shtetle in these types of business self-regulation. In fact some of their vestiges still linger today in the US, in commercial zoning and licensing and in a few officially sanctioned guilds like the electricians.

  113. Yeah, and all that stuff is so complex and confusing that I tend to ignore it, even though I know I shouldn’t.

  114. >worried that I sounded offensive when I only wanted to sound perplexed.

    On first read, the passage about rapes of African Americans felt like “my ancestors were much better than yours” particularly since I doubt that my ancestors had much to do with any slaves. But that is not a fair assessment of what you wrote. And I imagine reading my earlier post felt the same way to you, though that was not really what I was saying.

    A somewhat different point – most of us experience privilege not the way Cortez and his men did, or the Tsar and his boyars, or Thomas Jefferson on his hilltop, but the way shtetl residents and Pennsylvanians of the late 18th century did, or as drasvi mentions in his line about inexpensive clothing, mostly as indirect benefits that they/we may not have any useful way to unwind, in a setting in which other things press down on us, and the sacrifices we might make seem to sacrifice far more, individually, than the amount of collective redress they add up to.

  115. Dmitry Pruss says

    I tend to ignore it, even though I know I shouldn’t

    Not sure then if I should add more confusing historical details 🙂 but the basic concept should be simple, I believe. Pretty much all the guild rules date back to the X c. Magdeburg Rights. These town self-rule principles gave the merchants and artisans control of their trades, regulating who can do what and how – including even when and how the junior guild members can marry. The self-regulation system included only Catholics. The cities and the Crown soon realized that regulation of the Jewish business was important too, but they couldn’t possibly invite the Jews into the Councils, so they built a parallel regulation system, where the Jews were the Royal chattel and negotiated their trade rules directly with the Crown. By XIV c. Poland, devastated by wars, started inviting Germans and the Jews to establish or rebuild towns, under the Magdeburg Law with the same royal-slaves Jewish addenda. Eventually the Polish crown abandoned its formal role of the legal slave-owner of the Jews, and the German migration slowed, so knock-off quasi-Magdeburg arrangements were put in place, with a similar self-rule by the town’s business community but substituting Judaism for Catholicism.

  116. The Tsarist officials kept complaining that Jews exploit Ukrainian peasants by selling them alcohol so the peasants would then get drunk and ruin themselves.

    This was actually one of the justifications for the Pale of Settlement – if we abolish it, the Jews will come and start selling vodka to Great Russian peasants too and they will get even poorer than they are now.

    Of course, the small point that the Russian government itself essentially employed Jewish tavern keepers to sell vodka to peasants (and derived much of the fiscal revenue from tariffs on alcohol and state monopoly on production of hard spirits) was never mentioned.

  117. Dmitry Pruss says

    the small point that the Russian government itself essentially employed Jewish tavern keepers to sell vodka to peasants (and derived much of the fiscal revenue from tariffs on alcohol and state monopoly on production of hard spirits) was never mentioned.

    As I understand, Jewish tavern-keeping took an outsize importance only after the Partitions when the Czarist government outlawed Jewish arenda, a traditional line of business of middlemen in land leases and agricultural sales, because they worried that it may become a slippery slope to the Jewish land control. Suddenly destitute, many Jews with rural connections turned to rural tavern-keeping, a business which absolutely required some literacy and numbers skills, rare in the countryside, for commercial viability (because the tabs needed to stay open until after the harvest time if one hoped to sell any volume of booze worth doing business)

  118. I’m sure the factual points here are true. But to quibble about adjectives, is a word like destitute properly applied to those who can let others hold their tabs open till harvest time, and can send their kids (200 years ago) to school, rather than to those who can’t afford their summer purchases till after harvest, and can’t afford to let their children learn to read?

  119. Not sure what you’re trying to prove here. These people were better off than those people but worse off than other people. So?

  120. Ryan, I am not an expert on 18/19c Ukrainian agriculture, but the little that I understand is that there are two relatively short (few weeks) periods of sowing and harvest that required enormous physical effort and all available hands. Outside these two periods there was plenty of time for children to get elementary education if there was a will to do so. And obviously, schooling needn’t be a full-time occupation and children could have kept up doing their chores. There simply wasn’t enough incentive for education because of the serfdom.

    Farmers/peasants in all societies need some system to smooth the consumption (and everyone else as well, but to a lesser degree). Including modern US. I know the banks are not the most popular of institutions, but honestly, do you think keeping farmers’ accounts in modern US is a form of exploitation? Alternative is keeping a stockpile of money under a mattress, which is inefficient now and wasn’t available in the olden days Ukraine because peasants didn’t own mattresses.

  121. People react more strongly than I did, with essentially a raised eyebrow, to inaccuracies that seem to unfairly condition sympathies, especially when they have a deep emotional attachment to the group being ignored.

    It was suggested above that Tim Snyder telling the truth about the deaths of millions of Ukrainians might be unfortunate, because it might get pulled into ugly propaganda. I don’t know what my old Slavic soccer team would make of reading this thread, where mentioning the Holodomor can be called problematic; and pointing out that shtetl tavernkeepers were anything but destitute, and in fact had significant resources, unlike the peasants they sold to on credit, is a point not even worth making.

    I doubt it would improve their understanding and broaden their sympathies about the disasters of the 20th century in the way people seem to hope.

  122. Were debt peons who were also legally bound to the land they were on exploited by their various creditors, prime among whom was probably the noble landowner? I suspect it does depend to some degree on the tavernkeeper, the peasant, the estate owner, and other contingencies we can’t measure. But I suspect that it didn’t consistently work out as well for the serf as modern American bank-farmer relationships do. But that wasn’t even my point. To use your analogy, however exploitive modern banks may or may not be, would you call the banker destitute?

    And yet. Americans are in fact annually primed to feel sorry for Jimmy Stewart, the banker and prime vessel for our sympathy with the victims of the Depression. Gah! People.

  123. Perhaps you’re not aware of the “Jews complain too much, they’re well off, maybe too well off if you know what I mean” (with potential segue into running the world through banking) trope. Obviously that’s not what you’re saying, but you seem bizarrely unaware of the problematic nature of your stubborn attitude of “Jews had it pretty good.” It’s like people not seeing what might be wrong with focusing on the biology of race.

  124. Dmitry Pruss says

    I am not sure that further details about schooling, poverty, and suffering of various groups in the early XIX c. so-called “golden age of the shtetl” are of further interest here. It’s a kind of niche knowledge, anyway. But if there is a burning desire to discuss the actual historical details, rather than general (and generally misguided) attitudes, then please let me know.

  125. If you’d like to give more, I’ll read but refrain from further comments.

  126. Ryan, I don’t think Dmitry Pruss meant that tavern-keepers were destitute. It was an occupation some Jews assumed when other opportunities closed. Some Jews were “destitute” until they switched to doing something else like inn-keeping.

    I don’t think anyone thinks that the death of millions of Ukrainians is not worth remembering, but as anything under the sun, it might simply be irrelevant to some other awful things happening in the same place.

    Everyone who spent any time with the history of Russian Empire knows how completely screwed the system of serfdom was. Whatever minuscule advantage tavern keepers, itinerant merchants, and all sorts of tradesmen might have had while interacting with serfs is utterly irrelevant compared to the fact that people couldn’t improve their lot by simply changing what they were doing or just moving to a different place. Which many demonstrated quite clearly by simply running at the first opportunity they got. It is a bog standard story of nationalism as a tool of the governing class to deflect the blame for maintaining and benefitting from an inefficient (and in the case of serfdom, deeply inhumane) system toward the ethnic minorities, foreigners etc. That makes me sound like some sort of Lenin, but the basic story is true, that’s what it is. The fact that nationalism is a more complicated thing than just a concoction of the governing classes doesn’t change the fact that it is used that way. I am not sure what a modern American example can be. How about blaming Chinese and Mexicans for stealing American jobs? This is about the same level of dishonesty and blame-shifting as blaming Jewish innkeepers and tradesmen for keeping Ukrainian peasants down.

    This sort of finger-pointing is really dangerous in Eastern Europe where everyone learned to blame everyone else for just about anything, but it is simply wrong and there is no need to bring it all here, where it is not dangerous, but just wrong.

  127. Well said.

  128. Jimmy Stewart, the banker

    Distinguo. Henry Potter, his antagonist, is a genuwine American bankster and robber baron of the old school; George Bailey is the founder of a savings-and-loan society chartered to take individual deposits and write house mortgages. It is mutualistic but profit-making, the stockholders being the depositors and mortgagees, and Bailey technically serves at their pleasure. It’s not clear whether Potter owns his bank outright (probably) or has minority investors. In any case, their roles are as different as chalk and cheese.

  129. David Marjanović says

    Genetic history of Croatia: almost pure Anatolian Farmers in the Neolithic, possibly completely unadmixed in some cases; Copper Age individuals indistinguishable from the Neolithic ones and others that look like Corded Ware lived just 60 km apart – notably the latter have the usual amount of Western Hunter-Gatherer ancestry, unlike the former; the one individual from Roman times is indistinguishable from the locals today. There’s also a lot of archeology and anthropology in the open-access paper.

  130. David Marjanović says

    Genetics of Greek-speaking Calabrians: basically, they’re the most isolated of the locals, with no sign of any immigration since the Copper Age. (Least amount of Yamnaya ancestry outside Sardinia.) Also in open access.

  131. Fascinating!

  132. Evidence for pre-Norse occupation of the Faroe Islands, supporting indirect evidence of an earlier population coming (slightly earlier) from the British Isles.

  133. Trond Engen says

    A lot of reading for the holidays. And then it’s the backlog.

    The potential of sedimentary DNA (or broader yet, sedimentary bio-chemistry) is huge and every time I blink the field has expanded in yet another direction. Archaeologists and biologists should be taking broad soil samples from wherever landscapes are destroyed by development and sealing them off for later analysis.

  134. David Marjanović says

    That’s an amazing paper. It’s also in open access.

  135. Trond Engen says

    I’ve finally read the papers:

    David M. : Genetic history of Croatia: almost pure Anatolian Farmers in the Neolithic, possibly completely unadmixed in some cases; Copper Age individuals indistinguishable from the Neolithic ones and others that look like Corded Ware lived just 60 km apart – notably the latter have the usual amount of Western Hunter-Gatherer ancestry, unlike the former; the one individual from Roman times is indistinguishable from the locals today.

    Yes. Now I’m yearning for a wide study of the genetics of the Balkan Neolithic. Anatolian Farmers came in several waves and along different paths.

    David M.: Genetics of Greek-speaking Calabrians: basically, they’re the most isolated of the locals, with no sign of any immigration since the Copper Age. (Least amount of Yamnaya ancestry outside Sardinia.)

    Yes. They could be modeled as a mix of Sardinians and Aegeans, if I understand it correctly. The authors do consider a continuous contact with the Aegean/Eastern Mediterranean from way before the spread of the Greek language and until the Byzantine Era.

    Y: Evidence for pre-Norse occupation of the Faroe Islands, supporting indirect evidence of an earlier population coming (slightly earlier) from the British Isles.

    Great paper, timing the advent of husbandry to within a few years from 500 CE by detecting chemical signal of mammalian feces and ovine DNA in the sedimentary layers. Before reading I actually imagined they would have been able to use sedimentary DNA even more, e.g. for suggesting the origin of the livestock, but that’s just me expecting too much too fast. But we’ll get there.

    The site of Eyði is at the far end of the Faroes as seen from both Britain and Scandinavia, and the valley where the sediment samples were taken is situated inland from the main settlement on the coastal plain. It’s hardly among the earliest places to have been exploited by agriculture or the first place for free-roaming sheep to have transformed, unless the first settlers of the islands were primarily exploiting the marine resources and agriculture was a way of broadening the economic base. But there’s no traditional archaeological evidence of that. How far are we from detecting bio-chemical evidence of human slaughtering of marine mammals in marine sediments?

  136. David Marjanović says

    Open-access paper from October 2021: “The origins and spread of domestic horses from the Western Eurasian steppes”

    Abstract:

    Domestication of horses fundamentally transformed long-range mobility and warfare [1]. However, modern domesticated breeds do not descend from the earliest domestic horse lineage associated with archaeological evidence of bridling, milking and corralling [2,3,4] at Botai, Central Asia around 3500 BC [3]. Other longstanding candidate regions for horse domestication, such as Iberia [5] and Anatolia [6], have also recently been challenged. Thus, the genetic, geographic and temporal origins of modern domestic horses have remained unknown. Here we pinpoint the Western Eurasian steppes, especially the lower Volga-Don region, as the homeland of modern domestic horses. Furthermore, we map the population changes accompanying domestication from 273 ancient horse genomes. This reveals that modern domestic horses ultimately replaced almost all other local populations as they expanded rapidly across Eurasia from about 2000 BC, synchronously with equestrian material culture, including Sintashta spoke-wheeled chariots. We find that equestrianism involved strong selection for critical locomotor and behavioural adaptations at the GSDMC and ZFPM1 genes. Our results reject the commonly held association [7] between horseback riding and the massive expansion of Yamnaya steppe pastoralists into Europe around 3000 BC [8,9] driving the spread of Indo-European languages [10]. This contrasts with the scenario in Asia where Indo-Iranian languages, chariots and horses spread together, following the early second millennium BC Sintashta culture [11,12].

    Also:

    Analyses of ancient human genomes have revealed a massive expansion from the Western Eurasia steppes into Central and Eastern Europe during the third millennium BC, associated with the Yamnaya culture [8,9,11,12,21]. This expansion contributed at least two thirds of steppe-related ancestry to populations of the Corded Ware complex (CWC) around 2900 to 2300 BC [8]. The role of horses in this expansion remained unclear, as oxen could have pulled Yamnaya heavy, solid-wheeled wagons [7,22]. The genetic profile of horses from CWC contexts, however, almost completely lacked the ancestry maximized in DOM2 and Yamnaya horses (TURG and Repin) (Figs. 1e, f, 2a, b) and showed no direct connection with the WE group, including both C-PONT and TURG, in OrientAGraph modelling (Fig. 3b, Extended Data Fig. 5).

    […]

    By around 2200–2000 BC, the typical DOM2 ancestry profile appeared outside the Western Eurasia steppes in Bohemia (Holubice), the lower Danube (Gordinesti II) and central Anatolia (Acemhöyük), spreading across Eurasia shortly afterwards, eventually replacing all pre-existing lineages (Fig 2c, Extended Data Fig. 3c). Eurasia became characterized by high genetic connectivity, supporting massive horse dispersal by the late third millennium and early second millennium BC. This process involved stallions and mares, indicated by autosomal and X-chromosomal variation (Extended Data Fig. 3d), and was sustained by explosive demographics apparent in both mitochondrial and Y-chromosomal variation (Extended Data Fig. 3e, f). Altogether, our genomic data uncover a high turnover of the horse population in which past breeders produced large stocks of DOM2 horses to supply increasing demands for horse-based mobility from around 2200 BC.

    Of note, the DOM2 genetic profile was ubiquitous among horses buried in Sintashta kurgans together with the earliest spoke-wheeled chariots around 2000–1800 BC [7,9,23,24] (Extended Data Fig. 6). A typical DOM2 profile was also found in Central Anatolia (AC9016_Tur_m1900), concurrent with two-wheeled vehicle iconography from about 1900 BC [25,26]. However, the rise of such profiles in Holubice, Gordinesti II and Acemhöyük before the earliest evidence for chariots supports horseback riding fuelling the initial dispersal of DOM2 horses outside their core region, in line with Mesopotamian iconography during the late third and early second millennia BC [27]. Therefore, a combination of chariots and equestrianism is likely to have spread the DOM2 diaspora in a range of social contexts from urban states to dispersed decentralized societies [28].

    Also of note:

    Finally, our analyses have solved the mysterious origins of the tarpan horse, which became extinct in the early 20th century. The tarpan horse came about following admixture between horses native to Europe (modelled as having 28.8–34.2% and 32.2–33.2% CWC ancestry in OrientAGraph [19] and qpAdm [17], respectively) and horses closely related to DOM2. This is consistent with LOCATOR [20] predicting ancestors in western Ukraine (Fig 3c) and refutes previous hypotheses depicting tarpans as the wild ancestor or a feral version of DOM2, or a hybrid with Przewalski’s horses [34].

    Early 20th century???

    Anyway, there’s ample Discussion and Supplementary Tables (pp. 38 & 39 of this PDF) about more language-related things, including language spread and language itself.

    No justification is given for the use of neighbor-joining, a method that (if used for phylogenetics) falls hard for almost every known source of error; the sheer size of the dataset would make all other methods hard to use, though.

  137. Trond Engen says

    Thanks. I’ve seen it and tried to get my head around the implications. Wouldn’t it be fun if Anatolian came to Anatolia around 2200 BCE, and the older intrusions into Europe were Non- or Para-IE?

  138. Early 20th century???

    https://truenaturefoundation.org/wp-content/uploads/2019/04/last-tarpan.jpg

    The last captive tarpan died in Moscow Zoo in 1905.

  139. Lars Mathiesen says

    So maybe the CWC spoke a sister language to Sumerian (low odds, though). That whole thing about using IE vocab for horse things in Sumeria, does that fit with the timeframe? (If it’s in one of those appendices, I’ll just wait for Trune to report).

  140. David Marjanović says

    Ah yes, I must have managed to confuse tarpans and aurochsen last night.

    Wouldn’t it be fun if Anatolian came to Anatolia around 2200 BCE

    We’ve discussed that Nature paper that showed lots of Anatolian names in southeastern Anatolia/northern Syria down to 2500 BCE.

    If it’s in one of those appendices

    Those, too, are openly accessible, and I linked to the PDF with the language-related stuff in it. There’s no mention of Sumerian, though, and no text, just tables of the distribution of various horse-related words across IE branches.

    The only one given for all of Anatolian, however, is Hieroglyphic Luwian a₂-su “horse”. Likewise for Tocharian. This fits their argument that the domestication of horses happened after Anatolian and Tocharian had moved off.

  141. I for one will be amused if it turns out that David Anthony’s synthesis was right in broad outlines and many details and went wrong only where and to the extent he relied on equine archaeology, which is his specialty. I did some looking around last night, before which I hadn’t realized the degree of skepticism for his theory of early domestication/riding, which this study reinforces.

    Maybe that’s natural – you’re only willing to advance risky ideas in the heart of your own investigation, relying more on consensus or at least a growing body of scholarship to fill in your understanding of the areas around your focus.

    >Sumerian

    I thought the relationships between Corded Ware, Sintashta and the IE arrival in Southwest and South Asia were fairly well understood, making it likely that Corded Ware spoke an IE language. Anything is possible, but eliminating Corded Ware from the IE tree would open a big gap between Indo-Iranian and the western families in need of explanation and connection, without any clear alternatives on a scale that could explain later IE dominance, wouldn’t it? Is there a body of scholarship pursuing the idea that Corded Ware was related to Sumerian, or non-IE?

    The study David linked to didn’t seem to challenge CW as IE. Though I thought the throw-away line they wrote — more or less “maybe the Steppe just took advantage of an independent Neolithic population decline…” — was idle speculation that should have been left out.

    Speak to what you’ve proven. I would have left it at “With little evidence of a strong equine presence in Corded Ware and clear evidence that an equine population explosion took place very late in the 3rd Millennium, our study casts doubt on the idea that the steppe expansion into northeast and north-central Europe in the early 3rd M. arrived on horseback.”

    On the other hand, I’m inclined to wonder whether CW did in fact utilize horses at a somewhat lower scale, building the knowledge and networks that would make this new, more manageable breed of horse so desirable and successful once the adaptations took place, were recognized and bred for. It sounds like it was a genetic package, and it would take some time to center all such adaptations into an otherwise genetically healthy line or set of lineages.

  142. ə de vivre says

    That whole thing about using IE vocab for horse things in Sumeria, does that fit with the timeframe? (If it’s in one of those appendices, I’ll just wait for Trune to report).

    I think it’s fairly uncontroversial that horses began to enter Mesopotamia from the north at the end of the third millennium via Hurrians who had come into direct contact with Indo-European speakers. The amount of IE-derived vocabulary that made it into Sumerian (usually via at least Hurrian and/or Akkadian) isn’t clear, and fits with the importation of vocabulary related to a novel technology. Rubio’s On the Alleged “Pre-Sumerian Substratum” remains probably the best summary of loan-words in Sumerian.

  143. David Marjanović says

    Is there a body of scholarship pursuing the idea that Corded Ware was related to Sumerian, or non-IE?

    No.

    The study David linked to didn’t seem to challenge CW as IE.

    No, it only means to drive another nail in the coffin of the idea that the IE-speakers who founded CW did so specifically by riding in and massacring people.

    Hurrians who had come into direct contact with Indo-European speakers

    Specifically speakers of pre-Indic. Do a tiny bit of internal reconstruction on Vedic, and you get that language (as far as can be told from a few cuneiform characters).

  144. >by riding in and massacring people

    Right. Now we know they raced to and from the massacres in their ox carts.

  145. David Marjanović says

    Rubio’s On the Alleged “Pre-Sumerian Substratum”

    Masterful use of “No comment is necessary here.” at the end of footnote 1.

    However, I find the treatment of Whittaker’s “Euphratic” too short and of course outdated, because Whittaker didn’t stop in 1998: see the discussion starting here, with a link to the 2012 book chapter which almost completely lacks mentions of internal reconstruction of Sumerian and mentions features of the cuneiform script that Rubio did not bring up at all. Much more on the script is here (from 2001, though).

    Of this paper (2004/5) I’ve only read the first footnote:

    In each of my previous articles I have sought to explain that the terms ‘Euphratean’ (for the culture and society) and ‘Euphratic’ (for the language) were chosen for convenience – that is, for lack of a better term – and do not reflect a theoretical unity or relationship with Landsberger’s 1944 ‘Proto-Euphratic’ substrate, nor an affiliation to Oppenheim’s 1964 ‘Euphrates Valley’ civilization. I referred in my first article (Whittaker 1998: 114) to “this Indo-European language, which I shall henceforth call Euphratic in deference to (but also in distinction from) Landsberger.” In my next paper (Whittaker 2001: 41 n. 9), I wrote: “Neither this term [Euphratic] nor its referent should be confused with the ‘Proto-Euphratic’ substrate proposed by Landsberger.” Despite such disclaimers, and despite personal exchanges with him on this matter in 2002, Rubio (most recently, 2005a: 323-324) persists in misrepresenting my position, alleging that “Whittaker attempts to identify the pre-Sumerian substratum (Landsberger’s “proto-Euphratic”) with an as yet unknown Indo-European language.” This strategy, which smacks of gross academic dishonesty, is embraced in order to criticize, as he has done before (Rubio 1999: 6), the “lack of coherence” of the substrate theory, although there clearly is no such unitary theory shared by Landsberger and others – nor any intrinsic reason why there should be one.

    and the other two footnotes, which scold Rubio some more.

    Also interesting: this undated work (youngest cited reference from 2008) on the “fire in water” complex in Sumerian and IE mythology.

    Nowhere mentioned in any of this is that the IE “wine” word has an IE-internal etymology, which bolsters Whittaker’s conclusions.

  146. >I’ve only read the first footnote.

    He gave his proposed language a name that had already been used for a different but similar proposal, and he’s upset people think they’re related? I don’t need to read anything else.

  147. I don’t have a very high opinion of Whittaker’s internal reconstruction of Pre-Sumerian. He has a knack for combining main-dialect and Emesal word forms in idiosyncratic ways that just so happen to evoke the IE form he has in mind. From Euphratic – A phonological sketch, he associates ‘kilim₂’ (reed bundle, rope) with *klh₂-m- (straw, reed) without mentioning that kilim₂ is usually read gilim₂ and that the Sumerian word for ‘reed’ is gi. He gives *dukud (sweet, good) as the Pre-Sumerian form that gave rise to EG dug and ES zeb, which would suggest a monosyllabic root, but he uses kuk(k)ud to somehow jam another syllable in there. Et ainsi de suite…

    I predict that our idea of what a good Sumerian root looks like will change significantly over the coming years. A lot of evidence is piling up for syncope in verbal prefixes creating more complex syllable structures than just CVC, and it wouldn’t surprise me if we find more complex forms in roots too.

  148. Man, I love having a Sumerianist gracing this hattery.

  149. Lars Mathiesen says

    This is my strategy: Ask a stupid question, and experts will fall out of the woodwork. Only works in the Hattery, though.

  150. David Marjanović says

    I predict that our idea of what a good Sumerian root looks like will change significantly over the coming years. A lot of evidence is piling up for syncope in verbal prefixes creating more complex syllable structures than just CVC, and it wouldn’t surprise me if we find more complex forms in roots too.

    That’s promising!

    I don’t need to read anything else.

    A lack of creativity is actually a very good sign in this case…

  151. David Marjanović says

    Evolutionary Trajectories of Complex Traits in European Populations of Modern Humans

    Humans have a great diversity in phenotypes, influenced by genetic, environmental, nutritional, cultural, and social factors. Understanding the historical trends of physiological traits can shed light on human physiology, as well as elucidate the factors that influence human diseases. Here we built genome-wide polygenic scores for heritable traits, including height, body mass index, lipoprotein concentrations, cardiovascular disease, and intelligence, using summary statistics of genome-wide association studies in Europeans. Subsequently, we applied these scores to the genomes of ancient European populations. Our results revealed that after the Neolithic, European populations experienced an increase in height and intelligence scores, decreased their skin pigmentation, while the risk for coronary artery disease increased through a genetic trajectory favoring low HDL concentrations. These results are a reflection of the continuous evolutionary processes in humans and highlight the impact that the Neolithic revolution had on our lifestyle and health.

    That’s the abstract. I haven’t read the paper yet.
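
    For readers unfamiliar with the term, a polygenic score is, at its simplest, a weighted sum: for each variant, the individual's count of the effect allele times the effect size reported by a GWAS. A minimal sketch of that arithmetic follows; the variant IDs, effect sizes and genotypes are made up for illustration and are not taken from the paper, and real pipelines add steps (variant filtering, handling of missing data in ancient genomes) that are not shown here.

```python
def polygenic_score(dosages, effect_sizes):
    """Sum of (effect-allele dosage * GWAS effect size) over shared variants.

    dosages:      variant ID -> copies of the effect allele (0, 1 or 2)
    effect_sizes: variant ID -> per-allele effect size from summary statistics
    Variants missing from either input are simply skipped.
    """
    shared = dosages.keys() & effect_sizes.keys()
    return sum(dosages[v] * effect_sizes[v] for v in shared)


# Hypothetical summary statistics and one hypothetical genotype.
effect_sizes = {"rs_a": 0.5, "rs_b": -0.25, "rs_c": 0.125}
individual = {"rs_a": 2, "rs_b": 1, "rs_d": 0}

score = polygenic_score(individual, effect_sizes)
print(score)
```

    Note that the score only inherits whatever meaning the underlying association study has; the sum itself says nothing about why a variant correlates with a trait.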

  152. Whaaa…?

    Since when are “intelligence scores” a heritable trait, so uncontroversial that it can be mentioned in unconcerned conjunction with other uncontroversial ones, such as height?

  153. Yeah, that whole abstract reeks of, well, something that makes me exceedingly dubious.

  154. Blech. They use HDL/LDL cholesterol as a proxy for intelligence, since some studies show that in individuals, some cognitive scores may, kind of, be correlated with their cholesterol scores for whatever unknown reasons (people who start taking statins also get better at crossword puzzles or whatever.) Generalizing from that to claiming that people 30,000 years ago were less intelligent because of some statistical trend of genetic markers is a travesty. It shows, at the minimum, that this paper was not subject to competent editorial evaluation.

  155. David Eddyshaw says

    These would have been intelligence scores recorded in flints. You can tell a lot from flint flakes. The paper is certainly flaky.

    The discussion does in fact hint (and not that obliquely, either) that Europeans are just, well, cleverer, because Europeans and cholesterol. (Presumably the observation that agriculture has not, in fact, ever been confined to Europe, would have spoilt the “logic” a bit.)

  156. David Marjanović says

    Me on February 21st…

    Hieroglyphic Luwian a₂-su “horse”

    That actually struck me as suspicious at the time. Apparently I was right: “horse” in HL is a₂-zu₂-, and while there is a word a₂-su, it means “stone”, “stone monument” and/or suchlike.

    (BTW, I think that paper’s main argument is half right, and a₂ was [ha].)

  157. David Eddyshaw says

    The Fat, yet Clever European thing reminds me of something else I was reading lately:

    https://ia903400.us.archive.org/18/items/ebbesen-psy-ar-xiv-2020/Ebbesen_PsyArXiv_2020.pdf

  158. It never ceases to amaze me how many Scientifickal Studies are still dedicated to the proposition (however guilefully disguised) that Aryan Iz Best.

  159. David Marjanović says

    The real culprit for the “let’s just call everything ‘intelligence'” thing is this cited paper from 2018, to which I don’t currently have access.

    But this part of the new paper

    Interestingly, we also observe an increase in the genetic factors that lead to the development of coronary artery disease, which is related to a constant decrease in HDL cholesterol in European populations after the Neolithic revolution (Ali et al., 2012). If this adaptation causes disease, we could wonder why it might be evolutionarily advantageous to have lower HDL cholesterol concentrations. A reason for this could be related to cognitive functions since cholesterol is fundamental for the development and functioning of the brain.

    is just a blatant failure. Cholesterol is an important component of all cell membranes, and any brain contains a lot of cell membranes, so “cholesterol is fundamental for the development and functioning of the brain” in the same sense that, say, water is! It gives me the impression that the ten authors, two reviewers and one editor don’t know any biology above the molecular level…

    Some genetic polymorphisms in cholesterol-related pathways are connected to cognitive functions, while variations in the levels of HDL and LDL have been linked with alterations in intelligence, learning, and memory, although the full implications and the mechanisms are still far from being understood

    …so we’ll waffle about it for the fun of it.

    Since, some genetic polymorphisms in cholesterol-related pathways are connected to cognitive functions, in the last set of analyses, the evolution of genetic factors related to cognitive functions were estimated. Result obtained reveals that increased cognitive functions are an evolutionary advantage to adapt to the environment.

    No explanation, no citation, nothing.

    As for the changes in PRS for the cognitive traits we included in this study it is important to put these in perspective. The measure of educational attainment in years is largely a trait influenced by socioeconomic factors so the changes we observe only affect approximately 20% of the actual variation we observe in this trait as reported in the original GWAS by Okbay et al. (2016). For the fluid intelligence test performed in the UK-Biobank a similar heritability is estimated although the test itself only consists of 13 questions severely limiting the reliability of this specific test. The overall Intelligence reported by Savage et al. (2018) refers to a meta-analysis of various different tests that aim to capture overall intelligence and though the heritability is reported to be approximately 60% for intelligence this trait might be less accurate due to the heterogeneity of the tests included in the meta-analysis. Similarly, the measure of unipolar depression is also based on a meta-analysis by Nagel et al. (2018) that compromised various measures from the UK-Biobank the Genetics of Personality Consortium collected through the use of various questionnaires which means the reliability of this trait as measure of depression is hard to ascertain.

    In short, although we see an increase in PRS for cognitive functions over time this does not necessarily translate to an evolutionary pressure towards an increasing intelligence. What this means is that there is an increase in allelic frequencies for alleles that positively impact multiple different measures of cognition but only to a limited extent in relation with the heritability of these traits.”

    So the previous quote is completely useless, then.

    The closing quotation mark is in the original. I find that very interesting.

    From the technical point, data imputation was applied for handling missing data, which could limit the power of the study.

    …uuuuh… yes, it does.

    I should have mentioned the paper is in open access and has the names and affiliations of the editor and the two reviewers in the left sidebar.

  160. David Marjanović says

    It never ceases to amaze me how many Scientifickal Studies are still dedicated to the proposition (however guilefully disguised) that Aryan Iz Best.

    I don’t think that’s the case here. I think this is a lazy, unthinking paper: it takes a list of alleles that are “known to” correlate with anything, looks for them in sequenced ancient genomes of different ages, plots the frequencies over time and turns them into a Least Publishable Unit.

    Judging from the authors’ names, including at least one each of the two first and the two last and corresponding authors, few of them could have much interest in declaring any sort of “Aryans” The Best…
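
    For readers who want to see what that kind of pipeline amounts to, here is a minimal sketch of a polygenic score averaged per time period. It is not the paper's actual method: the variant names, effect sizes, periods and genotypes are all made-up toy values, and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def polygenic_score(dosages, effect_sizes):
    """Sum of (effect size x effect-allele count) over variants genotyped in this sample."""
    return sum(beta * dosages[v] for v, beta in effect_sizes.items() if v in dosages)


def mean_score_by_period(samples, effect_sizes):
    """Group samples by archaeological period and average their scores."""
    by_period = defaultdict(list)
    for period, dosages in samples:
        by_period[period].append(polygenic_score(dosages, effect_sizes))
    return {period: mean(scores) for period, scores in by_period.items()}


# Toy inputs (hypothetical): per-variant effect sizes from some GWAS,
# and effect-allele counts (0, 1 or 2) per ancient genome.
effect_sizes = {"rs_a": 0.5, "rs_b": -0.25}
samples = [
    ("Neolithic", {"rs_a": 0, "rs_b": 2}),
    ("Neolithic", {"rs_a": 1, "rs_b": 1}),
    ("Bronze Age", {"rs_a": 2, "rs_b": 0}),
]

trend = mean_score_by_period(samples, effect_sizes)
```

    Note that a variant missing from a genome simply drops out of that genome's sum here, which is one reason missing data and imputation matter so much with patchy ancient DNA.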

  161. has the names and affiliations of the editor and the two reviewers in the left sidebar

    “Or, you can just find them across the street at their usual seats at the bar. That’s where we got them in the first place.”

  162. David Marjanović says

    That’s how megajournals often work, except cyber.

  163. Trond Engen says

    Going back to the plague theme, there’s a new paper out locating the origin of the Black Death to modern day Kyrgyzstan:

    Spyrou, M.A., Musralina, L., Gnecchi Ruscone, G.A. et al. The source of the Black Death in fourteenth-century central Eurasia. Nature (2022).

    Abstract
    The origin of the medieval Black Death pandemic (AD 1346–1353) has been a topic of continuous investigation because of the pandemic’s extensive demographic impact and long-lasting consequences. Until now, the most debated archaeological evidence potentially associated with the pandemic’s initiation derives from cemeteries located near Lake Issyk-Kul of modern-day Kyrgyzstan. These sites are thought to have housed victims of a fourteenth-century epidemic as tombstone inscriptions directly dated to 1338–1339 state ‘pestilence’ as the cause of death for the buried individuals. Here we report ancient DNA data from seven individuals exhumed from two of these cemeteries, Kara-Djigach and Burana. Our synthesis of archaeological, historical and ancient genomic data shows a clear involvement of the plague bacterium Yersinia pestis in this epidemic event. Two reconstructed ancient Y. pestis genomes represent a single strain and are identified as the most recent common ancestor of a major diversification commonly associated with the pandemic’s emergence, here dated to the first half of the fourteenth century. Comparisons with present-day diversity from Y. pestis reservoirs in the extended Tian Shan region support a local emergence of the recovered ancient strain. Through multiple lines of evidence, our data support an early fourteenth-century source of the second plague pandemic in central Eurasia.

  164. Two reconstructed ancient Y. pestis genomes represent …

    Yikes! Does this mean digging up ancient cemeteries is hazardous? Could they release more Yersinia pestis? To hitch a ride on passing fleas, to hitch a ride on passing rats.

    one of the largest infectious disease catastrophes in human history

    I think that should be in “recorded human history”(?) IIRC there’s been various ‘bottlenecks’ detected in human DNA that suggests there were several times when nearly the whole species was wiped out.

    Would the Covid-19 have had a similar effect if it weren’t for producing a vaccine so quickly? That is, if certain bloody-minded populations had had their way with resisting lockdowns/masks/etc?

  165. Lars Mathiesen says

    It took a while before the difference between “detectable traces of SaRS-II RNA” and “infectious amounts of complete virus particles” started to be reported, and in the scarier parts of the reporting I’m not sure they ever cared.

    (Or is that SArs? There is something funky going on with the capitalization in some institutions’ usage).

  166. David Marjanović says

    Does this mean digging up ancient cemeteries is hazardous?

    No, unless you have a mad scientist at hand who takes the reconstructed genomes from in silico [argh] to in vitro and then in vivo.

    Would the Covid-19 have had a similar effect if it weren’t for producing a vaccine so quickly?

    Unlikely, because it only ever killed a few % of infected people, as opposed to fully half.

    Or is that SArs?

    I’ve only ever seen SARS-CoV-II.

  167. Lars Mathiesen says

    Must have been the lowercase o from Co[rona]V[irus] echoing in the back of my brain, then.

    (SARS-CoV-II seems a bit redundant — Severe Acute Respiratory Syndrome, the one that’s a coronavirus. also the one that’s number 2. Or is it because MERS was also severe and acute and a coronavirus, so SARS-CoV-II is the second one that doesn’t happen to be called MERS? NGL, I thought the SA part was for Southeast Asian or something like that, as opposed to Middle Eastern)

  168. Good comic, but I find myself wondering when it first appeared, which led to the discovery that xkcd comics don’t appear to be dated, which is annoying. Or am I missing something?

  169. If you go to the archive, there’s a list of the comics by title; there it says to hover (I assume with the cursor) over the title to see the date. As I’m on my phone, I can’t test that.
    (It confirms my general view of mankind that an intelligent fellow like Randall Munroe nevertheless comes up with such a clunky solution for his web site.)

  170. Thanks, that works and it turns out the comic is 2020-3-2. But Jesus, what a crappy way of doing things.

  171. I have noted before that xkcd, unlike most comics, has a full transcript of each comic on the site. That means you can search for a given comic by name, text, date, or whatever; it’s all locatable in the metadata. That makes it really easy to find particular xkcd comics; most comic sites are much worse. (And that’s not even counting the wealth of additional information at explainxkcd.com.)

  172. January First-of-May says

    (And that’s not even counting the wealth of additional information at explainxkcd.com.)

    Indeed that’s the place I typically go to find out xkcd comic dates (most recently here).

    I don’t think I’ve ever seen SARS-CoV-II [sic] – in my experience it had always been SARS-CoV-2 (and AFAIK it got the name because it’s a very close relative of the original SARS-CoV; of course MERS could have gotten that too, but historically it didn’t, so the option was free).
    For a while I continued the mixed capitalization by spelling the disease name “CoViD-19”, but I can’t recall seeing anyone else doing that, and eventually I just switched to the more mainstream “Covid-19” and/or “the Covid”.

    (Unrelated fun fact: the official announcement of the disease was on December 31, 2019. I like to joke that it was a miracle to song writers that “Covid-19” rhymed with “quarantine”, and that if the announcement was a few hours later we would all have been talking about Covid-20 and that would have been far harder to rhyme.)

  173. @January First-of-May: I’m not sure I agree that lyricists would have a hard time finding rhymes for “Covid-20.” In fact, I think there are plenty.

  174. David Eddyshaw says

    Well, there is plenty.

  175. Sunny. Funny.

  176. Yes, but it is hard to rhyme “covid-20” with “quarantine”.

  177. David Marjanović says

    SARS-CoV-2

    Oh! Yes.

    it got the name because it’s a very close relative of the original SARS-CoV

    Also correct.

  178. A trio of papers by Iosif Lazaridis et al., in a new issue of Science:

    Ancient genomes and West Eurasian history (a non-technical critique, Open Access),
    https://www.science.org/doi/10.1126/science.add9059#core-R1

    The genetic history of the Southern Arc: A bridge between West Asia and Europe
    https://www.science.org/doi/10.1126/science.abm4247

    A genetic probe into the ancient and medieval history of Southern Europe and West Asia
    https://www.science.org/doi/10.1126/science.abq0755

    Ancient DNA from Mesopotamia suggests distinct Pre-Pottery and Pottery Neolithic migrations into Anatolia
    https://www.science.org/doi/10.1126/science.abq0762

    A news item (in Dutch), with comments by IEist Alvin Kloekhorst:
    https://www.nrc.nl/nieuws/2022/08/25/de-taalsleutel-ligt-op-de-kaukasus-daar-ligt-de-oorsprong-van-veel-wereldtalen-a4139898

    Kloekhorst laments the lack of linguists among the team, and proposes a different scenario to fit with the genetics, which doesn’t require an unreasonably old date (8000 ybp) for the Anatolian split.

  179. Kloekhorst laments the lack of linguists among the team

    As do I. When will they learn?

  180. >The naming of the Southern Arc conjures a map projection that centers on the western tip of Eurasia rather than the Anatolian peninsula

    There are many things to be said about the focus, assumptions and predilections of those studying ancient genomes. But this isn’t one of them.

    Generally, the Science article first linked really wants to press a prehistoric moral relevance without giving much thought to where it leads. All in all, I’m quite glad Lazaridis et al. do not focus more attention on the fact that the incoming men established their genetic preponderance in large part by raping the local women (which, after criticizing L and co. for sanitizing things, they describe with the euphemism of ‘sexual violence’).

  181. 8000

    I am a bit confused, because Lazaridis mentioned 5000 BC and 7000-5000 bp.

  182. Trond Engen says

    Thanks, Y. I’ll read, but it’ll have to wait.

  183. David Marjanović says

    The genetic history of the Southern Arc: A bridge between West Asia and Europe

    That, too, is in open access, and I’ve started reading it.

    From the “Structured Abstract”: “The Yamnaya expansion also crossed the Caucasus, and by about 4000 years ago, Armenia had become an enclave of low but pervasive steppe ancestry in West Asia, where the patrilineal descendants of Yamnaya men, virtually extinct on the steppe, persisted.”

  184. David Marjanović says

    That, too, is in open access

    Nope! The reason the page is so long is that there are several types of abstract, including a figure, followed by all the 483 footnotes (references to the “paper” and the “supplementary information”).

    The supplementary information, i.e. the actual paper, is in open access, though.

    Likewise for the two related papers in the same issue, except they have only one abstract and fewer footnotes.

  185. Poulsen and Olander, Indo-Iranian and Balto-Slavic. A presentation (pdf) in which they make the case (not overwhelming, but reasonable) for subgrouping the above two together.

    (This is mostly linguistics. I wasn’t sure which would be the infinite IE linguistics thread.)

  186. David Marjanović says

    Interesting, and they do take genetics into account.

  187. I guess behind this is a largely Graeco-Aryan model of PIE morphology, so that the many morphological correspondences between Greek and Indo-Aryan count as retentions.
    If you don’t assume that, their neat model stops working.
    Other assumptions that are at least debated:
    1) Original split Dat. Pl. in -*m- vs. Instr. Pl. in -*bh-: this weakens a BSl – Germanic isogloss
    2) The *sye/o-future is not convincingly attested for Slavic (the only form usually adduced here, OCS participle byšǫšt-, has also been explained otherwise)
    3) They downplay the evidence from Luwian (and ignore the evidence from Albanian) that PIE had indeed three velar series (OTOH, the idea that satem is a subgroup and that kentum-satem is not a primeval split is pretty uncontroversial nowadays)
    I’m not saying they’re wrong, and this certainly is a serious proposal, just how good it is hangs on a lot of assumptions that are not common opinion.

  188. David Marjanović says

    1) Original split Dat. Pl. in -*m- vs. Instr. Pl. in -*bh-: this weakens a BSl – Germanic isogloss

    There are good reasons to think that the distribution of *-m- and *-bh- was originally phonological instead of morphological: *-bh- after *n, *-m- otherwise. This concerns not just the dat. pl. & inst. pl. suffixes, but derivational ones as well.

    Of course this still means that BSl and Germanic generalized the same allomorph in the dat./inst. pl. that the other branches lost. But I’d think that can be attributed to the well-known contact that also spread such things as the definite/indefinite distinction in adjective flexion (which is accomplished by completely different means in BSl and in Germanic) and the *-uj- > *-ij- shift (which is also shared with Celtic, and I forgot about Italic, but that’s in another paper by the same author as it happens).

    They downplay the evidence from Luwian (and ignore the evidence from Albanian) that PIE had indeed three velar series

    I think that’s actually irrelevant to their model of satemization – though I haven’t understood all the details of that model and hope they get around to publishing it soon.

    the idea that satem is a subgroup and that kentum-satem is not a primeval split is pretty uncontroversial nowadays

    The latter, yes; the former, no – that’s what they’re trying to establish.

  189. Of course this still means that BSl and Germanic generalized the same allomorph in the dat./inst. pl. that the other branches lost. But I’d think that can be attributed to the well-known contact that also spread such things as the definite/indefinite distinction in adjective flexion (which is accomplished by completely different means in BSl and in Germanic) and the *-uj- > *-ij- shift (which is also shared with Celtic, and I forgot about Italic, but that’s in another paper by the same author as it happens).
    Don’t get me wrong – I’m not claiming that Germanic and BSl form a clade. My view is rather that there were several overlapping areal groupings – the ones you mention, satemisation, spread of Graeco-Aryan morphology (which hit Slavic more fully than Baltic), etc. By downplaying the other isoglosses, the authors arrange the evidence to support their cladistic model to the detriment of such an areal-based model.
    Concerning the three-velar series, they make a bit of an effort in their presentation to knock it down, so I assume that it actually would pose a problem for them. But let’s see what they will say on this in the future.
    the former, no – that’s what they’re trying to establish
    I’ve seen it assumed that satem is a subgroup so often in various papers and discussions that I maybe underestimate how controversial that still may be. Although if they take satemisation as diagnostic for a clade and not as an areal feature, then yes, that needs to be proven.

  190. David Marjanović says

    By downplaying the other isoglosses, the authors arrange the evidence to support their cladistic model to the detriment of such an areal-based model.

    Their assumption is that there is a tree, and that it can be found by identifying and peeling back (“downplaying”) all the areal layers; so that’s what they try to do.

    After all, a precisely simultaneous split into 10 clades is simply improbable. Perhaps we’re ultimately unable to identify the tree, but we definitely won’t if we don’t try.

    Although if they take satemisation as diagnostic for a clade and not as an areal feature, then yes, that needs to be proven.

    Exactly. They’re trying to show (or at least presenting that they’re trying to show in a forthcoming paper) that it’s a clade and not (just) an area.

  191. Their assumption is that there is a tree, and that it can be found by identifying and peeling back (“downplaying”) all the areal layers; so that’s what they try to do.
    I don’t mind if they try. I just think it’s wrong to fit everything in a cladistic straitjacket. As you probably have guessed from my remark, for me, B-Sl is a clade, part of which (Slavic) later became more similar to Graeco-Aryan under Iranian influence.
    I also seem to remember that RUKI doesn’t even apply to the same degree in all Indo-Iranian, and that there’s partial RUKI outside of it – something that would be no surprise in an areal model, but needs to be explained in a clade model.

  192. David Marjanović says

    As you probably have guessed from my remark, for me, B-Sl is a clade, part of which (Slavic) later became more similar to Graeco-Aryan under Iranian influence.

    Poulsen & Olander clearly agree.

    I also seem to remember that RUKI doesn’t even apply to the same degree in all Indo-Iranian

    Nuristani just has RIK, though that’s been explained as a later partial reversal.

    something that would be no surprise in an areal model, but needs to be explained in a clade model

    Areal influence is available as an explanation in both models, the difference being just that common inheritance is not available to a purely areal model, so that the latter hypothesis must explain all shared innovations as either coincidence or areal influence.

  193. Nuristani just has RIK

    … and the surviving Baltic data has RK, though loanword evidence in Finnic shows that this, too, probably used to be RUKI; the most often-cited example is Lithuanian liesas vs. Finnish laiha < *laiša for ‘lean, thin’. (Related to the important point that *š from RUKI would not have been phonemic in any of the branches before various other changes.)

  194. David Marjanović says

    Feeding right into a pet peeve of mine: sounds can be easier to reconstruct in phonetic detail than the phonemes they belonged to, because sound changes (and borrowing) work on sounds, not on phonemes.

  195. David M. (November 6, 2021 at 6:28 pm): The genomic origins of the Bronze Age Tarim Basin mummies in open access.

    I don’t think I ever got around to reading this last year. I have now. I don’t have much more to say about the Tarim mummies except that it was unexpected and interesting. The Dzungarian EMBA genomes show beyond doubt that there was movement of Afanasievo pastoralists into Dzungaria, strengthening the Afanasievo origin hypothesis for Tocharian. Also interesting is the somewhat later Kanai MBA individual, from just over on the Kazakh side. It’s shown in this paper as a three-way mix of Dzungaria EBA1 with Tarim EMBA1 and Baikal, but it plots as pure Okunevo. That seems to add further evidence that the Okunevo peoples started moving south as well, as briefly (and speculatively) discussed here.

  196. First bioanthropological evidence for Yamnaya horsemanship, arguing for Yamnaya horseback riding c. 3000 BC based on skeletal evidence.

  197. >However, it is apparent that organized cavalry was introduced not before the very end of the second millennium BCE (4).

    Is there a distinction I’m not catching here? Horses were unused in organized armed warfare for nearly two millennia after riding began? Surely not. But what is the feature required before they’re willing to call it “organized cavalry”?

  198. Honestly, I understand nothing about it.
    Why did chariots dominate for a while and why did they become obsolete?

  199. Trond Engen says

    Organized cavalry means “strategic units of mounted soldiers trained and equipped for warfare from horseback”. I think the assumption is that it took long time to develop a breed of horses that could be taken into close battle without panicking and becoming unpredictable. Thinking of that, the chariot may also be seen as a way to constrain the movements of the horse.

  200. Trond Engen says

    The study is interesting and looks solid to this layman, but since it’s the first of its kind and (by design) limited to Yamnaya contexts, it’s not clear how widespread similar pathologies were outside of horse-using cultures.

    Edit: Or outside of that specific horse-using culture.

  201. Trond Engen says

    Horses for movement, for scouting and for managing herds. That’s a huge advantage over settled farmers — even without direct confrontations and mounted warfare. Perhaps it wasn’t really the Pontic-Caspians who overran Old Europe but their herds. But the herders may have given the farmers offers they couldn’t refuse.

    [Western movie theme playing]

    But this doesn’t really work out. There’s no value in a herd without anyone to consume the yield. The ranch economy of the American frontier couldn’t develop without industrialization, urbanization and the railway. No Yamnaya chief would need more cattle than could be consumed. Maybe we should think of the herd as a means of extortion. Farmers paid tribute to the herders (in produce and marriage) to spare their fields, and the herders gave some of it back by feeding their most loyal tributaries in conspicuous feasts.

  202. David Marjanović says

    Horses for movement, for scouting and for managing herds. That’s a huge advantage over settled farmers — even without direct confrontations and mounted warfare.

    Also, long-distance communication.

  203. Trond Engen says

    Yes, indeed.

    I keep thinking about herd size. The horse would presumably make it possible to manage larger herds with fewer people, but also to have more herders, allowing for even larger herds. But large herds need a large market to be profitable.

    Herd size to population size is an interesting measure. How many animals does it take to feed a person and a family? How many herders do you need per 100 heads of cattle (sheep, goats, etc.), and how many people can you feed? The difference can be converted to trade surplus and luxury goods (if there’s a market), maintaining an entourage or a court of soldiers and artists, buying allegiance, or increasing the number of wives and children. Could the latter be the mechanism for male line replacement? Would it be fast enough to explain the turnovers suggested by ancient DNA?

  204. My understanding is that historically, cattle was much smaller and produced by far less meat and milk than modern cattle, so I guess herds needed to be bigger to feed the herder’s family.

  205. Trond Engen says

    Yes. But the economy is interesting. And how did the horse change it?

  206. David Marjanović says

    The difference can be converted to trade surplus and luxury goods (if there’s a market), maintaining an entourage or a court of soldiers and artists, buying allegiance, or increasing the number of wives and children.

    Or maybe we should look at it the other way around: a social expectation to support the entourage and buy allegiance and support the largest family possible makes it necessary to have a large herd – heroes eat beef and pretty much only beef, even in the Iliad where they’re surrounded by fish –, and if anything is left after that, that’s where trade for luxury goods comes in. Most of the luxury goods are promptly invested in buying allegiance anyway.

  207. “develop a breed”, “constrain the movements” – @Trond, I thought about it too, but I don’t know how we can test the former idea…

    As for economy, I don’t know how to come to simple conclusions from the notion of surplus alone. As I understand you, you’re saying that herding can be more efficient (labour, time).

    Your population reaches the maximum and surplus yok (but you have lots of free time) – unless the constraint on the size of population is imposed not by the land but by something else. So either you have this constraint or surplus is associated with inequality or it may appear dynamically due to instability of the system. Either way, it is not simple.

  208. I also don’t understand how gender roles work. Usually people assume (a) men are only good for warfare and food production, (b) women are only good for childbirth, breastfeeding (or otherwise raising children) and, for farmers, also food production (for some reason Europeans rarely discuss the role of women in production among herders), (c) the society as a whole is an extreme r-strategist, it tries to give birth to as many children as possible, so all young females are busy. Is all of this right?

  209. John Cowan says

    What is this yok of which you speak?

  210. so all young females are busy. Is all of this right?

    The modelling I’ve seen (sorry, can’t remember where) suggested hunter-gatherer ‘economy’ kept people ‘working’ only about 15 hours per week. Note this was an average, with wild variations: long hours at harvest or on hunting trips; idle when you were only waiting for the crops (or prey) to grow.

    The human physiology for storing body-fat suggests humans are adapted for a feast-or-famine sort of lifestyle.

    How would “young females” be kept busy by childcare?: babies just lie in their cribs and gurgle most of the time — it’s like leaving crops to grow.

    It isn’t ’til humanity gets to industrial society/trade surplus/luxury goods/(wage-)slavery that anybody needs to worry about producing more stuff than you can eat before it goes rotten. Surplus Value is the Economists’ term. I’m agin’ it.

  211. How would “young females” be kept busy by childcare?: babies just lie in their cribs and gurgle most of the time

    As the meme goes: tell me you’ve never done any real childcare without saying you’ve never done any real childcare.

    Even in a society like ours where kids spend a lot of time in school or otherwise under the care of non-parents “parents spend 50 hours per child per week in child care [averaged] over the first twelve years of the child’s life” (Schoonbroodt, A. (2018). Parental child care during and outside of typical work hours. Review of Economics of the Household, 16, 453-476. Emphasis mine).

  212. Trond Engen says

    Hunter-gatherer economies and lifestyles are so diverse that generalization almost becomes meaningless. We tend to think of the mobile never-keep-anything-you-can’t-carry-with-you type, but others were settled, using local resources and storing food for consumption between seasons. Ceramics was invented by hunter-gatherers. Others changed lifestyles between seasons, or kept the lifestyle but changed locations.

    As for the economy of the herders, I don’t claim anything, but it seems to have made it possible for a fairly small population to transform the landscape, the culture and the population of a continent in a few generations. I don’t think surplus of meat and dairy is enough to explain it, so what else did it yield?

  213. David Marjanović says

    How would “young females” be kept busy by childcare?: babies just lie in their cribs and gurgle most of the time — it’s like leaving crops to grow.

    You don’t have any younger siblings, do you.

    I don’t think surplus of meat and dairy is enough to explain it, so what else did it yield?

    Prestige and the plague.

  214. Stu Clayton says

    How would “young females” be kept busy by childcare?: babies just lie in their cribs and gurgle most of the time — it’s like leaving crops to grow.

    When ignorance is bliss, ’tis easy to be wise.

    A more charitable take would be that this view of childcare originates in a childhood of neglect. As Topsy replied when asked where she came from: “I ’spect I growed. Don’t think nobody never made me.”

    Circumspice, dude. Women break their balls taking care of kids, wherever you look. Even I know this, who never bothered about families or women.

  215. I don’t know any of the vast anthropological literature about family planning in various societies around the world, and how family size correlates with this or that mode of living, but certainly modern Europeans didn’t invent birth control. The contrast between survival and population pressure, or between mouths to feed and helping hands (and social prestige) is not a subtle thing.

  216. You don’t have any younger siblings, do you.

    Yes, I’m the eldest of 5. Also have lots of cousins. Elder siblings/grandparents do the childcare as much as mothers.

    “parents spend 50 hours per child per week in child care [averaged] over the first twelve years of the child’s life”

    So you’re suggesting my parents spent 250 hours per week? There aren’t that many hours in a week after allowing time to sleep (let alone my father’s mere 40-hour office job). Your numbers don’t pass the sniff test.

    And anyway that’s not a hunter-gatherer economy/is nuclear families. The point of the research (that I can’t now find) was that h-g’s had an easier life than in modern industrial economies. Who needs ballet lessons and soccer practice?

  217. I’m certainly not siding with antc’s “It’s like watchin’ the corn get taller.” At the same time, Schoonbrodt’s post-millennial American parents are out at the point where the bell curve flatlined in spending 50 hours a week on the kids *till they’re twelve.* That was my generation’s tv budget while mom and dad did other stuff.

    I still get texts about “playdates” for my 11 year old. I wanna be like “Why don’t you tell your kid to knock on our door every day at 3:45 like normal people?” But that’s not normal anymore.

  218. David Marjanović says

    Elder siblings/grandparents do the childcare as much as mothers.

    That’s a very different statement from “babies just lie in their cribs and gurgle most of the time — it’s like leaving crops to grow.”

  219. Trond Engen says

    Back to the horseriding study: I didn’t notice that a skeleton from Hungary with four markers of horseriding is very early, directly dated and supported by context to ~4300 cal. BCE, and believed on independent grounds to be an immigrant from the steppe to the Tisza culture. That’s centuries before the still inconclusive archaeological evidence of horseriding from the Botai culture, about a millennium before the Yamnaya expansion, and even longer before any significant increase in horse remains. It’s a single instance (of riding, not of early immigration from the steppe), but if it bears out it’s wild. Wild and puzzling. Wild and puzzling and maybe not unreasonable at all.

    Let’s say the horse was used by steppe dwellers for long-distance journeys (trade? diplomacy? exploration?). That might explain how such a large portion of the Western steppe developed into a single culture in the fifth millennium BCE, but it could still have taken centuries of selective breeding before it was useful for managing large herds of cattle and sheep, allowing (or forcing) rapid and sweeping expansion across and out of the Steppe.

    Or it could indicate that the full story of interaction between steppe nomads and settled populations in the formation of the Yamnaya culture is not yet told.

  220. Trond Engen says

    I was hoping for Dmitry to say something about this, but here we are. Disclaimers apply.

    There’s a preprint on BioRxiv introducing a new tool for identifying close relatives in ancient DNA. The paper is partly a proof of concept, but since the samples they used are ancient Eurasian DNA, it also purports to add insights into the great migrations around 3000 BCE.

    Ringbauer et al (2023): ancIBD – Screening for identity by descent segments in human ancient DNA

    Abstract

    Long DNA sequences shared between two individuals, known as Identical by descent (IBD) segments, are a powerful signal for identifying close and distant biological relatives because they only arise when the pair shares a recent common ancestor. Existing methods to call IBD segments between present-day genomes cannot be straightforwardly applied to ancient DNA data (aDNA) due to typically low coverage and high genotyping error rates. We present ancIBD, a method to identify IBD segments for human aDNA data implemented as a Python package. Our approach is based on a Hidden Markov Model, using as input genotype probabilities imputed based on a modern reference panel of genomic variation. Through simulation and downsampling experiments, we demonstrate that ancIBD robustly identifies IBD segments longer than 8 centimorgan for aDNA data with at least either 0.25x average whole-genome sequencing (WGS) coverage depth or at least 1x average depth for in-solution enrichment experiments targeting a widely used aDNA SNP set (‘1240k’). This application range allows us to screen a substantial fraction of the aDNA record for IBD segments and we showcase two downstream applications. First, leveraging the fact that biological relatives up to the sixth degree are expected to share multiple long IBD segments, we identify relatives between 10,156 ancient Eurasian individuals and document evidence of long-distance migration, for example by identifying a pair of two approximately fifth-degree relatives who were buried 1410km apart in Central Asia 5000 years ago. Second, by applying ancIBD, we reveal new details regarding the spread of ancestry related to Steppe pastoralists into Europe starting 5000 years ago. 
We find that the first individuals in Central and Northern Europe carrying high amounts of Steppe-ancestry, associated with the Corded Ware culture, share high rates of long IBD (12-25 cM) with Yamnaya herders of the Pontic-Caspian steppe, signaling a strong bottleneck and a recent biological connection on the order of only few hundred years, providing evidence that the Yamnaya themselves are a main source of Steppe ancestry in Corded Ware people. We also detect elevated sharing of long IBD segments between Corded Ware individuals and people associated with the Globular Amphora culture (GAC) from Poland and Ukraine, who were Copper Age farmers not yet carrying Steppe-like ancestry. These IBD links appear for all Corded Ware groups in our analysis, indicating that individuals related to GAC contexts must have had a major demographic impact early on in the genetic admixtures giving rise to various Corded Ware groups across Europe. These results show that detecting IBD segments in aDNA can generate new insights both on a small scale, relevant to understanding the life stories of people, and on the macroscale, relevant to large-scale cultural-historical events.

    […]

    Discussion

    […]

    Detecting IBD segments in modern DNA has yielded unprecedented insight into the recent demographic structure of present-day populations, allowing researchers to infer population size dynamics [Browning and Browning, 2015, Al-Asadi et al., 2019], genealogical connections between various groups of people [Ralph and Coop, 2013, Han et al., 2017, Nait Saada et al., 2020], and the geographic scale of individual mobility [Ringbauer et al., 2017, Al-Asadi et al., 2019]. In principle, such analysis can also be applied to ancient DNA. It is particularly promising that the number of sample pairs that can be screened for IBD grows quadratically with sample size. The rapidly growing aDNA record together with this even quicker growth of pairs that can be screened for IBD will provide aDNA researchers with a powerful new way to address demographic questions about the human past. We believe that the method to detect IBD in aDNA presented here is only a first step towards creating a new generation of demographic inference tools, giving insights into the human past at an unprecedented fine scale.

    In the case of Corded Ware and descendants, the paper claims to settle the question of descent from Yamnaya itself rather than some Para-Yamnaya population. 75% of the genetic material is from a small group of people with close relatives buried in kurgans on the steppe. The remaining 25% is visibly recent ancestry from Globular Amphora. Both ancestries are evenly distributed in all descendant groups.

    As usual this raises (or leaves) as many questions as it answers. Here are two:

    The source of the R1a Y chromosome is still not found, and there aren’t many places left for it to hide. Where did it come from?

    The even distribution of Globular Amphora ancestry in Corded Ware groups from Sweden to Russia is hinting at a systematic, maybe even formalized, process of merger at the onset of the northern expansion. How did that play out?

  221. David Marjanović says

    Interesting indeed.

    10,156 ancient Eurasian individuals

    That’s… a lot.

  222. Trond Engen says

    Yes, it’s pretty much every ancient Eurasian individual with enough coverage. In this study they just looked at the Yamnaya expansion, but there must be much more in there. And as they say, the power increases quadratically with the number of genomes, so this is going to give increasingly sharp pictures of large scale movements and small scale interactions. The power would also increase with the completeness of the genomes, which is likely to improve as well, technology doing what technology does.
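    The quadratic growth is easy to put numbers on. A minimal sketch, assuming every unordered pair of genomes gets screened exactly once (the function name is made up for illustration and is not part of ancIBD):

```python
from math import comb


def screenable_pairs(n_genomes: int) -> int:
    """Number of unordered pairs among n_genomes samples, i.e. n * (n - 1) / 2."""
    return comb(n_genomes, 2)


if __name__ == "__main__":
    # 10,156 is the sample count quoted from the abstract; the other
    # two values are arbitrary and only there to show the scaling.
    for n in (1_000, 10_156, 20_312):
        print(f"{n:>6} genomes -> {screenable_pairs(n):>11,} pairs")
```

    With the 10,156 individuals from the abstract that comes to roughly 51.6 million pairs, and doubling the number of genomes roughly quadruples the number of pairs.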

    They say they can’t discern the direction of a relationship (e.g. grandparent from grandchild), just identify it by genetic distance, but I wonder about that. With increasing coverage I think it should be feasible to identify up- and downstream genomes from splitting and recombination of long segments. But I’m no specialist at all.

  223. I don’t like the abbreviation “aDNA.”

  224. Trond Engen says

    Neither do I, but for some reason nobody ever asks me before inventing terminology in a field where I have no business interfering.

    Well, ancIBD is even worse. It’s especially jarring as the first word in the title of the paper. (I put that “Well,” in front just to avoid it.)

  225. In Russian I usually hear “Battle Axe” (культура боевых топоров) rather than “Corded Ware” (культура шнуровидной керамики), e.g. in the context of Fatyanovo culture in my part of Russia.

    But English WP uses Battle Axe and Boat Axe synonymously (and the corresponding page in Russian WP is культура ладьевидных топоров, that’s Boat Axe). Is WP wrong?

  226. ancIBD is a Python package (but an ugly name).

    aDNA in the context of mRNA, tRNA etc. looks very weird…

  227. Trond Engen says

    Yes, it’s stridsøkskulturen in Norway too, while Denmark uses enkeltgravskulturen “the Single Grave Culture”.

    I’m not sure of the difference between “battle axe” and “boat axe”. My impression is that the latter is used east of the Baltic and that the words represent slightly different designs of essentially the same object.

  228. I am too tired to write the setup properly, but there’s a joke here about the German government research bureau responsible for studying the long-term maternal history of human populations. The punchline is “amtDNAamt.”

  229. stridsøkskulturen
    I first parsed that as strid-søks-kulturen and internally translated that as “battle-socks culture”. My brain plays tricks like that on me sometimes.
    In German: Streitaxtkultur, which according to WP is a now outdated designation for Schnurkeramische Kultur. It was still frequently used at least in popular books on prehistory when I became interested in the topic almost 50 years ago.

  230. David Marjanović says

    And before “pots, not people”, it was often Streitaxtleute.

    Anyway, I’ve seen the equivalence stated, but I’ve also seen Streitaxtkultur claimed to be a subset of Schnurkeramik(kultur). I’m getting the impression archeological taxonomy is as messy as rank-based biological taxonomy.

  231. Trond Engen says

    Stridsøksfolket up here. I think it has made a renaissance in this new era of peoples, not pots.

    Snorkjeramisk does exist, but it’s used more for the continental forms and is not commonly known. Tellingly, there’s no article about den snorkjeramiske kulturen on wp.no, and the article on stridsøkskulturen confuses it with den båndkjeramiske kulturen “Linear Pottery”.

  232. @Trond, does “east of the Baltic” refer to researchers or artefacts located east of Baltic?

    And do you use in Norway “battle axe” narrowly (to refer exclusively to a Scandinavian culture as in en.WP) or widely for Corded Ware in general?

    I am quite confused by en.WP, because I am only familiar with the wider sense…

  233. John Cowan says

    I still get texts about “playdates” for my 11 year old.

    I solve this problem by ruthlessly ignoring all texts (except those containing authentication codes I have requested, and even then I’ll take them by robot-voice if that’s an option) and the great majority of phone calls sent to my cell phone. I scrupulously answer calls to my landline, read my email, and come to the door when my doorbell is rung. I consider that that is quite sufficient.

    I wanna be like “Why don’t you tell your kid to knock on our door every day at 3:45 like normal people?”

    And then what? I’m not quite prepared to send my kid and your kid off to the park at age 11, half a mile (0.8 km) away, by themselves. So playdates, yes.

  234. “I’m not quite prepared to send my kid and your kid off to the park at age 11, half a mile (0.8 km) away, by themselves.”

    I think Ryan was complaining about this bit, not about texts per se.

    Isn’t it strange how, at least in the US, crime rates have been going down since about 1991, but it’s become much less normal to send your ten-year-old with their friend to a park half a mile away?

  235. in the US, crime rates have been going down since about 1991

    Might we need to drill into those figures somewhat? My perception (from outside U.S.) is that random gun crime has been increasing. Also the number of people who are off their heads on drugs, and liable to lash out with no warning. (Those two categories maybe overlap.) Indeed aren’t they the sort of people more likely to lurk in parks?

    Perhaps the crime rates in parks have been going down (if they have) only because parents no longer send their kids unattended?

  236. David Marjanović says

    I’m not quite prepared to send my kid and your kid off to the park at age 11, half a mile (0.8 km) away, by themselves.

    …Back in the Cold War, when I was 6, I walked to school for a quarter of an hour every day on my own, and then back the same way. That was by no means unusual. ~:-|

    My perception (from outside U.S.) is that random gun crime has been increasing.

    Define “random”, and then look at the statistics – I think they’re all accessible from Wikipedia. There’s been a strange spike recently because somehow the pandemic made Americans and only Americans weirdly aggressive, but other than that it’s been going down for most definitions of “random” I can imagine right now. (I have to run, sorry.)

  237. look at the statistics

    Here (first graph) shows a steady increase in gun homicides since the 90’s — which was FJ’s start point. Yes there’s a spike over the pandemic, but the increase was there already. And of course U.S. gun homicide rates far outpace any other OECD country (later graph). There is a DoJ study saying it decreased 1993~1998 — which seems to be the nadir, just before that wp graph.

    By “random” I meant to exclude police/criminals killing criminals/police; family/close relationship violence where the victim is known to the shooter; but include recent school shootings where the victims might have vaguely ‘known of’ the shooter — since we’re talking about kids walking to school.

    Can’t find a specific analysis on wp. ‘Mass shootings’ graphed here vary wildly but show an upward trend since the 90’s. (spike in 2017~2018). BBC agrees.

    Figures for drug-related gun violence will be mostly police- or criminal-on-criminal, not off-their-head user on rando.

    it’s been going down for most definitions of “random” I can imagine right now

    Doesn’t look like it to me.

    somehow the pandemic made Americans and only Americans weirdly aggressive

    Oh, no. In NZ we had some very nasty violent incidents. Vigilantes setting up road blocks; other vigilantes demolishing them; stand-over tactics. And an occupation of Parliament’s grounds very much mimicking the assault on the Capitol. The difference was no guns. Although gun ownership in NZ is high in rural areas — of course I mean high relative to civilised countries.

    (Yeah from about age 10 I used to walk home about 40 minutes through a fairly remote bit of parkland. Once got thumped about the head by one of a bunch of ‘traveller’ kids — probably I was being an unbearable toff.)

  238. Random mass shootings, tragic as they are, are still so rare as to be statistically meaningless. If your kids are going to a park, the danger of being killed by a car on the way there is astronomically greater than of their being assaulted or abducted by a stranger. (Abducted by your non-custodial ex may be another story.)

  239. Lars Mathiesen (he/him/his) says

    It may even be that there is a greater risk of something bad happening if you drive the kid there than if you let her walk. (I’ll give better odds on that being true in Denmark where we build roads with sidewalks, than in the US).

  240. When we were kids, we used to play outside for hours; nobody controlled where we were between finishing our homework and supper. But it wasn’t without dangers; my little brother managed to almost drown twice.

  241. David Marjanović says

    look at the statistics

    Oh! My knowledge was out of date since 2015.

    When we were kids, we used to play outside for hours; nobody controlled where we were between finishing our homework and supper.

    Same here, 90s.

    Got stuck on a raft on a lake once (yes, in the city of Vienna).

  242. John Cowan says

    Isn’t it strange how, at least in the US, crime rates have been going down since about 1991, but it’s become much less normal to send your ten-year-old with their friend to a park half a mile away?

    Danger has gone way down; fear has gone way way up.

    And of course U.S. gun homicide rates far outpace any other OECD country

    Talking about “U.S.” homicide rates[*] in this context is as meaningless as talking about “European (including Russia)” homicide rates. In NYC (which is relevant), there were 1,927 homicides in 1993, 633 in 1998, 597 in 2003, 523 in 2008, 335 in 2013, 295 in 2018, 319 in 2019, 462 in 2020, 488 in 2021, and 433 in 2022. Meanwhile, the population rose from 7.3 million in 1993 to 8.4 million in 2018 (the bottom of the homicide curve) to 8.8 million, meaning that homicides per 100,000 (the standard statistical measure) went from 26.4 to 3.5 to 4.9. Hardly a crime wave.

    [*] Knife homicide kills you just as dead.
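    For anyone who wants to check the per-100,000 arithmetic, here is a minimal sketch using only the counts and rounded populations quoted above. Pairing the 8.8 million population with the 2022 count is an assumption on my part, and the function name is purely illustrative.

```python
def rate_per_100k(homicides: int, population: int) -> float:
    """Homicides per 100,000 residents."""
    return homicides / population * 100_000


# (homicides, approximate population) as quoted in the comment.
# Assumption: the 8.8 million figure belongs with the 2022 count.
NYC = {
    1993: (1_927, 7_300_000),
    2018: (295, 8_400_000),
    2022: (433, 8_800_000),
}

if __name__ == "__main__":
    for year, (homicides, population) in NYC.items():
        rate = rate_per_100k(homicides, population)
        print(f"{year}: {rate:.1f} per 100,000")
```

    Rounded to one decimal, the three years come out at 26.4, 3.5 and 4.9, matching the figures in the comment.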

    …Back in the Cold War, when I was 6, I walked to school for a quarter of an hour every day on my own, and then back the same way. That was by no means unusual.

    So did I. But I wouldn’t have sent my daughter at that age (too irresponsible) or my grandson (his mother was too fearful).

    Figures for drug-related gun violence will be mostly police- or criminal-on-criminal, not off-their-head user on rando.

    Bullets often miss. I don’t live in a “nice middle-class neighborhood”.

    If your kids are going to a park, the danger of being killed by a car on the way there is astronomically greater than of their being assaulted or abducted by a stranger.

    Sure. But hanging out there together for hours would be another matter, which is why David’s point isn’t so relevant. And the main concern isn’t major crimes, it’s getting into one of various kinds of trouble that might or might not involve other people. I had at one point to rescue Dorian from an (illegally climbed) tree because he couldn’t get down by himself.

    Denmark where we build roads with sidewalks

    Here in St. Petersburg West, we do too.

    But it wasn’t without dangers; my little brother managed to almost drown twice.

    Just so.

  243. homicides per 100,000 (the standard statistical measure)

    Yes, that’s the measure I was using — or rather wp was. So the growth of population in NYC with a fall in that gun rate just means the rest of U.S. has got less safe. The OP didn’t specify where in U.S.

    Hardly a crime wave.

    If I might hold up a mirror to our U.S. friends:

    By any standards in OECD countries (excluding Russia [**], as you say), those figures are a crime wave. That nobody in U.S. cares enough to do much about it doesn’t stop it being a crime wave. We in NZ had one mass shooting of around 50 people by a ‘rando’. Within a month there had been dramatic changes to the law. Somebody I know with an all-official gun licence had the police all over him.

    Yes, it’s difficult to understand the objective risk of random shooting vs car accident vs etc. But since it’s a risk I don’t need to take, and my most unpleasant experiences in international travel were crossing the U.S. (not even trying to enter the country) from gun-toting border security, I now travel Europe to/from NZ only via Asia. Thank you.

    [**] I can’t see any grounds these days to count Russia in with Europe for any statistics.

  244. David Marjanović says

    That nobody in U.S. cares enough to do much about it

    Oh, they care. They just insist on canned solutions that are all blocked by the opposing party because all existing proposals for solutions have been completely politicized.

  245. They just insist on canned solutions that are all blocked by the opposing party because all existing proposals for solutions have been completely politicized.

    The second part of this sentence is correct, but I don’t know what you mean by the first: what “canned solutions”? This has been a problem for decades, and over that time pretty much every solution you could think of has been put forward. The problem does not lie with the people proposing solutions but with the people who will resist to the death any idea that even threatens to limit full access to guns for everyone everywhere. Please do not engage in both-sidesism.

  246. John Cowan says

    If I might hold up a mirror to our U.S. friends:

    This is no news to us, and the reasons are clear (Pinker spells them out with statistics). It’s a path-dependent result: in Europe, the pacification of society came before liberalization, whereas in the U.S. the order was reversed, so that private possession of weapons came to be seen as itself a liberty.

    Within a month there had been dramatic changes to the law.

    You inherited European notions of what counts as liberty. We didn’t. (You also didn’t have the historical experience, either in Enzed or in Britain, of having to defend your liberties by force.) Note that I am not defending the behavior of either my country or its citizens here, just pointing out the historical basis for their beliefs and consequent behavior.

    But since it’s a risk I don’t need to take, and my most unpleasant experiences in international travel were crossing the U.S. (not even trying to enter the country) from gun-toting border security

    Believe me, I have had the same experiences (except that I expected them and know how to behave in the presence of gun-toting etc., whereas you perhaps do not). So I don’t go anywhere, which suits me fine in the first place. I am Dennett’s Oxford lectern.

    what “canned solutions”? This has been a problem for decades, and over that time pretty much every solution you could think of has been put forward

    I think you and David M are in violent agreement: the solutions are already on the shelf, where they will remain for the reasons I gave above.

  247. “They just insist on canned solutions” sounds deeply pejorative to me, as if “they” were ignoring innovative solutions that DM is presumably willing to offer them. If he meant it the way you suggest, I’m sure he’ll say so.

  248. You inherited European notions of what counts as liberty. We didn’t.

    The UK (and most European states) had a civil war/revolution with actual shooting guns. I rather think it’s the Europeans who brought guns to the U.S. NZ (and especially Aus) suppressed native uprisings with gun violence. In our ‘Wild West’/pre-establishing Colonial government histories there were no gun controls. There is to this day high gun (rifles, not handguns) ownership in rural areas for killing vermin.

    So your “We didn’t” is a description, not an explanation. What different notions of liberty did Europeans bring to U.S., and why weren’t the European deaths in religious and civil wars remembered as a terrible warning?

    (You also didn’t have the historical experience, either in Enzed or in Britain, of having to defend your liberties by force.)

    English Civil War. (actually two of them) Persecution of Huguenots, French Revolution/The Terror.

    NZ Maori wars aka ‘Land wars’. NZ didn’t have white-on-white ‘wars’ as such, but there was plenty of gun violence against the colonial administrations trying to ‘regulate’ — that is, tax — sealers and gold prospectors.

  249. J.W. Brewer says

    There is quite considerable regional variation in the U.S. both in absolute homicide rate and in historical trends in homicide rate, including not just urban v. non-urban variation but metropolis-A v. metropolis-B variation. In New York City, the rate came down further from the early Nineties peak than the national average and has rebounded in recent years less dramatically, although still enough to alarm residents who got an unwelcome reminder that positive historical trends sometimes stop trending in the positive direction. Indeed, the NYC homicide rate has recently been below that of some of the more violent Canadian cities (typically out on the Prairies: Winnipeg & Regina etc.) In certain other large U.S. cities (e.g. Chicago and Philadelphia), by contrast, the decline from the early-Nineties peaks was shallower and the more recent rebound sufficient to wipe out most/all of the gains.

    High-level American-distinctiveness theories that purport to explain why America is more violent than Luxembourg or whatever often lack explanatory power for intra-U.S. variation.

    BTW, foreigners familiar with the classic song “Streets of Laredo” about cowboy-on-cowboy gun violence may be surprised to learn that Laredo, Tex. is one of the less violent cities of its size in the U.S. these days. https://www.lmtonline.com/local/article/Laredo-ranks-13th-safest-cities-in-2022-improves-17084308.php

    Indeed, although stereotypes about Texas-at-large have considerable global reach, there is quite considerable intra-Texan variation in this area among the largest cities – the homicide rate in Houston is more than quadruple that of El Paso (as of a few years ago, I could probably find stats a year or two more recent if I did more work than I have time for right now); that in Dallas almost quintuple that of Austin. That’s four of the six most populous cities in the state – the homicide rates for the remaining two, Ft. Worth and San Antonio, are sort of roughly in the middle of those extremes.

  250. David Marjanović says

    I’m too tired to be sure what exactly I meant by “canned solutions”, I’ll be back tomorrow. 🙂

    You inherited European notions of what counts as liberty. We didn’t.

    It’s not that simple. In Germany, on many highways, there is no speed limit, and that’s treated as a freedom issue more important than people’s lives just like guns in America. A general speed limit was introduced during the oil crisis, but abolished again very soon under the alliterating slogan Freie Bürger fordern freie Fahrt “free citizens demand free passage/driving” which was pushed especially loudly by the Liberal Party, i.e. the closest thing Germany has to libertarians. All more recent attempts to introduce a general speed limit faded out very, very quickly.

    There’s a stretch of highway close to Berlin that was so deadly a court found it legal to introduce a speed limit there; that was done, and now the death rate there is lower than the threshold the court set, so a court (I think the same one) found that the speed limit must be abolished again (and that’s about to happen, as far as I understand).

    As it happens, the scarcity of speed limits doesn’t kill as many people as the scarcity of gun regulations in the US (either in total or per capita); but still.

    There are even, sort of, equivalents to good guys with a gun. I’ve heard of an accident where a car crashed into the side of a truck at Ludicrous Speed, possibly 280 km/h. The people in the car ducked in time, and so they survived: the top half of the car was cleanly sheared off, the rest passed under the truck and emerged on the other side slowed down but without further damage. If the car had been slower, the whole car would have been crumpled at the edge of the truck, with the people in it. (…Or the crash wouldn’t have happened in the first place…)

    foreigners familiar with the classic song “Streets of Laredo”

    Never heard of it, FWTW.

  251. Never heard of it,

    I suggest you stay that way. Even if I had heard of it, I’d vehemently deny all knowledge. Horribly schmaltzy. Cowboy in white linen? Huh? Seems terribly impractical.

    … about cowboy-on-cowboy gun violence

    There’s only oblique hints of that in the actual words — at least in the version I found. (Gun violence, yes. Cause unspecified.) Was this a well-known incident ‘immortalised in song’? “primarily descended from an Irish folk song” says wp. The Irish has cause of injury unspecified “All by a young woman” (sounds like he knew he deserved it)/no mention of gun violence.

  252. John Cowan says

    If you look at the different versions of the song, beginning with “The Unfortunate Rake” (Wikipedia, Mudcat), it’s clear that the woman is a prostitute and the hero of the tale is dying of syphilis. There are dozens of versions. Here’s one of my favorites (only the first five lines actually belong to “Streets of Laredo”):

    As I walked out in the streets of Laredo,
    As I walked out in Laredo one day,
    I spied a poor cowboy wrapped up in white linen,
    Wrapped up in white linen as cold as the clay.

    I see by your outfit that you are a cowboy.
    You see by my outfit that I am one too.
    We see by our outfits that we are both cowboys.
    Now you get an outfit an’ you can be a cowboy too.

    Now you got no outfit, so you’re not a cowboy.
    I got two outfits. You take one of mine.
    Now you got an outfit, and I got an outfit,
    And in our outfits now don’t we look fine?

    You fit in my outfit. I fit in your outfit.
    We fit in our outfits. We’re outfitted fine.
    They’re fine-fitted outfits. They’re out-fitted fine-fits.
    They’re fit-outed b-b-l-l-r-r-b-b. I’m fit to be tied.

    As I walked out in the streets of Laredo,
    As I walked out in Laredo one day,
    I spied a young outfit wrapped up in his cowboy,
    Wrapped up in his outfit, so I let him lay.

  253. @J.W. Brewer: It is especially ironic that for some years now, El Paso has been rated as one of the safest large cities in the country, while it’s right across from Ciudad Juarez, one of the most dangerous cities in the world.

    @AntC: The linen in “The Cowboy’s Lament” is bandaging. It’s really a wonderful cowboy song. My favorite recording is by Burl Ives, even though he doesn’t sing the verses that I like best.* It’s also a very influential song, much better known than the antecedent John Cowan mentioned, about leprosy instead.** References to its lyrics crop up in some possibly unexpected places. The chorus to “The Green Fields of France” by Eric Bogle runs:

    Did they beat the drums slowly?
    Did they sound the fife lowly?
    Did the rifles fire o’er ye as they lowered you down?
    Did the bugles sing “The Last Post” in chorus?
    Did the pipes play “The Flowers of the Forest”?

    Moreover, many other performers, apparently unsatisfied with the extent of Bogle’s allusion, change the third line to a variation of, “Did they play the death march as they carried you along?” corresponding to the next line from “The Cowboy’s Lament.”

    * It’s not Ives’ best cowboy song either. That’s “Ghost Riders in the Sky.”

    ** “The Unfortunate Rake” has come to be associated with a St. James infirmary in New Orleans (where there is a St. James parish). However, that hospital apparently never existed. There may have been some confusion with a similarly named leprosarium in London.

  254. J.W. Brewer says

    @Brett: Similarly, the homicide rate in Nuevo Laredo, Tamps., right across the river from Laredo, Tex. and with similar demographics in many dimensions, has recently been orders of magnitude higher, as discussed here: https://www.wilsoncenter.org/sites/default/files/media/documents/article/Laredo_vs_Nuevo_Laredo_Dudley.pdf

  255. @J.W. Brewer: It occurs to me that for the Mexican (drug*) cartels, cities close to the northern border are probably more attractive territory to control than similarly sized cities farther away from the American border. In fact, the proximity to America would seemingly be valuable for two separate reasons. The first is simply related to cartel business; controlling the area around a concentration of border crossings should be extremely lucrative for a smuggling enterprise. The second reason is that it gives the members of the cartel easy access to American consumer goods and to America itself. El Chapo ensured that some of his children were born in America, to get them American citizenship.

    * I was reading something today that claimed that the avocado industry in Mexico (or at least the transportation and resale parts of the supply chain) is also controlled by criminal cartels. However, it was not clear whether these were separate operations from the drug cartels.

  256. David Marjanović says

    …and if yes, how much longer they managed to stay separate operations from the drug cartels.

  257. most versions of Streets of Laredo i’ve heard include a line specifying “i’m shot in the breast and i’m dying today”. i’m not sure whether that’s a defining element of the cowboy branch of the Unfortunate Rake song family, but to my ear it’s part of what distinguishes it from Saint James Infirmary and other cousins from the gambler/reprobate branch, where i think the cause of death is generally not explicitly named.

    thanks for the outfit-centered version, JC! i’d never run across it, and it’s excellent.

  258. There was a 1960 Folkways Records album, composed entirely of songs from “The Unfortunate Rake” family. They seem to represent a pretty comprehensive cross section of the traditional repertoire, and there are also some more recent parodic* versions included.

    * The most famous parody is probably the Smothers Brothers’ version of “Streets of Laredo.” I’m not sure whether the lyrics John Cowan quotes are an elaboration of the Smothers Brothers’ words, or whether their version is an abridgment of the version he quotes.

  259. ktschwarz says

    Brett: Aha! That’s where this comes from, which I heard once on the radio decades ago:

    When I was a-skiing the hills of Sun Valley
    As I was a-skiing Old Baldy one day
    I spied a young skier all wrapped in alpaca
    All wrapped in alpaca and cold as der schnee

    Mystery solved!

  260. Ancient human DNA extracted from 20,000-year-old deer tooth pendant found in Denisova Cave:

    Scraps of ancient DNA coaxed out of a deer tooth pendant show it likely hung around the neck of a woman or girl around 20,000 years ago.

    We don’t know what she looked like, but she was related to a population of humans further east of Denisova Cave in southern Siberia, in which the pendant was unearthed.

    These insights into the pendant-wearer were made possible by a new technique — reported today in Nature — that can non-destructively extract ancient human DNA from objects made from porous materials like bone or teeth.

    The Nature article, Ancient human DNA recovered from a Palaeolithic pendant.

  261. From Denisova but not Denisovan

  262. David Marjanović says

    These insights into the pendant-wearer were made possible by a new technique — reported today in Nature — that can non-destructively extract ancient human DNA from objects made from porous materials like bone or teeth.

    I remember when this was thinkable but universally considered infeasible several times over.

    lo! these onescore years ago

  263. Dmitry Pruss says

    Very little DNA, just about 20,000 tiny snippets, but it was enough to determine sex (by the X-chromosome fraction) and to see close similarity of the genetic profile to a few other better-studied ancient Siberian humans.
    The authors tried the same trick on the artifacts from the older excavations but were stumped by the abundant modern DNA contamination from the digs where the archaeologists didn’t yet use gloves or masks. They also tried more modern samples but extracted just a handful of short DNA molecules, not enough to conclude anything.
    Perhaps the pendant success will remain a singular exception…
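
    A rough sketch of how sex can be read off the X-chromosome fraction, for anyone curious about the arithmetic. This is not the paper's pipeline: the chromosome sizes are rounded figures, the read counts and the 0.75 cutoff are made up for illustration, and mappability and contamination are ignored. The idea is only that an XX individual should show about the same read density on X as on the autosomes, while an XY individual should show about half.

```python
# Approximate human chromosome sizes in megabases (rounded; illustration only).
X_LENGTH_MB = 156.0
AUTOSOME_LENGTH_MB = 2875.0


def x_to_autosome_ratio(x_reads: int, autosomal_reads: int) -> float:
    """Read density on X divided by read density on the autosomes."""
    if x_reads < 0 or autosomal_reads <= 0:
        raise ValueError("read counts must be non-negative, autosomal count positive")
    x_rate = x_reads / X_LENGTH_MB
    autosomal_rate = autosomal_reads / AUTOSOME_LENGTH_MB
    return x_rate / autosomal_rate


def infer_sex(x_reads: int, autosomal_reads: int, cutoff: float = 0.75) -> str:
    """Classify as XX- or XY-consistent; the cutoff is a hypothetical midpoint."""
    ratio = x_to_autosome_ratio(x_reads, autosomal_reads)
    return "XX-consistent" if ratio >= cutoff else "XY-consistent"


if __name__ == "__main__":
    # Two invented samples of 20,000 mapped fragments each.
    for x_reads, auto_reads in [(1050, 18950), (540, 19460)]:
        ratio = x_to_autosome_ratio(x_reads, auto_reads)
        print(x_reads, auto_reads, round(ratio, 2), infer_sex(x_reads, auto_reads))
```

    With only about 20,000 fragments, roughly a thousand would be expected on X for an XX individual, which is why so little DNA can still be enough for this particular call.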

  264. David Marjanović says

    but it was enough to determine sex

    And to determine it’s from a human and not from the deer!

  265. They got deer DNA too. Sometimes even the quality of species ID is marginal, like I recently read a paper on the horses of Gaya confederation, an early Korean state, where they proudly reported that DNA confirms that the teeth from the royal graves belonged to horses. It was based on a lone DNA fragment, and it still had three sequencing errors separating it from the clean match with the horse sequence…
    https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4435257

  266. Trond Engen says

    Another interesting study (and again thanks to Dmitry):

    Sandra Penske et al: Early contact between late farming and pastoralist societies in southeastern Europe Nature (2023)

    Abstract
    Archaeogenetic studies have described two main genetic turnover events in prehistoric western Eurasia: one associated with the spread of farming and a sedentary lifestyle starting around 7000–6000 BC and a second with the expansion of pastoralist groups from the Eurasian steppes starting around 3300 BC. The period between these events saw new economies emerging on the basis of key innovations, including metallurgy, wheel and wagon and horse domestication. However, what happened between the demise of the Copper Age settlements around 4250 BC and the expansion of pastoralists remains poorly understood. To address this question, we analysed genome-wide data from 135 ancient individuals from the contact zone between southeastern Europe and the northwestern Black Sea region spanning this critical time period. While we observe genetic continuity between Neolithic and Copper Age groups from major sites in the same region, from around 4500 BC on, groups from the northwestern Black Sea region carried varying amounts of mixed ancestries derived from Copper Age groups and those from the forest/steppe zones, indicating genetic and cultural contact over a period of around 1,000 years earlier than anticipated. We propose that the transfer of critical innovations between farmers and transitional foragers/herders from different ecogeographic zones during this early contact was integral to the formation, rise and expansion of pastoralist groups around 3300 BC.

    […]

    Discussion
    The genetic homogeneity observed in and across the four CA sites (PIE, YUN, PTK and VAR) of the fifth millennium BC matches the cultural homogeneity of the archaeological records and suggests an extended period of a relative stable sociopolitical network and absence of large-scale cultural and genetic transformations. Shared shorter IBD tracts between sites are consistent with the transregional connectivity visible in the material culture. We can only speculate about the reasons that led to decreasing settlement densities at the end of the CA. Conflict arising from an early expansion of supposedly ‘Indo-European’ groups from the steppe, an idea that was put forward by M. Gimbutas, is possible but internal competition and strife between CA groups is equally likely. In fact, given the near-identical genetic ancestry profiles of SEE CA groups, we caution that genetic analyses would be blind to internal conflicts, causing the replacement of one CA group by another. Long-lasting droughts and forest fires or infectious diseases and ensuing epidemics are other factors that could deplete lands. Indeed, evidence for early forms of Yersinia pestis as old as 5,000 years has been reported and even further back in time for Salmonella enterica for individuals associated with transitional foraging and pastoralism. Despite the systematic screening of teeth, we found no evidence for pathogens among the CA individuals of the fifth and fourth millennium BC, apart from two individuals (YUN048 and VAR021), who were positive for the Hepatitis B virus (HBV)52, while individual VAR021 was also positive for Salmonella enterica.

    A principal finding from our study indicates early contact and admixture between CA farming groups from SEE and Eneolithic groups from the steppe zone in today’s southern Ukraine, possibly starting in the middle of the fifth millennium BC when settlement densities shifted further north, connecting the lower Danube region with the coastal steppe and Cucuteni–Trypillia groups of the forest–steppe. Archaeological evidence shows that the early CA Gumelniţa groups had already settled deep into the steppe zone by the mid-fifth millennium BC, introducing elements of a farming lifestyle but also carrying cultural influences from local HG groups. The succeeding Cernavodă I and Usatove archaeological cultures were heavily influenced by local CA cultures and surrounds. During the fourth millennium BC, the northwestern Pontic region experienced intensified contact with Steppe Eneolithic groups, while these in turn also had contact with groups in the North Caucasus, such as Maykop, all of which are mirrored by the genomic data presented here. Moreover, despite the close geographical proximity of the Ukrainian sites studied, we were able to trace different admixture histories. Here, the heterogeneity of the individuals from the site Kartal stands out, which is located on the Danube delta at the northern end of the former distribution of the Chalcolithic Gumelniţa–Kodžadermen–Karanovo VI complex and thus represents the transformative nature and dynamics of the fourth millennium BC in action. By contrast, the more homogenous Majaky and Usatove groups, located north of the Dniester River, show that such assimilation processes had already occurred, suggesting that contact and exchange between transitional foragers and early pastoralist groups from the forest–steppe zone and non-local SEE farmer-associated groups had started already in the late fifth millennium BC. Moreover, variable cultural influences attested by the archaeological record are also traceable genetically. 
    We argue that livestock, innovations and technological advances were exchanged through these zones of interaction, which then led to the establishment of fully developed pastoralism in the steppe by the end of the fourth millennium BC. Gene flow from both contact zones into the steppe could also explain the small amounts of farmer-related ancestry in the emerging Yamnaya pastoralists, which differentiates them from the Steppe Eneolithic substrate and accounts for subtle geographical structure in the vastly expanding territory/range.

    The early admixture during the Eneolithic presented in this study appears to be local to the northwestern Black Sea region of the fourth millennium BC and did not affect the hinterland in SEE. In fact, EBA individuals from the fourth and third millennia BC from YUN and PIE do not show traces of steppe-like ancestry but instead a resurgence of HG ancestry observed widely in Europe during the fourth millennium BC. This indicates the presence of remnant HG groups in various non-farmed regions, for example, highlands and uplands or densely forested zones and wetlands and a mosaic of ancestries rather than a genetically uniform CA and EBA Europe.

    While only a few tell sites have been resettled by local and/or incoming groups who did not originate in the North Pontic region, we can trace the appearance of migrants from the steppe, clearly attributed to Yamnaya culturally and genetically, in the local time transect at Majaky but also at Boyanovo in the Bulgarian lowlands of the Thracian Plain. The subtle differences in genetic ancestries between these two when compared to different Yamnaya-associated groups account for their geographical locations and different stages of genetic and perhaps, cultural assimilation. Two outlier individuals from EBA YUN and BOY bear witness to occasional admixture between inhabitants of EBA tells and incoming steppe pastoralists. Ultimately, the third millennium BC form of ‘steppe’-ancestry is expected to have reached the Great Hungarian plain, from where it diversified and spread further west. The interaction between local and incoming groups in SEE did not result in archaeologically visible conflicts or a near-complete autosomal genetic turnover as observed in Britain or a replacement of the Y-chromosome lineages in the Iberian Peninsula.

    Further integrated archaeogenomic studies are needed to disentangle the dynamics at play around the Black Sea during the formative periods of the admixture clines demonstrated in this study. High-quality genome-wide data from the fifth and fourth millennia BC that allow the direct tracing of IBD blocks shared by contributing groups will hold the key to understanding the population history of West Eurasia.

    Long quote. I’ve tried to get my head around it and boil it down to a couple of paragraphs, but it’s not easy. Here’s an attempt, mixed with my own interpretations:

    1. About 4200 BCE the Early European Farmer cultures of the Lower Danube collapsed for unknown reasons and disappeared from the record. No invasion is visible archaeologically, but internal conflicts are possible. It could even have been a revolution of sorts, with the elite settlements being abandoned and the agricultural economy continuing without the central powers.

    2. The Cucuteni-Tripolye megacities appeared before 4000 BCE, and there was also a gradual settlement of the Steppe by agro-pastoralists. This population is essentially Balkan farmers mixed with local foragers.

    3. In the 4th millennium BCE a distinct population developed on the Pontic coast between roughly Danube and Dniepr. This population is Cucuteni-Tripolye shifted slightly towards Maykop and local foragers. The neighbouring (Eneolithic, Pre-Yamnaya) populations of Khvalynsk, Sredny Stog and Steppe Maykop still show very little European Farmer ancestry.

    4. The Pontic coast and Steppe populations stayed distinct for about a millennium, though both populations interacted with Maykop and there’s a slow stream of Steppe genes into the Pontic coast.

    5. Around 3000 BCE the Steppe and Pontic coast populations merged and Steppe culture entered the Danube Valley. The Balkan Yamnaya is not uniform R1-a or R1-b, maybe owing to the Pontic coast element.

    6. At the same time there was a resurgence of recorded Early Farmer ancestry in the Balkans, especially south of the Danube. I think this could be because farmers again established organized hierarchic societies under pressure from the incoming Yamnaya. The authors say that the resurging European Farmers show increased Forager ancestry that points to an unrecorded survival of indigenous groups, but it might perhaps also be due to a retraction south of the Danube by more northern and eastern farmer groups.

  267. David Marjanović says

    It could even have been a revolution of sorts, with the elite settlements being abandoned and the agricultural economy continuing without the central powers.

    Like the collapse of the Maya cities. Interesting idea…

  268. Trond Engen says

    Well, I probably owe it to the Maya. The Maya and the Cahokians and Graeber/Wengrow.

  269. Trond Engen says

    I meant to say that the trade in metals that made the Varna elite so spectacularly rich may have been seaborn. Hence the location in Varna.

    I meant to sigh that we really need ancient DNA from Crimea. Also from sunken vessels in the Black Sea.

  270. PlasticPaddy says

    Aphrodite was seaborn, but trade is usually seaborne. Ok also sigh, so someone was dictating a post😊

  271. Trond Engen says

    Nope. I’ve never dictated a comment yet. “Seaborn” was an error. Not sure if I misspelled or thought autocomplete would finish for me. “Meant to sigh” was carefully crafted in contrast to the first paragraph.

  272. David Marjanović says

    Also from sunken vessels in the Black Sea.

    Most of it has actual ocean floor and is some 3000 m deep, IIRC.

  273. Well, yes, sigh.

    But I wonder how volume of maritime trade changed over time.

    Somehow when you say “vessels” I imagine Greek or Phoenician ships from the Iron Age, bulk carriers loaded with grain, but what could prevent them from doing it 2000 years before that? The technology must be older…. (and I’m not even speaking about bronze, whose components were hardly widely available)

    Or are large ships a recent invention?

  274. Trond Engen: The older golden treasure burials were actually inland from Varna, around Provadia. As in the Varna necropolis is larger, but the Provadia one is older. The theory is that it was related to the salt trade with south of the mountains.

  275. January First-of-May says

    Or are large ships a recent invention?

    Not as late as the Iron Age, at least. The famous Uluburun shipwreck is from well into Late Bronze Age, and IIRC there are some less well preserved shipwrecks that are even older.

    And then there’s those semi-infamous supposedly-predynastic petroglyphs featuring long boats with lots of warriors…

  276. Aha: 10 tons of copper, 1 ton of tin, 1 ton of resin….

    …and “black cumin”. In this case nigella.

    PS: “The Dokos shipwreck is the oldest underwater shipwreck discovery known to archeologists.[1] The wreck has been dated to the second Proto-Helladic period, 2700–2200 BC.”

  277. Trond Engen says

    On petroglyphs and early Scandinavian sailing rowing paddling ships.

    Short summary: A new scientific consensus is emerging, saying that the petroglyphs depict real ship travel rather than mythological concepts. There are petroglyphs of really large vessels that are almost 4000 years old, and there are found paddles of the right type in Northern Norway that may be even older. One archaeologist is quoted as saying that Scandinavians with these vessels could have traded directly with the Mediterranean.

    Scandinavia got connected to the world trade fairly late. I would think the technology spread here from the south.

    Other interpretations are possible. The oldest depicted ships may have been foreign trading ships visiting regularly, and the petroglyphs may be expressions of a cargo cult. That doesn’t change the age or size of the ships and the trade.

  278. “I would think the technology spread here from the south.”

    I suppose, lack of connection is an argument.
    Spread “from the center to periphery” is but one pattern, which is not observed universally, and Scandinavia may have the right conditions…

    “cargo cult”
    Long ships came from the south and brought prestige consumption goods, silver and gold. And shining discs descended from the sky and brought women. And men began to think, and they were thinking three thousand years and one day. And three thousand years later the wisest of them stood up and said: let’s build a long ship and capture a flying saucer and go where the goods and women are brought from and take more.

    So there were two brothers, one went to where prestige consumption goods are brought from and conquered the land and was the king there, and the other went to Betelgeuse.

  279. David Marjanović says

    One archaeologist is quoted as saying that Scandinavians with these vessels could have traded directly with the Mediterranean.

    Certainly bolsters the idea that the thick layer of pre-Grimm Celtic loanwords in Proto-Germanic came in across the sea and not on foot.

  280. Don’t blame me, I chuckled but posted for your enjoyment. Supposedly PIE is now 2 thousand years older than it was yesterday?
    “Language trees with sampled ancestors support a hybrid model for the origin of Indo-European languages”
    https://www.science.org/doi/10.1126/science.abg0818

  281. Sigh. I haven’t seen the article, but just like AI: until they make the algorithm show its work, they are playing with their poo.

  282. Trond Engen says

    We find a median root age for Indo-European of ~8120 yr B.P. (95% highest posterior density: 6740 to 9610 yr B.P.). Our chronology is robust across a range of alternative phylogenetic models and sensitivity analyses that vary data subsets and other parameters. Indo-European had already diverged rapidly into multiple major branches by ~7000 yr B.P., without a coherent non-Anatolian core. Indo-Iranic has no close relationship with Balto-Slavic, weakening the case for it having spread via the steppe.

    This sounds to me as if they’ve loosened the constraints so much that they end up assuming a conclusion of no constraints. But I’m no Bayesian.

  283. David Marjanović says

    There are Kümmel and Kim among the authors, and the abstracts look fine, except for the dismaying information that the dataset is entirely lexical. Where do people get the idea that phylogenetics can’t be done on morphological or phonological characters?

    The replacement of numerous inherited words in Indo-Iranian has caused long-branch attraction between Indo-Iranian and the root before. I wonder if that’s why this paper doesn’t find Indo-Slavic.

    Reading it now…

  284. David Marjanović says

    I’m not; lack of access. I’ve downloaded the supp. inf., though.

    I wonder if making the ancestry constraints optional ruined everything.

  285. @DM, is your address at gmx dot at good for sending a copy?

  286. David Eddyshaw says

    except for the dismaying information that the dataset is entirely lexical

    Not so much dismaying as altogether vitiating.

    In Oti-Volta there are at least two languages which are bizarrely divergent lexically (Nawdm and Hanga) which have quite certain close relatives (respectively, Yom and Dagbani) whose lexicon is perfectly proper and obediently aligns with the real criteria for subclassification, viz shared non-trivial innovations.

    Do they make Armenian an Iranian language, or does their technique manage to avoid such chestnuts? (can’t be bothered to look at flat-earther nonsense like this myself … I’ve got better ways to occupy my time.)

    The only interesting aspect of this tosh is how it gets published. The sociology of science.

  287. The ~100 page supplementary information explains their methods in good detail, and is bound to be less glib than the article. At a glance, the crux of the difference between their date and that of Chang et al. is that they changed the particular tree topology imposed by the latter.

  288. It seems like they took all the P.I.E. origin theories, including Euphratic, and wondered why couldn’t they all be true.

  289. “Do they make Armenian an Iranian language, or does their technique manage to avoid such chestnuts?”

    Yes, they avoid that. The lexical data set is, as far as I can tell, very good and well-vetted, and based on assessing things as a linguist would. The problem, of course, is that it’s still a lexical data set.

    Or one of the problems. There are plenty of others. Here are some I shared on a FB group a couple days ago:

    -It’s a little weird to see the raw Anatolian hypothesis taken so seriously, when I don’t think even Renfrew believes in it now, and few linguists ever gave it any credence. I guess this is for rhetorical punch (maybe necessary in a glitzy journal like Science), but it’s still strange to see.

    -Phylogenetic splits are still(!) assumed to correlate with ‘migrations’, and the possibility of deeper dialectal divergences within a ‘homeland’ area is not taken seriously.

    -Not addressed (in the main paper; haven’t read the supplementary materials yet) is the possibility (probability?) that rates of change are not just variable, but systematically biased during the earlier histories of most branches due to initial language contacts during the primary spread period(s). These may have been more extensive, and led to more rapid vocabulary turnover, than models whose chronological parameters are based in later periods will anticipate. I don’t see how any phylogenetic (fancy glottochronological) dating method can really deal with this, and it remains an unsolved (unsolvable?) problem in all computational chronological estimates.

    -This study is better than some in regularly giving date ranges, but too often median dates are presented too strongly, sometimes with consequences. For instance, on p. 8 it says:

    We also find that Indic and Iranic had diverged from each other already by ~5520 yr B.P. (4540 to 6800 yr B.P.). To reconcile this with a steppe origin would require an alternative scenario in which Indic and Iranic split from each other approximately two millennia before entering South Asia and Western Asia.

    This is clearly taking the median date as ‘the date’, and ignoring the full millennium later that should be given as just as reasonable a possibility at the younger end (younger and older edges are what matter when talking about interpretative options, not the middle dates!). The gap is then exaggerated a bit on archaeological grounds by focusing on presence of steppe ancestry in the Indus region as the benchmark for ‘entering South Asia and Western Asia’, which is my first problem repeating itself. If Proto-Indo-Iranic is associated with, say, Sintashta (a rather popular idea that should be taken seriously), then there is a gap of only a few centuries between that and the lower edge of the divergence between Indic and Iranic produced by this model. Maybe still a point to address, but it lacks the rhetorical punch gained by latching on to the median date and trying to anchor linguistic divergence only in dramatic migrations.

    -The wheel-vocabulary is, how should I put it, not treated with nuance. The idea that parallel semantic change is sufficient is not convincing. The discrepancy between Anatolian and the other branches is potentially very significant, but not discussed. The exact significance of it for dating is, as usual, not treated very well (again, if there are early dialect divergences *within* the [secondary?] ‘homeland’, this could be consistent with the dates of their linguistic model, but are no help in getting an earlier dispersal of speech communities — the kind of dispersal archaeologists are interested in).

    (As I said, I haven’t gone through the supplements yet, so this is just on the published article.)

    (Posting again, since my original version seems to have been eaten.)

  290. David Marjanović says

    @DM, is your address at gmx dot at good for sending a copy?

    Yes 🙂

    -It’s a little weird to see the raw Anatolian hypothesis taken so seriously, when I don’t think even Renfrew believes in it now, and few linguists ever gave it any credence. I guess this is for rhetorical punch (maybe necessary in a glitzy journal like Science), but it’s still strange to see.

    I think it’s the other way around: they got a date that was OVER 9000, figured “huh, that fits Renfrew’s old idea and the first Nature paper”, and were merely honest in mentioning it.

    Not addressed (in the main paper; haven’t read the supplementary materials yet) is the possibility (probability?) that rates of change are not just variable, but systematically biased during the earlier histories of most branches due to initial language contacts during the primary spread period(s).

    That’s a very good point. I think some programs for molecular dating can take this into account, but I’d be quite surprised if that was done here. (Haven’t read the supp. inf. yet either.)

    younger and older edges are what matter when talking about interpretative options, not the middle dates!

    This is a Bayesian analysis, though. The “edges” are edges of the 95% confidence interval (or perhaps some other confidence interval) of a distribution, by default a bell curve; the chance that the “true” date lies close to the median is much higher than that it’s close to an edge. In other words, the reasonable scenario comes out as possible-but-noticeably-unlikely.

  291. Yes, I meant the edges of the 95% confidence range, which is already about as precise as it’s safe to get with this sort of thing. Of course if you limit things to within that range, you can discriminate meaningfully, but the whole point of using a confidence interval is to draw attention to the reasonable range of precision in question. An edge date is already very likely. It’s the vastly longer tails beyond that that we can be a bit more skeptical about (granting everything else, which we of course shouldn’t), though even then it’s possible to be too dogmatic.
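
    To put a rough number on the median-versus-edge question: a small sketch, assuming purely for illustration that the posterior is a normal curve (the real one need not be, and nothing here comes from the paper's output). It computes how dense the curve is at the edge of a central interval compared with its center.

```python
from statistics import NormalDist


def edge_to_center_density_ratio(coverage: float = 0.95) -> float:
    """Density at the edge of a central interval divided by density at the center.

    Assumes a normal curve; the result does not depend on its mean or spread.
    """
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must be strictly between 0 and 1")
    standard = NormalDist()  # mean 0, standard deviation 1
    z_edge = standard.inv_cdf(0.5 + coverage / 2)  # about 1.96 for 95%
    return standard.pdf(z_edge) / standard.pdf(0.0)


if __name__ == "__main__":
    print(round(edge_to_center_density_ratio(0.95), 3))  # 0.147
```

    Under that assumption a date right at the 95% edge is about one-seventh as dense as one at the median: clearly less favored, but not negligible either, which is roughly where the two readings above part ways.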

  292. John Cowan says

    Yes, being dogmatic in one’s skepticism is always a thing to avoid.

  293. Dmitry Pruss says

    In the supplement they explain that the rate of change in their model varies with time and branch, unlike in conventional glottochronology. But I believe that there isn’t enough linguistic data to narrow down the range of credible rates of change in many branches and periods, and therefore, if done properly, a model of this sort would give wider and wider confidence intervals owing to increasing rate uncertainty, instead of ruling out earlier timing hypotheses. The fact that it’s claimed to rule them out, therefore, must be attributed to inclusion of additional genomic data, which is, on occasions, questionable too.

  294. jack morava says

    Carleton Hodge, Anthropological Linguistics 23 (1981) 227 – 244, Indo-Europeans in the Near East

    suggests the Nile Valley \circa 20,000 BP as the UrAryan homeland. I think this might fit with

    https://en.wikipedia.org/wiki/G%C3%B6bekli_Tepe

    as a waypoint. Not to mention the Black Sea Flood and the Younger Dryas …

    [Major Motion Picture Rights Reserved]

  295. p. 57 of the supplement gives a consensus tree, which I find easier to read. Even before dealing with the PIE split, there are some other eyebrow-raising dates: Germanic and Celtic splitting at about 3,000 BC, and that branch splitting from Italic at about 3,700 BC. Is there anything in those dates, specifically, that is hard to reconcile with other knowns (e.g. archaeology)?

  296. Also, Ossetic is shown to split from Iranic (-Avestan) at about 1500 BC. The Scythians didn’t get to where Ossetia is now until about 1,000 years later, right?

  297. David Eddyshaw says

    rates of change are not just variable, but systematically biased during the earlier histories of most branches due to initial language contacts during the primary spread period(s)

    I’ve wondered about this with Western Oti-Volta, the current geographical spread of which must surely be connected in some way with the expansion of the Mossi-Dagomba kingdoms. Trouble is that that expansion must have begun less than a millennium ago, and the WOV languages seem surprisingly divergent if that’s the time scale. They look about as divergent as the Romance languages.

    The WOV languages are pretty stripped-down morphologically, at least as far as verb conjugation goes, which would fit a scenario of large-scale uptake of WOV languages by speakers of other languages, I suppose.

    I suppose the other aspect is that the WOV languages may already have been fairly divergent even before the Mamprussi took to their career of world conquest, bringing the languages along.

    I can’t think of any way of making all this anything more than pure speculation, unfortunately. And I’m not sure that there is even anything there that really needs actual explanation, given the basically pseudoscientific nature of all efforts to get absolute chronological dates out of rates of lexical change.

  298. David Marjanović says

    suggests the Nile Valley \circa 20,000 BP as the UrAryan homeland. I think this might fit with

    Oh, that’s the outlier that put the IE homeland in the Nile delta on a map of proposed IE homelands that I once saw!

    Twenty thousand BP is the peak of the Last Glacial Maximum. It does not fit with Göbekli Tepe, which is half that old, nor with the Black Sea flood or the Younger Dryas, which are even younger. It does not fit the fact that nothing IE-like is known from Egypt or surroundings any closer than the northern fringe of Syria, it does not fit the lack of genetic or archeological evidence for a migration from there, it does not fit even the slowest models of language change (like the one we’re talking about right now)…

  299. In short, it’s nuts, as is anything that takes it seriously.

  300. David Marjanović says

    So I’m reading the Guide to the Supplementary Information.

    2.2 LANGUAGE DATA ENCODED IN BINARY FORMAT: NEXUS AND .XML FILES
    So that they can be used by quantitative and phylogenetic analysis software, the language data in the IE-CoR raw data tables need to be converted into a dataset in a format that such software can process. In particular, most such analyses require the data to be encoded as binary data characters, i.e. with the presence/absence of each cognate set in each language taxon expressed in binary form: 0=absent, 1=present.
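    As a rough illustration of what this binary recoding amounts to, here is a minimal Python sketch. It is not the IE-CoR conversion pipeline; the function name, the data layout and the cognate-set labels are assumptions made up for the example, and a language with no entry is simply scored 0 here rather than as missing data.

```python
# Toy input: for each meaning, the cognate set serving as the primary word
# in each language. Labels are illustrative, not IE-CoR identifiers.
primary_word = {
    "ANT": {"English": "ant", "German": "ant", "Dutch": "mier", "Swedish": "mier"},
    "DOG": {"English": "dog", "German": "hound", "Dutch": "hound", "Swedish": "hound"},
}

def binary_characters(data):
    """Turn every (meaning, cognate set) pair into one 0/1 character."""
    languages = sorted({lang for row in data.values() for lang in row})
    matrix = {lang: [] for lang in languages}
    labels = []
    for meaning in sorted(data):
        for cognate_set in sorted(set(data[meaning].values())):
            labels.append(f"{meaning}:{cognate_set}")
            for lang in languages:
                # 1 = this cognate set is the primary word, 0 = it is not
                present = data[meaning].get(lang) == cognate_set
                matrix[lang].append(1 if present else 0)
    return labels, matrix

labels, matrix = binary_characters(primary_word)
print(labels)
for lang, row in matrix.items():
    print(lang, row)
```

    Two meanings become four characters here; with 170 meanings and many cognate sets per meaning the character count grows accordingly.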

    And then I looked up the specific meaning of a Russian swearword I had first seen a few days ago. (Wiktionary has everything, it seems.)

    “Most such analyses”!?!?! Not the tip-dated phylogenetic analyses, constrained or otherwise. I’ve only used the BEAST once, in a course, so I don’t know how many states exactly it can deal with, but… it’s meant first of all for molecular data. It can inevitably deal with four states per character, and I’d be surprised if it couldn’t deal with 23. Probably 32 or 36* or 64 or something, quite possibly more.

    Coding the states of each character as characters is the basic fundamental mistake that Gray & Atkinson made in the paper that founded this entire field. It made their branches too long.

    The constraints used in the 2015 paper should have made this irrelevant for most purposes, and apparently did. But now they’ve relaxed the constraints.

    * 0–9 followed by A–Z

  301. And did you get the main PDF, DM? Sent it over but from a gmail address, sometimes the emails from gmail are just snagged unseen by spam filters

  302. David Marjanović says

    Oh. There is:

    Sensitivity analysis SA10, which used a recoding of cognate sets not in binary but in multistate terms. That is, each cognate set is taken in this analysis not as a binary character either present 1 or absent 0, as in all other analyses, but as one state of the corresponding IE-CoR meaning, each taken as a multistate data character (170 in total). This entails no binary data block in the .xml file, but a completely different data structure, converted into the multistate format in the .xml file specific to that analysis.
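    For contrast, a multistate coding keeps one character per meaning and gives each cognate set its own state symbol. The sketch below is only an assumption about what such a recoding could look like, using a 36-symbol alphabet (digits, then A to Z); it is not the ConceptModels package, and the "Hypothetical" language with two synonyms is invented. Synonymy and missing data are both written as "?" (ambiguity).

```python
import string

# 36 state symbols: 0-9 followed by A-Z
SYMBOLS = string.digits + string.ascii_uppercase

def multistate_column(words_by_language):
    """Code one meaning as a single multistate character.

    words_by_language maps language -> list of cognate-set labels
    (normally one; more than one means synonymy, none means no data).
    """
    cognate_sets = sorted({w for ws in words_by_language.values() for w in ws})
    if len(cognate_sets) > len(SYMBOLS):
        raise ValueError("more cognate sets than available state symbols")
    code = {cs: SYMBOLS[i] for i, cs in enumerate(cognate_sets)}
    column = {}
    for lang, words in words_by_language.items():
        # exactly one cognate set -> its symbol; otherwise ambiguous
        column[lang] = code[words[0]] if len(words) == 1 else "?"
    return code, column

ant = {
    "English": ["ant"],
    "Dutch": ["mier"],
    "Hypothetical": ["ant", "mier"],  # invented synonymy case
    "Undocumented": [],               # invented missing-data case
}
code, column = multistate_column(ant)
print(code)
print(column)
```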

    So why did they do the binary ones at all…

    Sensitivity analysis SA10 used a BEAST2 add-on package newly written by Benedict King to implement a multistate phylogenetic analysis model, the code for which is available at: https://github.com/king-ben/ConceptModels.

    what

    I checked: BEAST can use continuous data, i.e. at least 100 states per character, likely more.

  303. They did a lot of robustness checks. The only thing which made PIE earlier was imposing Chang et al.’s tree prior.

  304. David Marjanović says

    Haven’t checked my e-mails yet, will do that “tomorrow” – it’s half past midnight.

    Here’s the part of the supp. inf. concerning sensitivity analysis 10:

    7.10.2 RESULTS
    Our multistate model produces root age estimates distinctly younger (by 2057 years, or 25.1%) than any produced by the covarion model: 6153 BP (4926–7884 BP).

    Well, well, well.

    In tree topology, the multistate results show a similar lack of resolution at basal nodes of the phylogeny, although there remains strong support for a European clade of Germanic, Celtic and Italic, and for a nesting of Nuristani within Indic. In the following sections we seek to determine why the multistate produces these different results, and then to assess the relative performance of the multistate and covarion models. Although this analysis does reveal sensitivity of the results to model choice, there are three principal arguments (detailed below) for preferring the covarion model to the multistate model.

    […]

    One reason for these younger node ages across the phylogeny is that the multistate model reconstructs less innovation on terminal branches, and in particular the branches to non-modern languages, which are central to time calibration. Using phylograms — i.e. trees with branch lengths expressed as the amount of lexical change rather than time — as output by BEAST2, we compared the lengths of all terminal branches.

    Terminal branches to non-modern languages are in general significantly shorter (on average, by c. 70%) than terminal branches to modern languages. This is an entirely expected result from the IE-CoR analyses, given the structure and diversity of the major branches within Indo-European, and the patterns of which ancient languages survived with rich enough documentation.

    We found a noticeable difference between the covarion and multistate analyses. In the main covarion analysis, terminal branch lengths to non-modern languages are shorter than those to modern languages by a ratio of c. 0.29:1. In the multistate analysis, however, they are shorter still: c. 0.235:1 (Fig. 7.10.4). And since non-modern languages are the basic source of time calibration, this explains the younger node age estimates in the multistate analysis.

    Good news, isn’t it? No:

    To investigate this discrepancy between the models and their results, we evaluate which performs best on a series of qualitative criteria in the following sections §7.10.4 to §7.10.7. We first assess their results with respect to known language histories (§7.10.4) and tree topologies (§7.10.5), and then identify two particular forms of misspecification in the multistate model (§7.10.6 and §7.10.7) which both lead it to underestimate time-depths. In particular, we assess this by using ancestral state reconstruction in BEAST2 to pinpoint which particular linguistic innovations were inferred to occur on terminal branches to non-modern languages in both the covarion and multistate analyses.

    7.10.4 UNREALISTIC AGE ESTIMATES FOR KNOWN HISTORIES
    The very last branches towards the tips of the output trees represent the splits between the present-day language varieties most closely related to each other. Importantly, these splits in the phylogeny by no means represent any breakdown in mutual intelligibility; rather, they must by definition have arisen already by the time of even just the first lexical difference between two language varieties, in any single meaning within the IE-CoR 170 meaning set (see §8.3 below). Given this, the covarion MCC tree itself shows a tendency for the time-depths of the splits before the tips to be rather too shallow, with respect to the known histories of those varieties over recent centuries. That is, many varieties were already significantly distinct from each other, as visible in written documents in those varieties, already a few centuries before the split date returned in the MCC. This tendency is markedly exacerbated in the MCC tree from the multistate analysis. Branch-lengths here are extremely compressed towards the tips of the tree, leading to distinctly unrealistic underestimates of the time-depths of the splits.

    Luxembourgish and standard (High) German, for instance, are returned in the multistate MCC as an undifferentiated single variety until as late as 1650, Czech and Slovak until 1770, European and Brazilian Portuguese until 1926. In Italy, Standard Italian and Milanese are returned as not divergent until 1775, Friulian and Neapolitan not until 1625. In fact, what became standard Italian was first effectively codified c. 1300, in large part by Dante on the basis of the speech of his native Florence, and in deliberate contradistinction to other regional descendants of Latin elsewhere in Italy, very diverse already in Dante’s day. The multistate MCC also returns Dutch and Frisian as a unitary lineage as late as 1722, a millennium or so too late. Overall, outside the fixed calibration points, the multistate model seriously underestimates time-depths widely across the tree, where they can be set against the evaluation criterion of historical documentation over the last millennium or two. This undermines confidence in the overall chronology of the tree, and implies some serious model misspecification within the multistate model as currently implemented.

    Fair enough, but that doesn’t make the other analyses any better!

    Also, if you have Standard German in a phylogenetic dataset, facepalm yourself. Hard.

    The next section explains that the topology of the tree from the multistate dataset is also clearly wrong. For example, “Tocharian is relatively deeply nested within the tree, forming an unexpected common branch with Albanian (albeit also with support as low as 0.35), and further back also with Greco-Armenian (0.32).” Also: “That Greek and Armenian may form a common deep clade is widely entertained within Indo-European linguistics, even if still hotly contested. What is clearly not plausible, however, is the implication from the chronology of the multistate MCC tree, that Greek and Armenian are more closely related to each other, with a more recent common ancestor, than Indic and Iranic, for instance.” Worse, Tocharo-Albanian comes out as younger than Indo-Iranic. And these are just the most blatant examples.

    7.10.6 and 7.10.7 explain the too young ages: they’re not a property of the dataset, they’re a property of the weird multistate add-on which can’t handle polymorphism and therefore consistently estimates too high rates of evolution. “Flawed Program Produces Wrong Tree, So We Used Flawed Dataset Instead of Flawed Program” may not be too long for the title of a Science paper, but it’s a strange course of action.

  305. Dmitry Pruss says

    Cool. So using just presence / absence of a cognate as a binary (the main method in the paper) creates anachronisms, placing 2-3 thousand years old population events at a lot earlier dates not compatible with the archaeological data.
    But using a large number of states of the cognates instead of mere presence /absence results in considerably younger dates, especially for the splits which are merely centuries old. It’s probably tied to the situation where minor variations in the cognate forms are very well known for the recently diverged languages, resulting in the inflated estimate of the rate of change. My question is, why it isn’t equally inflated in the recent vs. ancient languages? Inflated compared to the past, apparently, because in the historical data, variability / polymorphism of the cognate forms is less known? Or the real reason is that the variability on the recent timescale is sort of like ripples on the water, now here, now back there, and it gets smoothed out in the longer term evolution of languages?
    The flip side of the question is, why does the binary presence-of-a-cognate lead to a deflated estimate in the recent languages as compared with the ancient ones. Is it because retention of cognates is overreported in the extant languages?

  306. jack morava says

    @ DM above, I agree of course.

  307. Well, well, well.

    Aha. (Russian “aha”, that is, “yep”)

  308. “we evaluate which performs best on a series of qualitative criteria in the following sections §7.10.4 to §7.10.7.”

    They are reinventing glottochronology.
    Does their model contribute anything to glottochronology?

    I understand that they deduce the phylogeny, but the claim posted here was about dating. And their dating method (which takes the phylogeny and accumulated innovations as the input) seems to be glottochronology.

  309. “In the supplement they explain that the rate of change in their model varies with time and branch, unlike in conventional glottochronology.” – But: https://en.wikipedia.org/wiki/Glottochronology#Modifications

    Of course by “conventional” you may mean “not this”. But they say: “Note that this approach has nothing to do with the early and now discredited technique of ‘glottochronology’ in linguistics (75).”

    Ugly.

  310. David Marjanović says

    They are reinventing glottochronology.
    Does their model contribute anything to glottochronology?

    I understand that they deduce the phylogeny, but the claim posted here was about dating. And their dating method (which takes the phylogeny and accumulated innovations as the input) seems to be glottochronology.

    Glottochronology was invented in the 1950s. It didn’t catch on except to some extent in the Moscow School and never really took advantage of computers, which is one reason why it didn’t catch on: it made some glaring oversimplifying assumptions without which the computations wouldn’t have been feasible.

    Molecular dating was invented in 1962. It took advantage of computers as soon as the computers were able to handle it. One simplifying assumption after another was abandoned as computer speed increased in the 1990s and later. By now, the phylogeny and the divergence dates are often even calculated in a single step, not two; the new paper is an example of this – it conducted tip-dated phylogenetic analyses. That’s the successor of “stratocladistics”, the idea of taking the ages of fossils into account in phylogenetic analyses so the calculated trees fit the ages in the fossil record as well as possible.

    The two remained in blissful ignorance of each other until 2003 when Gray & Atkinson got into Nature. The poor linguists had no frame of reference other than the half-forgotten clumsy algorithms of “grottoclonology”, so they barely even looked into why Gray & Atkinson got such strange results and figured they could just ignore them. And now they wince every time another such paper gets published.

    So let me play Grand Interdisciplinary Phylogeneticist for a while:

    But using a large number of states of the cognates instead of mere presence /absence results in considerably younger dates, especially for the splits which are merely centuries old. It’s probably tied to the situation where minor variations in the cognate forms are very well known for the recently diverged languages, resulting in the inflated estimate of the rate of change. My question is, why it isn’t equally inflated in the recent vs. ancient languages? Inflated compared to the past, apparently, because in the historical data, variability / polymorphism of the cognate forms is less known? Or the real reason is that the variability on the recent timescale is sort of like ripples on the water, now here, now back there, and it gets smoothed out in the longer term evolution of languages?

    Nothing like that. I’ll quote the whole sections this time:

    7.10.6 UNDERESTIMATING TIME-DEPTHS IN CASES OF POLYMORPHISM AT AN ANCESTRAL STAGE

    As noted in §7.9, although IE-CoR followed strict protocols to keep the proportion of synonyms encoded in the actual data languages to within strict limits, in its output trees the covarion model infers a somewhat higher frequency of multiple cognate sets per meaning at the Proto-Indo-European root — although SA9 shows that the effect on root age estimates is minimal. The multistate model, meanwhile, does not allow any cases of multiple states at all (indeed for this analysis, all the limited cases of synonymy in the IE-CoR data languages had to be converted to ambiguity). So in those cases where more than one cognate set (in the same meaning) would indeed normally be inferred at some ancestral stage, the multistate model “overcorrects”.

    As an illustration, consider the meaning ANT (https://iecor.clld.org/parameters/ant) across the West Germanic languages. English ant, Frisian eamelder and German Ameise, for instance, are all cognates derived from a reconstructed Proto-West-Germanic *amaitjō- (IE-CoR cognate set 5281 [link], the ‘ant’ set). English, Frisian and German do not form a common sub-branch within West Germanic, however, so this pattern would generally lead to this cognate set being inferred as present in the most recent common ancestor (MRCA) of the West Germanic clade.

    In Dutch and Flemish, meanwhile, the word for ANT is mier, derived instead from Proto-Indo-European *moru̯i- (5007 [link]). Although this is not the cognate set used for ANT in other branches of West Germanic, it is widely found across North Germanic, and indeed in most other deep branches of Indo-European (Latin formica, Ancient Greek mýrmēx, and so on). This data pattern would also generally lead to this cognate set too being inferred as present in the MRCA of the Germanic clade, and inherited into West Germanic.

    The multistate model only allows a single state to be present at any one time, however, so to explain this situation it requires one or other of the following scenarios:
    • The ‘mier’ cognate set is present at the West Germanic root and continues into Dutch and Flemish, but there are multiple mutations from the ‘mier’ to the ‘ant’ cognate set, independently in the other branches within West Germanic (since German, English and Frisian do not form a common sub-branch).
    • Alternatively, ‘mier’ is present at the Germanic root, then a first mutation from the ‘mier’ to the ‘ant’ cognate set arises into Proto-West-Germanic, followed by a further mutation from ‘ant’ (back) to ‘mier’ into the Dutch/Flemish branch.

    (Note for comparison that if all five languages had only the ‘ant’ cognate set, then only a single mutation would have been required in the same time frame.)
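    The counting in the two scenarios above can be reproduced with a small parsimony sketch. To be clear about what is swapped in: this is Fitch parsimony, not the Bayesian model used in BEAST2, and the tree shape (English with Frisian, German with Dutch/Flemish, North Germanic as the outside branch) is an assumption chosen only so that English, Frisian and German do not form a common sub-branch.

```python
def fitch(tree, states):
    """Fitch parsimony on a nested-tuple binary tree.

    Returns (candidate states at this node, minimum number of changes below it)
    when only one state may be present at any node.
    """
    if isinstance(tree, str):  # leaf: the observed state, no changes
        return {states[tree]}, 0
    left, right = tree
    left_states, left_changes = fitch(left, states)
    right_states, right_changes = fitch(right, states)
    common = left_states & right_states
    if common:
        return common, left_changes + right_changes
    # no shared state: one extra change is needed at this node
    return left_states | right_states, left_changes + right_changes + 1

# Assumed topology, for illustration only
tree = ("North Germanic", (("English", "Frisian"), (("Dutch", "Flemish"), "German")))

observed = {
    "North Germanic": "mier", "English": "ant", "Frisian": "ant",
    "Dutch": "mier", "Flemish": "mier", "German": "ant",
}
all_ant = dict(observed, Dutch="ant", Flemish="ant")

_, changes_observed = fitch(tree, observed)
_, changes_all_ant = fitch(tree, all_ant)
print(changes_observed, changes_all_ant)
```

    On this assumed tree the observed pattern needs two changes, while the all-‘ant’ comparison case needs one, which is the extra mutation the quoted section is talking about.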

    So where (even transitional) stages of synonymy did exist, the limitation in the multistate model that only a single state can be present at any one time leads this model to infer an increased number of mutations, and thus a faster clock rate. This is also because — unlike in the covarion model that allows fast and slow mutation in the same site — in the multistate model there is no other way to absorb the effect on the clock rate of these extra mutations where stages of synonymy existed. Increasing the clock rate contributes to the multistate model reducing root and internal node age estimates, which in turn explains why this model returns a younger root age, and underestimates the time-depths of splits in known historical cases.

    7.10.7 MISSPECIFICATION IN HANDLING SINGLETON INNOVATIONS

    To further investigate why the multistate model returns shorter terminal branches, particularly to non-modern languages, we identified all cognate sets returned with >0.5 probability of being innovations along the terminal branch to each non-modern language, under each model.

    The covarion and multistate models differ slightly in how many innovations they return along terminal branches to non-modern languages overall: 517 in the covarion analysis, 481 in the multistate. This is part of the explanation for the shorter branch lengths and thus younger dates in the multistate analysis, as explained in 7.10.3 above. A more important driver of this effect, however, is that the models differ very significantly also in which cognate sets they return as innovations — and in ways that reveal a model misspecification in the multistate analysis.

    The most obvious candidate for an innovation along a terminal branch is any cognate set that is a ‘singleton’, i.e. present in only one language taxon, e.g. English kill or dog, which are not cognate with the primary words for those meanings in any other language taxon in IE-CoR. Importantly, ‘present’ here means not whether a cognate set is just present in the vocabulary at all, but strictly as the primary word in the precise target meaning as tightly defined in IE-CoR. The hound cognate set is thus not ‘present’ in English as the data for the IE-CoR meaning DOG, because it is not the primary word for that meaning, which is dog instead.

    The covarion analysis performs as expected: the vast majority of cognate sets that it returns as innovations on terminal branches are indeed singletons in that one language. The few other cases are also expected: innovations in parallel with other taxa that are not its closest sister languages, indeed often very distant ones. The covarion analysis correctly returns as innovations, for example: 6 singleton cognate sets in Old Irish, 5 in Gaulish, 8 in Early Vedic, and 6 in Avestan (all in the strict sense of a singleton outlined above).

    In the multistate analysis, however, many singletons are not returned as innovations on that terminal branch at all. In the four languages just listed above, the multistate analysis returns as innovations only 1, 1, 1 and 2 respectively. This entails that the multistate model returns all 20 other singletons in these languages as also present in the most closely related parallel branch, but then lost along that branch before it began to diverge. The effect is to artifactually shorten the branch to the ancient language, and correspondingly delay the divergence of its closest sister languages. The net results on the chronology estimation are those already noted: taxa diverge far later than historical evidence shows, and time-depths are underestimated overall.

    (I still don’t know why they call the binary-characters-only analysis a “covarion” analysis. That’s a technical term that exists, but I don’t think that’s what it means.)

    For a purely phylogenetic analysis, it shouldn’t matter whether ancestors can be reconstructed as being polymorphic (e.g. Proto-West-Germanic as using the ancestors of both mier and ant, both inherited straight from PIE, as synonyms) or only as partially uncertain. But when the topology of the tree and the rates of evolution along each branch are calculated together and recursively influence each other, as is the case here, then suddenly it matters. Being unable to reconstruct polymorphism for ancestors, the multistate add-on used in this paper is forced to calculate a larger number of changes along many branches. Because some of the ancestors have their ages constrained (within a range), these larger numbers need to be squeezed into about the same amount of time. That means higher rates of evolution are calculated. And that means the ancestors whose ages are not constrained will be younger, because the amount of change between them and their attested descendants is not much higher than in the binary-only analysis but is expected to have happened faster.
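    The squeeze can be shown with toy arithmetic. This sketch assumes a single strict clock, which the real analysis does not use, and every number in it is invented; the function name is hypothetical. It only shows the direction of the effect: more inferred change on a branch of fixed (calibrated) age means a faster rate, and a faster rate means a younger age for nodes that have no calibration.

```python
def implied_age(changes_to_node, calibrated_changes, calibrated_age):
    """Age of an uncalibrated node under one strict clock.

    rate = calibrated_changes / calibrated_age, and age = changes_to_node / rate,
    written as a single expression.
    """
    return changes_to_node * calibrated_age / calibrated_changes

# Invented numbers: a calibrated branch 2000 years long, an uncalibrated root.
baseline = implied_age(changes_to_node=60, calibrated_changes=20, calibrated_age=2000)

# Same calibration age, but extra changes inferred where polymorphism cannot
# be reconstructed; the root gains far fewer extra changes in proportion.
inflated = implied_age(changes_to_node=62, calibrated_changes=25, calibrated_age=2000)

print(baseline, inflated)
```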

    The flip side of the question is, why does the binary presence-of-a-cognate lead to a deflated estimate in the recent languages as compared with the ancient ones. Is it because retention of cognates is overreported in the extant languages?

    No, it’s because it makes all branches longer. Instead of counting a change from ‘mier’ to ‘ant’ as the primary word for ANT as one step (the character ANT changes states from ‘mier’ to ‘ant’), it counts it as two: loss of ‘mier’ (the character “‘mier’ for ANT” changes states from “present” to “absent”) and gain of ‘ant’ (the character “‘ant’ for ANT” changes states from “absent” to “present”).
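    The double counting is easy to see in a toy comparison. The sketch below is an illustration only (function names and the two-meaning word lists are made up); it counts state changes between an ancestor and a descendant under each coding.

```python
def multistate_changes(ancestor, descendant):
    """One character per meaning: a replacement is a single state change."""
    return sum(1 for meaning in ancestor if ancestor[meaning] != descendant[meaning])

def binary_changes(ancestor, descendant):
    """One present/absent character per (meaning, cognate set) pair."""
    changes = 0
    for meaning in ancestor:
        for cognate_set in {ancestor[meaning], descendant[meaning]}:
            present_before = ancestor[meaning] == cognate_set
            present_after = descendant[meaning] == cognate_set
            if present_before != present_after:
                changes += 1  # a loss or a gain, each counted separately
    return changes

ancestor = {"ANT": "mier", "DOG": "hound"}
descendant = {"ANT": "ant", "DOG": "hound"}  # only ANT was replaced

print(multistate_changes(ancestor, descendant))  # one replacement
print(binary_changes(ancestor, descendant))      # loss of 'mier' plus gain of 'ant'
```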

    This is what made the branches in the original Gray & Atkinson paper too long. Chang et al. (2015) got this effect under control by constraining the ages of certain ancestors and thus making all branches shorter: they masked the problem instead of addressing it. Now the constraints are relaxed, the mask has slipped, and the original problem shines through.

  311. I think I actually understood a fair amount of that — thanks!

  312. ktschwarz says

    The ‘mier’ set left a relic in English in pismire. OED thinks this mire was inherited directly from Germanic, while the obsolete variant maur or mour came in from Scandinavian.

    AHD in the first (1969) edition linked pismire, as well as myrmeco- and formic, to PIE *morwi- in the Indo-European Roots Appendix. But that root was deleted in the 1992 and subsequent editions. Anyone know why?

  313. David Marjanović says

    Many thanks to Dmitry Pruss for the paper.

    Among the authors, I overlooked at least Matthew Scarborough, Roland Pooth and Tijmen Pronk among the professional historical linguists, and, importantly, Benedict King, who is a paleontologist and has published on missing data as an unexpected source of error in Bayesian phylogenetic analyses, though now he’s at the Department of Linguistic and Cultural Evolution of the Max Planck Institute for Evolutionary Anthropology in Leipzig, so he should be the Grand Interdisciplinary Phylogeneticist…

    Potentially important quote:

    Ancestry constraints used in previous analyses produced lineage split dates far too recent to be compatible with known histories: no divergence among West Norse languages until 1650 CE, none in Romance until 1000 CE, and none in Indic until 100 CE (12). These artifacts disappear from the ancestry-enabled analysis in Fig. 2. Icelandic and Faroese, for example, are now dated as splitting from the mainland Scandinavian lineages ~830 CE (470 to 950 CE), closely in line with the first Norse settlement of the Faroes and Iceland in the ninth century. Initial divergence within Romance is accurately dated to the Roman Empire in the first centuries CE. Divergence within Indic is dated to ~4370 yr B.P. (3640 to 5250 yr B.P.), in line with Vedic Sanskrit already being slightly divergent from the lineage(s) ancestral to modern spoken Indic languages (30). The inference of an Indo-Iranic split at ~5520 yr B.P. (4540 to 6800 yr B.P.) may, at first glance, seem surprising. Established expectations are for a more recent date, based on the perceived level of similarity between Vedic Sanskrit and Avestan—the earliest known ancient languages in the Indic and Iranic branches, respectively. However, these judgments of linguistic similarity have been largely impressionistic (36) rather than quantified. In the precisely defined IE-CoR meanings, Early Vedic and Younger Avestan share only 58.7% cognacy (37). This matches the level of cognacy that survives between the most divergent sublineages within the Romance clade, for instance, after roughly two millennia since the spread of the Roman Empire. Early Vedic and Younger Avestan themselves date back to at least the mid-fourth and mid-third millennia before present, respectively. A time depth two millennia earlier (~5520 yr B.P.) for the split between their lineages (Indic versus Iranic) is thus consistent with the 58.7% cognacy overlap between them.

    This last one is precisely where lexical replacement by early contact comes in: Indic has more local loans than Iranic, and another layer is shared by both. If any of the affected cognate sets were scored as absent instead of unknown, the Indic branch and the Indo-Iranian branch are too long. And if the program didn’t cope with missing data correctly, the same could have happened. (The mentioned Ben King has published on the surprising impact of missing data on undated Bayesian phylogenetic analysis: it downweights characters with missing data and thereby sometimes gets the topology wrong. I don’t know what the effect on branch lengths is.)

    The support for Greco-Armenian (posterior probability 0.86) is interesting in the light of this paper by one of the authors – though note that it only claims the evidence for Greco-Armenian is weak, not that the evidence for any alternative is stronger than that.

    It’s interesting that the Balto-Slavic + West IE clade, excluding Indo-Iranian and Albanian, has a posterior probability of only 0.63. For molecular data that would be considered borderline negligible.

    The considerable number of Pre-Indo-Iranian loans in Proto-Uralic is nowhere mentioned in the main paper; I haven’t read enough of the supplementary material yet.

  314. ktschwarz says

    As the flip side of pismire, the ‘ant’ set has some survivors in regional Dutch, according to the OED’s (2012) etymology of ant: “Cognate with Middle Dutch amete, eemt (Dutch regional emt, empt, emte, empe)”, then the German cognates.

  315. But that root was deleted in the 1992 and subsequent editions. Anyone know why?

    Xerîb will know if anyone does.

  316. John Cowan says

    It’s interesting that the Balto-Slavic + West IE clade, excluding Indo-Iranian and Albanian, has a posterior probability of only 0.63.

    It would be interesting to know if it gets better if Germanic is taken out.

  317. David Marjanović says

    Germanic is in what I’m calling the West IE clade, like Celtic and Italic, so the support for an exclusive Balto-Slavic + West IE clade is not likely to go up if Germanic is removed from the dataset.

    I think this clade is more likely long-branch attraction: Indo-Iranian and, even more so, Albanian have replaced so much vocabulary they end up attracted to the root of the tree. I’m reminded of all those early (i.e. late 1990s) molecular phylogenies where mice and rats were pulled out of Rodentia and ended up outside a clade that contained all placentals except them. Actual Nature headline: “The guinea pig is not a rodent” – mice and guinea pigs were the only sampled rodents in that paper. No, mice just evolve faster, and the clock in that paper was too strict, that’s all.

    Anyway, the abovementioned Euphratic didn’t make it into the main text (haven’t checked the supplement yet), but it’s in reference 55, which is here in open access. (In Russian, with a long English summary at the very end.) BTW, the entire journal seems to be in open access.

  318. David Marjanović says

    Euphratic isn’t in the supplement either. The loans in Proto-Uralic are, but in the vaguest way possible (2.2.2): “Proto-Uralic is variously seen as borrowing from — and thus contemporaneous with — Proto-Indo-European, Proto-Indo-Iranic or Proto-Iranic. Majority view now inclines towards the later of those stages, but this does not provide exclusive support for the Steppe hypothesis. Alternative hypotheses, including the hybrid hypothesis set out in Fig. 1D in the main paper, also entail early Indo-Iranic or early Iranic having already spread northwards into Central Asia at time-depths compatible with the range of possibilities for Proto-Uralic.” Should have cited Sampsa Holopainen’s thesis.

    Anyway, I see the whole paper as confirming the importance of treating Proto-Indo-Anatolian, Proto-Indo-Tocharian and Proto-Indo-Actually-European as three different things.

  319. ktschwarz says

    Piotr Gąsiorowski discussed the development of ant and mire here a few years ago, including how ant lost a syllable that emmet kept.

  320. David Marjanović says

    Section 1.3 explains, at some length and exasperation, why Bayesian tip-dating phylogenetics “is not lexicostatistics, and it is not glottochronology” and “has nothing to do with those outdated and widely discredited techniques”! An important difference I’ve never mentioned is that it’s a phylogenetic method, not a phenetic one: it does not use overall similarity to build its trees. Every historical linguist should read that section.

    1.1 and 1.4 explain, at greater length and quite convincingly, why the new dataset is much better than the previous ones (up to and including that of Chang et al. 2015). This is expanded upon in 3.4–3.6, which explain how the cognate sets work.

    Section 2 is about the aDNA. The first paragraph of 2.1.4 is worth quoting (bold and italics in the original):

    Recently published aDNA findings in Europe significantly update and refine the first major results (16). Notably, populations of the Yamnaya culture on the Pontic-Caspian Steppe no longer appear as the whole or even main source of the new ancestry component that spread into Central Europe from c. 5000 BP, from further to the east. The ancestry predominant in aDNA samples from early Corded Ware contexts has recently been reported (39) to have originated not on the Pontic-Caspian grassland steppe, but further to the north in the ‘forest steppe’ of the Middle Dnepr region, and towards the Baltic (85). The ancestry predominant in Yamnaya samples entered Central Europe via the Danubian corridor in a separate expansion further to the south. This conclusion has been challenged in (23), arguing that the Yamnaya culture on the Pontic-Caspian cannot be excluded as a primary source for Corded Ware samples, within the limits of the statistical resolution of their analysis. That inference, however, is a function of the low resolution of their chosen analysis, and of the fact that both Yamnaya and Corded Ware ultimately derive from a very similar Eneolithic ancestry background that formed c. 7000 BP once CHG ancestry spread north across the Caucasus.

    2.1.5 is called “Ancient DNA and the Indo-Iranic Branch”. It’s long and strikes me as waffling; it tries to doubt everything, including the identification of Andronovo, Sintashta and several other archeological cultures as Indo-Iranian. Interestingly, the hypothesis that the arrival of Indo-Iranian is associated with the destruction of the Bactria-Margiana Archeological Complex, as opposed to its prosperous existence before that, seems to be unknown to the authors. In the end they call for more aDNA.

    3.7 explains how the 170 reference meanings were chosen (they’re not simply copied from a Swadesh or other such list). 3.8 goes into detail about identification and treatment of loans; quotable paragraph:

    A first step to try to limit any potential confounds is that the set of reference meanings for a cognate data-set can be optimized to meanings that are typically as resistant as possible to loanwords. IE-CoR implements this optimization (see §2.7), resulting in the particular selection of 170 reference meanings. Nonetheless, in particular historical contexts of intense language contact that affected a number of IE-CoR languages, loanwords are found in up to a few dozen of these 170 meanings. English itself has loanwords in 25 of the IE-CoR meanings. Languages whose histories exposed them to even more loanwords include Albanian, and a large set of languages in the Indic and Iranic branches that for centuries were in extended contact with Persian as a dominant, high-status language.

    3.8.3 states that when the primary word for a meaning is a unique loan, all cognate sets for that meaning were scored as “absent” in the main analysis instead of “unknown” (“inapplicable”, which is the same thing in all available software). In sensitivity analysis 2, these cases were instead coded as single-member cognate sets of their own. Both approaches are basic errors that phylogeneticists in biology wrote papers about in the early 1990s, i.e. about as early as possible. It is bound to artificially increase the branch lengths of… oh, let’s say… Albanian, Armenian, Indo-Iranian, and maybe Tocharian as well: exactly the branches that are under suspicion of long-branch attraction.
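
    The three ways of coding a unique loan described above can be sketched in a few lines of Python. This is only an illustration: the language labels A, B, C, the cognate-set names, the `LOAN` marker and the function `encode_meaning` are all hypothetical, not taken from IE-CoR or its export code.

```python
LOAN = "LOAN"  # hypothetical marker: the basic word is a unique loan

def encode_meaning(words, cognate_sets, loan_policy="absent"):
    """Encode one meaning as binary presence/absence columns.

    words: {language: cognate set name, or LOAN}
    cognate_sets: ordered list of inherited cognate sets for this meaning
    loan_policy: "absent" (all 0), "unknown" (all ?), or
                 "singleton" (the loan gets a one-member set of its own)
    """
    if loan_policy not in ("absent", "unknown", "singleton"):
        raise ValueError(f"unknown loan_policy: {loan_policy}")
    columns = list(cognate_sets)
    if loan_policy == "singleton":
        # One extra column per language whose basic word is a unique loan.
        columns += [f"loan:{lang}" for lang, word in words.items() if word == LOAN]
    rows = {}
    for lang, word in words.items():
        if word == LOAN and loan_policy == "absent":
            rows[lang] = "0" * len(columns)
        elif word == LOAN and loan_policy == "unknown":
            rows[lang] = "?" * len(columns)
        else:
            own = f"loan:{lang}" if word == LOAN else word
            rows[lang] = "".join("1" if col == own else "0" for col in columns)
    return rows

words = {"A": "set1", "B": "set2", "C": LOAN}
as_absent = encode_meaning(words, ["set1", "set2"], "absent")
as_unknown = encode_meaning(words, ["set1", "set2"], "unknown")
as_singleton = encode_meaning(words, ["set1", "set2"], "singleton")
```

    Under "absent", language C looks as if it had positively lost every inherited cognate set; under "unknown" it contributes no information for this meaning; under "singleton" it gains a character that no other language shares.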

    This, BTW, is yet another case where linguistic data have more in common with morphological than with molecular data in biology.

    From 4.11:

    The Propontis form of Tsakonian Greek ceased to be spoken only in the twentieth century (142). The IE-CoR calibration is therefore set to just 40 BP, with a standard deviation of 10 years.

    That’s what put the a in İstanbul! Note that, as the main paper makes explicit, “BP” is consistently incorrectly used for “B2K”.

    The last 2 of the 3 paragraphs of 5.3 (“Estimating Chronology”) read like an angry reply to a reviewer:

    The chronological estimation is based on these two parameters c and σ, and the set of known date calibrations of all languages in the data set (§4). Note that this approach has nothing to do with the early and now discredited technique of ‘glottochronology’ in linguistics (75). That was founded on an assumed constant rate r of change/retention in cognacy. In the relaxed clock model used here, there is nothing — no parameter setting — that corresponds to any ‘glottochronological constant’, certainly not the lineage ‘birth rate’ and ‘death rate’ parameters mentioned in §5.2 above. Those refer to the birth and death of lineages, i.e. splits and extinctions respectively, in the phylogeny. They do not refer to changes or ‘mutations’ in the state of a data character switching between being absent (0) and present (1) along branches in the tree.

    This distinction is best understood also by clarifying the relationship between branching events in the phylogeny, and change events in language data. Strictly, branching events occur independently of, and are not defined by, actual changes in the language data. Once a split has arisen, however, the conditions are set in which different changes can then arise along the different branches. Similarly in actual language histories, a speaker population may split into two (by some long-distance migration, for example), but that need not automatically create language changes. It does, however, establish newly separated speaker populations whose language lineages may thereafter undergo different changes, and thus progressively diverge from each other.

    5.5 makes clear that the authors are well aware that BEAST can use multistate characters, and that they made use of this: “SA10 started out from the main IE-CoR nexus export file (with each cognate set expressed as a binary presence/absence character), but used ad hoc code to convert this to the equivalent multistate format (with each meaning as a multistate character, and cognate sets as the various states of that character). Details of the process are included in section 7.10.1.”

    Fig. S6.1 (p. 57) is a representation of the main results as a single tree. Section 6.1 explains why the messy representation by DensiTree in the main paper is preferable. Still, the single Maximum Clade Credibility tree shows a few things that otherwise disappear in the fog. For example, the main text and one or two sections in the supplement (8.2 notably) explain, and I agree, why Sanskrit, even Rigvedic, is not Proto-Indic and why Strictly Classical Latin isn’t Proto-Romance or a direct ancestor of it. Fair enough, but the branch between Classical Latin and its last common ancestor with Proto-Romance is 600 years long in the MCC tree, and that between Rigvedic and Proto-Indic is 800 years long. Aren’t these a bit much? Compare the much shorter branches for Mycenaean, Ancient (Attic) and New Testament Greek or Classical Armenian.

    Speaking of Armenian, the split between Eastern and Western Armenian comes out as just 400 years old. Isn’t that too short?

  321. David Marjanović : There are Kümmel and Kim among the authors […]

    That’s some major serendipity.

  322. “Note that this approach has nothing to do …” – they are discrediting Swadesh’s work.

    By “this approach” they mean “relaxed clock”. The first link in google for “relaxed clock”: link. The first two lines in the Introduction are much more respectful to predecessors in biology (“the basis of clock-model-based phylogenetics”).

  323. Glottochronology has not been “widely discredited”, or correctly evaluated at all, for all I know. Objectors to glottochronology quote Hymes’s critical paper, directly or indirectly. Every one of Hymes’s criticisms (as I recall, I haven’t looked at the paper in a while) comes from his not understanding statistics, or what error bars are. Glottochronological dates have two kinds of inherent error: counting statistics (same as the carbon dating which inspired it), and variation in lexical replacement rates. Of course lexical replacement sometimes happens fast and sometimes slow. But the distribution of these rates is probably universal, with similar means for, say, Austronesian and Bantu. That spread should be accounted for by putting error bars on inferred dates, which nobody did (except maybe Gudschinsky?). So what wonder is it if you read that two languages split on June 5th, 829 AD, and find that the date is 100 years off wrt known history, and hence glottochronology is bunk? Whereas, if the two languages are calculated to have split probably some time between 600 and 1000 AD, that is less exciting, but correct.
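
    To make the two error sources concrete, here is a minimal Python sketch. It assumes the classic glottochronological formula t = ln(c) / (2 ln(r)), a retention rate r of 0.86 per millennium, and a spread of ±0.03 on that rate; all of these are assumptions for illustration, not numbers from this thread. The counting error is taken as the binomial standard error on the proportion of shared cognates.

```python
import math

def split_age_millennia(shared, total, retention=0.86):
    """Point estimate t = ln(c) / (2 ln(r)), in millennia."""
    c = shared / total
    return math.log(c) / (2 * math.log(retention))

def split_age_interval(shared, total, retention=0.86,
                       retention_spread=0.03, z=1.0):
    """Rough (youngest, oldest) interval in millennia.

    Counting error: binomial standard error on the shared proportion c.
    Rate variation: retention allowed to range over +/- retention_spread.
    """
    c = shared / total
    se = math.sqrt(c * (1 - c) / total)
    c_lo = max(c - z * se, 1e-9)
    c_hi = min(c + z * se, 1 - 1e-9)
    r_lo = retention - retention_spread
    r_hi = retention + retention_spread
    # Youngest date: many shared cognates and fast replacement (low retention).
    youngest = math.log(c_hi) / (2 * math.log(r_lo))
    # Oldest date: few shared cognates and slow replacement (high retention).
    oldest = math.log(c_lo) / (2 * math.log(r_hi))
    return youngest, oldest

point = split_age_millennia(70, 100)
youngest, oldest = split_age_interval(70, 100)
```

    With 70 shared items out of 100, the point estimate is roughly 1.2 millennia, but under these assumed spreads the interval runs from about 0.8 to about 1.8 millennia, which is the “some time between 600 and 1000 AD” kind of answer rather than a single date.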

  324. DM, does the paper list individual authors’ contributions? From the supplement, I gather that the professional historical linguists were the ones who assembled the word lists, and evaluated cognacy. Did any participate in the theoretical part as well?

  325. Trond Engen says

    @David M.: Thanks for taking us through.

    I think the main problem with these papers is that they’re sold on the wrong premise. They shouldn’t (yet) be understood as attempts to use Bayesian phylogenetics to understand the development of Indo-European but rather to use the fairly well understood history of Indo-European to develop Bayesian phylogenetics for historical linguistics.

  326. I’m tired of papers that intimidate linguists with mathematical terms.

    Linguists can only (1) trust (2) mistrust (3) study math. Possibly linguists should actually study mathematics, but:

    “It took advantage of computers as soon as the computers were able to handle it. One simplifying assumption after another was abandoned as computer speed increased in the 1990s and later” (DM)

    I think I can reproduce “relaxed clock” with pen and paper:-/

    Could someone correct me if that’s wrong?

  327. Strictly Classical Latin isn’t Proto-Romance or a direct ancestor of it. Fair enough, but the branch between Classical Latin and its last common ancestor with Proto-Romance is 600 years long in the MCC tree,

    Thank you @DM for the summary. Are you able to unpack that bit?

    Where and when was the ancestor of Proto-Romance being spoken? In particular, where was it when Classical Latin was the language of Empire?

    And thanks @Y, I appreciate any dating will have error bars.

  328. Consider a simple tree of three languages a, b, c (a and b are closer to each other) and you know all history but dates. You know how many innovations what language has compared to what.

    All “dating” that you can do:

    step 1: determining the age of P[roto]-ab relative to P-abc.
    step 2: either

    – using a known date of one protolanguage to infer the absolute date of the other (IE case)
    or
    – matching some cross-linguistical average to your tree’s average.

    The latter would be Swadesh’s (discredited) case if Swadesh did not say in the (discredited) publication they reference as discredited that we “can’t assume” that English changes equally fast. He said that “very few scholars have any conception whatever of the degrees of relationship or the stretch of time needed” and that “It is the aim of the present paper to provide some approximate answer” – but then he said this “can’t assume” and limited himself to relative dates.

    So, relative ages. The most straightforward thing you can do here is just averaging.

    We don’t know the age of P-ab – let it be T-ab and we don’t know the age of P-abc – let it be T-abc.
    But we know that a innovated n forms and b innovated m forms over T-ab, while P-ab innovated k forms between T-abc and T-ab.

    So we can assume that P-ab was innovating (n+m)/2 forms over every T-ab, and thus determine how large T-abc can be relative to T-ab.

    A schoolgirl can do that.

    I think this method can be extended to complex trees without too many calculations (correct me if I’m wrong). Now is not this “relaxed clock” (just without probability distributions and confidence intervals)?
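
    The averaging in the three-language example can be written out directly. This is a sketch of the pen-and-paper scheme described in this comment, not of the relaxed-clock model in the paper; the function name `relative_age` and the example counts are made up for illustration. The assumption is that P-ab innovated at the mean rate of its two daughters, so (T-abc − T-ab) / T-ab = k / ((n + m) / 2).

```python
def relative_age(n, m, k):
    """Return T_abc / T_ab for the three-language tree ((a, b), c).

    n, m: innovations in a and in b since Proto-ab
    k:    innovations in Proto-ab between Proto-abc and Proto-ab
    Assumes Proto-ab innovated at the mean rate of its two daughters.
    """
    if n + m == 0:
        raise ValueError("need at least one innovation in a or b to set a rate")
    innovations_per_T_ab = (n + m) / 2
    return 1 + k / innovations_per_T_ab

ratio = relative_age(n=10, m=14, k=6)   # hypothetical counts
T_ab_years = 2000                        # hypothetical known date
T_abc_years = ratio * T_ab_years
```

    With these made-up counts the ratio is 1.5, so a Proto-ab dated to 2000 years ago would put Proto-abc at 3000. Note that c’s own innovations play no role here, and that a single averaged rate is applied to every branch.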

  329. Of course if I know dates for several languages (in a more complex tree) I will find that my tree is distorted – its proportions don’t match relative known dates!

    What it means is that there is not a straightforward way to use these known dates for calibration, all at once.

    One way to do that would be resorting to those probability distributions which will require more calculations (still I think doable with paper). But those distributions are actually based on nothing.

  330. John Cowan says

    Strictly, branching events occur independently of, and are not defined by, actual changes in the language data.

    Perhaps I misunderstand the terminology, but this seems staggeringly metaphysical. All L1 anglophones today distinguish between KIT and FLEECE vowels, though the phonetic nature of the distinction varies. Suppose that a few hundred years from now, some anglophones have merged them whereas others have not. I would say that the populations split sometime between now and then, as seems only commonsense. But our authors are seemingly committed to the idea that the split is in effect right now in 2023, and for that matter may have been in effect already in 1873, so that we are a century and a half past the branching point despite the fact that nothing happened. Then again it may not happen until 2073. So it is not possible, however closely we examine speakers today, to figure out whether they have split or not; there is literally no evidence one way or the other.

    Once a split has arisen, however, the conditions are set in which different changes can then arise along the different branches.

    Can then arise. So perhaps it may be the case that our presently split population will never materialize into an actual division between merged and unmerged speakers?

    Similarly in actual language histories, a speaker population may split into two (by some long-distance migration, for example), but that need not automatically create language changes. It does, however, establish newly separated speaker populations whose language lineages may thereafter undergo different changes, and thus progressively diverge from each other.

    That’s empirically right, but privileges mere geographical division over linguistic division as the One True Meaning of division. Raymond Brown (the conlanger and Eteocretanist) once referred to until the cows come home as a characteristically BrE expression, and I pointed out that it also existed in AmE by saying that the Sundering Sea had not sundered our metaphors. In other words, seen through the lens of this expression, there has been no split at all. But our authors seem to want to say that in fact there has been, it just hasn’t shown up yet.

  331. Note that it is not obvious that one method of calibrating (known dates within the family or cross-linguistical average) is inherently better than the other.

    The former seems good when you know several dates within the family.

    Otherwise horrible things may happen. If there is a limit on how innovative a language can be, and a and b happen to be VERY innovative, T-ab is small, P-ab is conservative, c is reasonably innovative – our method will (wrongly) assume that P-ab is also VERY innovative.

    And c, which is much more innovative, will become insanely innovative, well beyond the limit of the possible, while T-abc will be small.

    I’d say that cross-linguistical average is informative when there is wild variation within the family, one date within the family is informative when there is little variation within the family, both are good when there is little variation in family and globally, and many dates within the family are always good. Cross-linguistical studies could possibly tell, what of these is the case, but:

    You simply can NOT use this method of calibration for Cushitic. You only have several known dates for IE and Semitic.

  332. Dmitry Pruss says

    Thank you very much, DM. Really helpful. Please pardon me for trying to simplify again.

    So it looks like everything would have worked great if the selected cognates turned up in all languages. But it’s impossible in practice. But they still tried selecting the most universal set of cognates, with impressive results. The loss of cognacy is relatively small but unevenly distributed, with the known outliers such as Albanian in Europe, Iranian and especially Indic.

    In a model with multiple states, when a replacement happens, it counts not as one change but as many changes as there were characters, as if the word was replaced by systematically mutating each character, rather than in one act of horizontal transfer. For the reasons which I don’t understand, such would-be ultramutation events are counted towards the estimated change rate, instead of being discarded as outliers. And voila, the change rate estimate becomes unrealistically large. As a result, calculated dates become too young in the branches with less borrowing in the selected cognate set. That’s clear.

    With the 1/0 presence or absence framework, replacing cognates creates an opposite bias in date estimates, but it’s less clear why. Counting replacement events as 2 (one loss and one gain) is a way to overestimate the number of changes, and may inflate the ages. But what is different between counting a replacement as several changes instead of one, which, as we just discussed, makes ages younger, and counting it as 2 which makes it older?? And how does coding the possible replacement as missing info achieve the same?

  333. Trond Engen says

    Do they use phonological changes in the lexical set or only replacement rates?

    Y: Glottochronological dates have two kinds of inherent error: counting statistics (same as the carbon dating which inspired it), and variation in lexical replacement rates. Of course lexical replacement sometimes happens fast and sometimes slow. But the distribution of these rates is probably universal, with similar means for, say, Austronesian and Bantu.

    I think this is the major flaw of glottochronology. Carbon dating is using a nuclear process that is happening slowly at a known average rate largely independent of the physical and chemical environment, and the unpredictability of single atoms is evened out on the large total number of carbon atoms. Lexical replacement is far from independent of changes in the physical, cultural and linguistic environment, but will be a direct result of it, so there’s reason to believe that it comes in waves. Since no two languages will ever be in the same environment, calculated rates can’t be transferred. Add to that that the lexical set used is small – by design, even.

    I’m more optimistic about statistical phylogenetics, but it would need to use phonological and morphological markers as characters. That’s not to say I’m sold on it. The tree model is good from a distance and with stylized sets, but it’s not meant for the kind of messy continuum that is characteristic of actual language communities. When biologists work with hundreds or thousands of species over maybe millions of generations, that doesn’t matter much, but linguists work with a handful of languages over at best a couple of thousand generations. Some split events probably happen abruptly; for others the time between the beginning and the end may be 10-50% of the age of the family. I have no idea what that means or how to account for it, but I know I wouldn’t try to use it to build a tree of dialects of German.

  334. Dmitry Pruss says

    Carbon dating uses an invariable decay rate, but still suffers from two problems it has in common with the comparative language dating. The starting state isn’t known, and admixtures influence the results.

  335. Trond Engen says

    True.

    (I considered mentioning the reservoir effect, but dropped it since I couldn’t think of a linguistic parallel. Maybe exposure to literature and oral recitation.)

  336. Dmitry Pruss says

    The Euphratic paper, understandably, deals with Proto-Kartvelian interactions, and, to a lesser extent, with potential early IE borrowings in Semitic and with the Mitanni corpus. I believe that all of these interinfluences belong to a much later age than the hypothesized urheimat somewhere in today’s Kurdistan? Are there any “really early” suspected borrowings between nascent IE and its presumed Middle Eastern neighbors?

    The paper also has totally cringeworthy genetic tidbits, like they always describe any genomic data as a “haplogram”, a collection of mtDNA or Y-haplotypes :/

  337. David Marjanović says

    they are discrediting Swadesh’s work

    They’re simply saying it doesn’t work – it’s of purely historical interest – and they’re doing so in a passage that reads like an exasperated reply to a nasty reviewer.

    This does mean that there was a linguist among the reviewers. 🙂

    does the paper list individual authors’ contributions?

    Yes; I didn’t try to decipher the list, though. Here it is:

    R.D.G. initiated and coordinated the study. P.H. and C.A. designed the IE-CoR database and data collection methodology and coordinated the linguistic coding team. M.Sc. oversaw all determination of cognacy at the deep Indo-European level. C.A., M.Sc., L.J., M.J.K., T.J., B.I., R.P., H.L., R.F.S., G.H., M.M., R.I.K., E.A., T.P., O.B., T.K.D.-F., M.B., C.F., R.T., M.Se., N.L., K.St., K.Sc., and G.K.G. were major contributors to the 25,918 lexeme and 5013 cognate determinations in the IE-CoR database. R.B., B.K., S.J.G., and D.K. conducted the phylogenetic analyses, with input from R.D.G., Q.D.A., P.H., and C.A. W.H. and J.K. advised on the aDNA data. P.H., R.D.G., D.K., B.K., and C.A. wrote the text. All authors commented on the manuscript.

    I think the main problem with these papers is that they’re sold on the wrong premise. They shouldn’t (yet) be understood as attempts to use Bayesian phylogenetics to understand the development of Indo-European but rather to use the fairly well understood history of Indo-European to develop Bayesian phylogenetics for historical linguistics.

    The early history of IE is not well enough understood for this: neither most of the relationships between the main branches nor the absolute chronology. That’s where this paper tries to help.

    If the problems I highlighted were fixed, maybe it actually could help, certainly to some extent. But I’d really like to see more conventional comparative-historical research on early IE phylogeny. It’s obviously a difficult problem, but I think there’s some circularity going on: people have been treating it as unsolvable and therefore not worth studying, or at least not a good PhD topic for a student who wants to finish in time; therefore it hasn’t been studied; therefore there’s no progress toward a solution; therefore the problem looks unsolvable. And that would still leave the chronology – though once you have a robust tree you can at least try to match it to archeology or aDNA and see if you can rule anything out.

    I’m tired of papers that intimidate linguists with mathematical terms.

    You can’t do science without math. Math is a description of, and generalization over, how physical objects behave. Scritta in lingua matematica and all that.

    I think I can reproduce “relaxed clock” with pen and paper:-/

    With a million pens and a billion sheets of paper and a billion years.

    Seriously, we’re talking about calculations that take a desktop weeks or months.

    Where and when was the ancestor of Proto-Romance being spoken? In particular, where was it when Classical Latin was the language of Empire?

    Right there in Rome: “Vulgar Latin” as opposed to the written/oratorial/… register. The paper takes pains to make clear that a split in the trees they get is not when languages become mutually unintelligible or anything like that, it’s when, or even a bit before, one difference appears in their which-word-for-which-meaning dataset.

    So far, so good. But their “Classical Latin” is defined as Caesar-and-Cicero as usual, and placed at 50 BC. This means that “Classical Latin” and “Vulgar Latin” must have separated – Latin must have developed the “Classical” register as far as the dataset is concerned – when Rome was still a kingdom, and both “Classical” and “Vulgar” must have gone through the whole “Old Latin” period in tandem, separately. I wonder if that isn’t stretching it.

    a known date of one protolanguage

    Not available. That’s why tip-dating was necessary.

    Except in some of their sensitivity analyses, Heggarty et al. did not assume that any language in their dataset is an ancestor of any other. They let the program sort this out (i.e. the program was free to find branches with zero length); they didn’t constrain it.

    The program found (Table S6.2) a posterior probability of 0.7191 that Mycenaean Greek was ancestral to all other forms of Greek in the dataset; 0.5006 that Classical Armenian was ancestral to modern Eastern + Western Armenian; 0.3878 that New Testament Greek was ancestral to all modern forms of Greek (they have quite a lot in their dataset); 0.3073 that “Ancient Greek” (mostly or entirely Classical Attic) was ancestral to New Testament + modern Greek; and 0.0024 that “Old English” (Early West Saxon!) was ancestral to Middle and modern English (i.e. mostly Anglian). The probability that any other language in their dataset is strictly directly ancestral to any other is given as 0 in the table.

    That’s a bit extreme, actually. For instance, Old Welsh, Middle Welsh and Early Modern Slovene are among these zeroes. But given how strict the definition of “ancestral” used here is – the slightest dialect or register difference that shows up in the dataset is enough for disqualification – it may not be absurd. Conversely, keep in mind that Mycenaean Greek has a lot of missing data in the dataset, so anything is hard to exclude.

    You know how many innovations what language has compared to what.

    Not in this case – the dating and the phylogeny are calculated in one step here.

    – matching some cross-linguistical average to your tree’s average.

    No, that’s not how this method works. It uses enough math to avoid such averages.

    Perhaps I misunderstand the terminology, but this seems staggeringly metaphysical.

    No, mathematical. There’s a birth-death model in the method; one of the relaxed clocks is for how often lineages split and for how often they go extinct. The parameters are calculated from the dates supplied for each tip – and they, too, come with probability distributions, AFAIK.

    So it is not possible, however closely we examine speakers today, to figure out whether they have split or not; there is literally no evidence one way or the other.

    Correct.

    Can then arise. So perhaps it may be the case that our presently split population will never materialize into an actual division between merged and unmerged speakers?

    Yes, but that is astronomically unlikely because it would require an improbably low rate of evolution along both of these branches.
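
    A back-of-the-envelope version of “astronomically unlikely”, as a Python sketch. It assumes that changes in the dataset arrive along each branch as an independent Poisson process with a constant rate; that is a simplification for illustration (the paper’s model lets rates vary between branches), and the rate and time span below are invented numbers. Under that assumption the chance that neither daughter branch shows a single change after time t is exp(−2·rate·t).

```python
import math

def prob_no_visible_divergence(changes_per_millennium, millennia):
    """P(zero changes on both daughter branches since the split).

    Assumes independent Poisson change with the same constant rate
    on each of the two branches (an illustrative simplification).
    """
    return math.exp(-2 * changes_per_millennium * millennia)

# Hypothetical: 20 dataset changes per millennium, 500 years after the split.
p = prob_no_visible_divergence(20, 0.5)
```

    With these invented numbers p is on the order of one in a few hundred million; only a rate very close to zero on both branches would keep a split invisible for long.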

    In other words, seen through the lens of this expression, there has been no split at all. But our authors seem to want to say that in fact there has been, it just hasn’t shown up yet.

    The model may or may not infer one, with a certain probability, if fed with future data after evidence for the split has shown up.

    Note that it is not obvious that one method of calibrating (known dates within the family or cross-linguistical average) is inherently better than the other.

    The former seems good when you know several dates within the family.

    In this case, you know a date with an error margin for every language in the dataset. (Not all of these error margins are equally wide. Those for Early Vedic and Young Avestan, which were orally transmitted for a long time before they were written down and we don’t know the oldest manuscripts anyway, are uncertain enough that there’s a sensitivity analysis with no dates for these two. It doesn’t change much.)

    Where would you get a cross-linguistic average from, and where would you get its error margins, its probability distribution, from?

    I’d say that cross-linguistical average is informative when there is wild variation within the family

    If there’s wild variation, the calculated rates of evolution will be wildly wrong for almost every branch (the terminal and the internal branches of course – “branch” meaning “internode”).

    one date within the family is informative when there is little variation within the family

    But you don’t know beforehand if that’s the case.

    You simply can NOT use this method of calibration for Cushitic.

    Correct, because no Cushitic language was sufficiently documented before the 20th century.

    So it looks like everything would have worked great if the selected cognates turned up in all languages.

    No, because that would make all the languages identical for the purpose of this analysis. The characters are “cognate set x is the basic word for meaning y: present / absent”, except for sensitivity analysis 10, where they are “basic word for meaning y: cognate set x / w / v / u”. Great pains were taken to narrow each language down to one state per character; although this wasn’t quite achieved, the process is described in the supplement at some length. One of the given examples is why small is more basic in English than little.

    In a model with multiple states, when a replacement happens, it counts not as one change but as many changes as there were characters, as if the word was replaced by systematically mutating each character, rather than in one act of horizontal transfer. For the reasons which I don’t understand, such would-be ultramutation events are counted towards the estimated change rate, instead of being discarded as outliers.

    When all characters are “basic word for meaning y: cognate set x / w / v / u”, each replacement of one cognate set by another for the meaning in question is one change. That was how sensitivity analysis 10 was done. When all characters are “cognate set x is the basic word for meaning y: present / absent”, as in all other analyses, each such replacement is two changes: one character changes from “present” to “absent”, another from “absent” to “present”.

    If the pissmire is forgotten and ant becomes the basic word, sensitivity analysis 10 counts that as one change: the character “basic word for ANT” changes states from “a cognate of mire” to “a cognate of ant”. The other analyses counted it as two: the character “a cognate of mire is the basic word for ANT” changes states from “present” to “absent” (one change), and the unrelated character “a cognate of ant is the basic word for ANT” changes states from “absent” to “present” (the other change).

    This way, all branches become twice as long in terms of numbers of changes that need to be modeled. What branch lengths in terms of time that translates to depends on the rates of evolution; those are calculated from the numbers of changes and the time calibrations. By playing with the calibrations, you can get the program to calculate rates that are twice as fast; that’s what Chang et al. (2015) accomplished. But if the calibrations are the same, the program has no reason to infer faster rates, and so it was entirely predictable that sensitivity analysis 10 got shallower dates than the rest.
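
    The one-versus-two counting for the ANT example can be checked with a small Python sketch. The function names and the dictionary layout are hypothetical; this is not the conversion code mentioned in the supplement, only an illustration of why the same replacement is one change under multistate coding and two under binary coding.

```python
def multistate_changes(before, after):
    """One character per meaning; its state is the cognate set in use."""
    return sum(1 for meaning in before if before[meaning] != after[meaning])

def to_binary(states, all_sets):
    """One presence/absence character per (meaning, cognate set) pair."""
    return {
        (meaning, cognate_set): int(states[meaning] == cognate_set)
        for meaning, sets in all_sets.items()
        for cognate_set in sets
    }

def binary_changes(before, after, all_sets):
    """Count binary characters whose state differs between two stages."""
    b0 = to_binary(before, all_sets)
    b1 = to_binary(after, all_sets)
    return sum(1 for key in b0 if b0[key] != b1[key])

all_sets = {"ANT": ["mire", "ant"]}
before = {"ANT": "mire"}   # pissmire is the basic word
after = {"ANT": "ant"}     # ant has taken over

n_multistate = multistate_changes(before, after)
n_binary = binary_changes(before, after, all_sets)
```

    Here n_multistate is 1 and n_binary is 2: the loss of the mire column and the gain of the ant column are counted as separate events.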

    Does that help?

  338. I think this is the major flaw of glottochronology.

    Of course it is. But the alternative is not to estimate dates at all.

    Lexical replacement is far from independent of changes in the physical, cultural and linguistic environment

    I intended to say that “relaxed clock” – a model based on an assumption that innovativeness of Romance is most likely to be similar to innovativeness of Latin – looks natural in the context of biology.
    Linguists first think about environment – contact etc – that triggers change.

    On the other hand, Semitic is in many ways conservative.

  339. Dmitry Pruss says

    Yes, thank you again, DM, you can see that I greatly misunderstood the multistate model.
    To try to recap, it should have as many values for each meaning as there are cognates in the entire set of languages. But each language may have only one value, as if there was only one way to express this meaning.
    The ANT example with two different cognates appearing haphazardly in Germanic languages seems to me to indicate that both words coexisted with similar meanings or in interconnected dialects, frequently occupying each other’s place without really going out of circulation. The multistate model would instead describe the situation as multiple changes in two opposite directions, each time obliterating one of the words, and exaggerate the pace of change. The languages without such back-and-forth changes will look comparably younger, and I guess that’s most of the extant languages and all of the ancient ones?

    The binary model would have many variables for each meaning, as many as there are distinct basic words for it. In the same ANT example, it would probably count two changes any time the word changed in the back-and-forth chain, so it would also overestimate the rate of change? In exactly the same fashion in extant and ancient languages? But then it shouldn’t give different results than the multistate approach. Yet it does, so I suspect that these changes are counted differently in the new and old languages…

  340. David Marjanović says

    Do they use phonological changes in the lexical set or only replacement rates?

    Neither phonological nor morphological characters are used. The supplement laments that there are so few* and that they can’t be used for dating because, while vocabulary may generally change like molecular data do, morphological systems can stay pretty stable for millennia (Lithuanian is given as an example) and then suddenly collapse. Also (I haven’t checked if that’s mentioned), giving each of the lexical characters they actually used the same weight is more or less reasonable, because they’re all basic vocabulary and numerous; how to weight lexical, phonological and morphological characters relative to each other is anybody’s guess at this point.

    * I don’t think that’s right. The ones relevant to early IE phylogeny are definitely underresearched, though.

    Carbon dating uses an invariable decay rate, but still suffers from two problems it has in common with the comparative language dating. The starting state isn’t known, and admixtures influence the results.

    The starting date (when continuous exchange with the atmosphere/hydrosphere ended) is exactly what carbon dating calculates.

    The Ephratean paper

    That’s not one of the papers that proposed the Euphratic substrate in Sumerian; those are all in English, and we’ve discussed them before – I’ll try to look for that tomorrow.

    potential early IE borrowings in Semitic and with the Mitanni corpus. I believe that all of these interinfluences belong to a much later age than the hypothesized urheimat somewhere in today’s Kurdistan? Are there any “really early” suspected borrowings between nascent IE and its presumed Middle Eastern neighbors?

    That is the question.

    Wine has an IE-internal etymology (it’s a root cognate of wind), yet it’s all over Semitic.

    “7” is suspiciously similar in much of western Eurasia, and seems to have had religious significance in early Semitic-speaking cultures, so it’s plausible as a Wanderwort, though there’s no good proposal of how that worked exactly.

    I’m pretty sure Gamq’relidze & Ivanov postulated a bunch of others. But their work has largely been ignored, probably because they also proposed (at least in the 1980s version; haven’t checked the 2013 paper) a bunch of new phonemes for PIE.

  341. David Marjanović says

    The ANT example with two different cognates appearing haphazardly in Germanic languages seems to me to indicate that both words coexisted with similar meanings or in interconnected dialects, frequently occupying each other’s place without really going out of circulation. The multistate model would instead describe the situation as multiple changes in two opposite directions, each time obliterating one of the words, and exaggerate the pace of change. The languages without such back-and-forth changes will look comparably younger, and I guess that’s most of the extant languages and all of the ancient ones?

    Yes. The trick is that the software used for the multistate dataset (sensitivity analysis 10) is unable to reconstruct ancestors as having had synonyms. That’s why it has to assume all these multiple back-and-forth changes. More changes per branch between modern languages, but more consistency among older languages, coupled with unchanged dates for each attested language, mean higher rates of evolution for the terminal branches, and that means they’re shorter in time, so the splits between them are pulled toward the present. The software used for the binary-characters-only dataset reconstructs polymorphic ancestors just fine and therefore doesn’t end up reconstructing far too young ages in messy families like West Germanic or Slavic.

  342. an inability to deal with the existence of synonyms seems like a disqualifying problem for any useful representation of language change to me (which could be my anglophony at work, since english is so absorptive). but it sure feels like a way to introduce a whole layer of invisible errors, especially for extinct languages with small corpuses, and especially with a definition of language “split” as one not-necessarily-realized-or-documented change in one word.

    (i’m thinking about the incoherence that would result from applying this to the soda/pop/coke/tonic divide* in u.s. english’s “sweetened carbonated drink” term [a contemporary basic comparable to “beer”, “wine”, “soup”, and other such common generics], if the available sources were reduced to the quantity we have from a rarely-written ancient language. i think it’s sensible to assume that these kinds of variations are very common, and an approach that can’t handle them hardly seems useful.)

    .
    * as an example that roughly correlates with other differences among regional lects, but largely ones that are understood by speakers as synonym preferences within a shared language.

  343. >The ANT example with two different cognates appearing haphazardly in Germanic languages seems to me to indicate that both words coexisted with similar meanings or in interconnected dialects, frequently occupying each other’s place without really going out of circulation.

    This seems the core problem with setting a definite date of divergence. Languages/dialects/registers go on in conversation with each other, sometimes for decades. Hypothetically, if in 563 BC, languages a, b and c (Classical, Vulgar and a Third Latin which came to be spoken by the descendants of an island in the Atlantic…) diverged, they would get the same divergence date in this paper, though what I’ll call “Insular Atlantic Latin” speakers hadn’t taken books with them and never heard another word of the mother tongue, while Classical Latin and Vulgar Latin went on sharing changes, alternately emphasizing and downplaying synonyms for centuries, a process that only scaled down significantly when the geographic scope of the two languages changed as the Republic expanded, and perhaps could be said to end only when Vatican II radically diminished the likelihood that Church Latin might influence fads of diction in the Romance vernaculars.

  344. Dmitry Pruss says

    The software used for the binary-characters-only dataset reconstructs polymorphic ancestors just fine and therefore doesn’t end up reconstructing far too young ages in messy families like West Germanic or Slavic.

    I understand that the problem with falsely implied ancestral monomorphism disappears in the binary model. But I don’t yet understand why it is any better for the estimated rate of change. In the ANT example, it would have separate values for both ant-words, but wouldn’t they flip from 1 to zero and from 0 to 1 every time one of these words is replaced by the other? Leaving the reconstructed ancestor fully polymorphic (1,1) but the descendant languages just as falsely monomorphic as in the multistate model?

    I’m pretty sure Gamq’relidze & Ivanov postulated a bunch of others
    Scrolling through the paper I saw just one small section listing a bunch of those, which didn’t strike as convincing actually. But if there is a need, then please let me know, I can go back to the PDF and just copy and paste the relevant passages.

  345. … synonyms for ANT in Germanic languages …

    @DM The trick is that the software used for the multistate dataset (sensitivity analysis 10) is unable to reconstruct ancestors as having had synonyms. That’s why it has to assume all these multiple back-and-forth changes.

    So it’s mis-representing the amount of ‘churn’? And risks failing to follow cognates? What if one of the synonyms drops out of use in one/some languages, but the other synonym drops out of use in others?

    I take @rozele’s point that non-English languages (especially pre-widespread literacy) aren’t so generously burdened with synonyms, so I looked at Polynesian languages for such a common [**] meaning as big[***]/large.

    NUI

    LASI/Rahi

    seem to be stable and persistent across all branches of Polynesian.

    [**] ‘Big’ appears on all SWADESH lists AFAICT.

    [***] Which sidelined me into ‘big’ being actually not stable or persistent across Germanic. It appears out of nowhere in Northern English c.1300 — mysterious like ‘dog’. And ‘large’ ain’t Germanic either. “Old English used micel (see much) in many of the same senses.” The Germanic comparand is ‘Great’/’groß’ — “perhaps from PIE root *ghreu- “to rub, grind,” via the notion of “coarse grain,” then “coarse,” then “great;” but “the connextion is not free from difficulty” [OED]” — quoted in etymonline.

  346. micel is the synonym that actually goes back to PIE, it’s related to Greek megalo-, Latin magnus, etc. So it and great must have coexisted for a time, perhaps with some different nuances of meaning or some difference of register, as both are also attested in other Germanic languages. In Modern German, the cognate to micel only survives in place names, of which Mecklenburg (Low German form) is probably the best known.

  347. Trond Engen says

    ON mikill survives in Insular N.G. and as the adverb/mass noun modifier e.g. Sw. mycket, Da. meget. It’s also fossilized in placenames like No. Myklebust and Da. Maglemose.

    The unmarked Continental N.G. adjective for “big” is stor < PIE *sth₂-ró- “big, old, strong”, which certainly must have been a synonym for a long time.

    Yet another synonym with a long pedigree is ON digr “(extra) large, big in volume or extent”.

    The closest there is to a cognate of great in North Germanic is e.g. Nyn. graut, Da. grød “porridge”.

  348. the cognate to micel only survives in place names

    Same in English: Micklegate in York (‘gate’ = ‘Gasse’ = street); Micklethwaite town near Bingley; Mickle Fell in the Pennines, highest point in Yorkshire (proper boundaries).

    Oh, and the (Scots?) dialect mumbo-jumbo “Many a mickle makes a muckle”.

  349. David Marjanović says

    I understand that the problem with falsely implied ancestral monomorphism disappears in the binary model.

    The model itself has nothing to do with it. The software used with the binary dataset happens to be able to reconstruct polymorphism, the add-on programmed in-house for the multistate dataset happens to be unable to do that. That’s all.

    As I said: with a flawed program (the add-on for the multistate dataset) they got a flawed tree (some nodes younger than known attestations of their descendants), so they used a flawed dataset (purely binary) instead. The tree they got is probably also flawed, but it isn’t known to be outright false, so they published it…

    So it’s mis-representing the amount of ‘churn’?

    Exactly. The add-on for the multistate dataset has to assume more churn along some lineages.

    In Modern German, the cognate to micel only survives in place names, of which Mecklenburg (Low German form) is probably the best known.

    I’m currently in hiking distance from a Micheldorf (which was until recent times the largest village in the valley). It is thought that this word is why the name Michael, formerly universal nickname Michel, became so popular – and I suspect that this popularity is actually what drove the word to extinction. (“‘Mike’? What’s ‘Mike’ about this? …And why does it look like a diminutive?”)

    stor < PIE *sth₂-ró-

    More like a twice- or thrice-derived *stah₂-u-ró-. Mahlow’s law: *au vowel clusters become Proto-Germanic *ō in underresearched environments, otherwise they turn into *aw.

    German stur “stubborn/pigheaded”.

  350. @Trond Yet another synonym with a long pedigree is ON digr

    Thanks Trond, etymonline (link above) conjectures “[‘big’] possibly from a Scandinavian source (compare Norwegian dialectal bugge “great man”). ”

    My highly limited Norw sources don’t even know the word. “From Old Norse buggi. “ says wiktionary. But of course that doesn’t prove transmission via longboat. ‘Northern English’ is at least the right place to appear, but there’s some 250 years’ gap to explain.

  351. it isn’t known to be outright false, so they published it…

    That’s an… interesting way to do scholarship.

  352. Dmitry Pruss says

    they used a flawed dataset (purely binary) instead
    The binary dataset correctly tracks synonyms as far as I can see. If both are present then the respective values are 1,1 and if only one is attested, it’s 1,0 or 0,1.
    What the dataset apparently fails to represent is the fact that both of these cognates correspond to the same meaning, and it probably leads to a substantial negative correlation between the respective binary values. That is, there will be more 1,0s and 0,1s than expected by chance? Perhaps it’s fraught with severe consequences but I don’t really see why it’s such a problem. Sure, due to this anticorrelation, a change of the word usage will be frequently misrepresented as TWO changes. But if it happens with similar frequency across all the languages, then isn’t it going to be calibrated away?
    Yet the most problematic result obtained under the binary model is the overestimated age of many splits. IMVHO for that to happen, the affected languages must possess unusually high numbers of replacements of one synonym by another, compared to the languages with known dates used for the calibration. Is it really the case? Can you please explain why?

    Intuitively, it seems to me that almost all changes in the binary model will be replacements of one synonym by another, and therefore, almost all changes will be counted as 2. Am I wrong? Are there any situations where a word for a given meaning just disappears without a replacement by anything else in the dataset? Or reappears out of nowhere? Which would count as 1 rather than two? Perhaps in the recent languages, simultaneous presence of multiple synonyms is much more frequently implied and that’s why the “1 changes” are disproportionately common there???

  353. ” with a flawed program (the add-on for the multistate dataset) they got a flawed tree”

    Likely it is a consequence of representation. The binary format lets you encode synonyms:
    100100
    010100
    010001
    you can turn it in 110100.
    The multistate format for the same is
    AA
    BA
    BC
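
    The correspondence between the two formats above can be sketched as follows. The sketch assumes two meanings with three possible cognate sets (A, B, C) each, which is what the six binary columns suggest; the function name is hypothetical.

```python
def to_binary(row, states="ABC"):
    """Encode one language as present/absent columns.

    `row` holds one entry per meaning. An entry is either a single state
    letter (multistate coding) or a collection of letters (synonyms).
    """
    return "".join("1" if s in cell else "0" for cell in row for s in states)


# The three multistate rows map onto the three binary rows.
print([to_binary(row) for row in ["AA", "BA", "BC"]])
# ['100100', '010100', '010001']

# A language with synonyms A and B for the first meaning fits the binary
# format, but has no single-letter multistate equivalent.
print(to_binary([{"A", "B"}, {"A"}]))  # 110100
```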

  354. Trond Engen says

    David M: More like a twice- or thrice-derived *stah₂-u-ró-.

    Quite possibly. Gmc. does at least show *steh₂-ro-.

    I’ve also thought of the semantic identification of St. Michael with mikill. The identification of Augustinus with the name Eysteinn is attested. Also forms of Severinus and Sigurðr seem to have merged.

    @AntC: ON buggi “rich/powerful man” does exist, but ME bigge would rather seem to be < ON **byggi, which I immediately would interpret as a formation to byggja “settle/live (with/in)”.

    I guess the suggested derivation might instead presuppose an unattested homonym **byggja v. “make swell” < **bug(r) “swollen, swelling”, parallel to hyggja “come to think of” < hugr “mind”. An indication that the parallel existed would be Mod. Da. bugne “swell, bulge”, Norw. bugne “abound” and ON hugna “be agreeable, like (dep.)”,

  355. Trond Engen says

    … and that’s a pretty weak indication.

  356. David Marjanović says

    That’s an… interesting way to do scholarship.

    Isn’t it.

    What the dataset apparently fails to represent is the fact that both of these cognates correspond to the same meaning

    Correct.

    Yet the most problematic result obtained under the binary model is the overestimated age of many splits. IMVHO for that to happen, the affected languages must possess unusually high numbers of replacements of one synonym by another, compared to the languages with known dates used for the calibration. Is it really the case?

    No. The potential source of error I can see is simply that every change is counted double, meaning most branches are too long in terms of number of changes. Given the same tip dates, that should exaggerate the dates of all splits systematically.

    If enough node dates are constrained, this problem is masked: that’s what Chang et al. (2015) effectively did. But here, the only calibration date that isn’t a tip date is a maximum age of 10,000 B2K for everything.

  357. Dmitry Pruss says

    here, the only calibration date that isn’t a tip date is a maximum age of 10,000 B2K for everything

    but some tip dates are millennia old, right? Often, it would be a very lopsided tree fork, with a short branch leading to an extinct language and a longer one to the extant ones. “Short” and “long” in terms of time since the split node. Scaling the entire tree by using an unrealistic ratio of number of changes to the change rate will run into date constraints with such ancient tips. So my gut feeling is that simply ~doubling the number of changes between languages in a context where tip dates span a wide period will result in effectively ~halving the rate of change, to satisfy the date constraints at the tips. Additional constraints at the nodes wouldn’t even be necessary.

    And only if *most* of the branches suffer from the double-count of changes, but some of the more informative branches (like the ones with the tip dates wide apart) don’t, then we have a problem?

  358. David Marjanović says

    So my gut feeling is that simply ~doubling the number of changes between languages in a context where tip dates span a wide period will result in effectively ~halving the rate of change, to satisfy the date constraints at the tips. Additional constraints at the nodes wouldn’t even be necessary.

    Evidently that’s not what happened; the tip ages don’t prevent the node ages from moving around.

    And only if *most* of the branches suffer from the double-count of changes, but some of the more informative branches (like the ones with the tip dates wide apart) don’t, then we have a problem?

    Yes, but all branches should suffer equally from this, except that the loss of one synonym is only one change either way.

    (Sorry for my short explanations today. The weather is extremely tiring.)

  359. Trond Engen says

    David M.: Neither phonological nor morphological characters are used. The supplement laments that there are so few* and that they can’t be used for dating because, while vocabulary may generally change like molecular data do, morphological systems can stay pretty stable for millennia (Lithuanian is given as an example) and then suddenly collapse.

    That’s why I asked for phonological changes. I’d expect phonology to be less responsive to shifts in the environment than lexicon. Makeover or collapse of a morphological system could be driven by external forces like massive language replacement or bilingualism. Arguably it’s lexical replacement on steroids.

    Also (I haven’t checked if that’s mentioned), giving each of the lexical characters they actually used the same weight is more or less reasonable, because they’re all basic vocabulary and numerous; how to weight lexical, phonological and morphological characters relative to each other is anybody’s guess at this point.

    Constant lexical replacement rates is a huge and unjustified assumption. If one can’t take rates calculated in one language family, in one cultural system and one era, and use it on any other family, contemporary or in the distant past, why would it be acceptable to take rates calculated for one branch of a known family and apply to others, or for the recorded history of a language and apply to the reconstructed pre-history? Those are also set in very different cultural environments and have different language-internal constraints. Wouldn’t it be better to let the model weight characters based on calculated individual rates of replacement? With increasing lexical sets and increasingly sophisticated models, that should even allow calculation of different weights for different periods and branches, and it ought to be possible to set error bars based on variability (and co-variability) for any subset and the whole tree.

    (Prediction: The error bars on dates based on lexical replacement would turn out to be ridiculously large. But that would also be a useful result.)

  360. Constant lexical replacement rates is a huge and unjustified assumption.

    That’s what’s always bothered me about glottochronology.

  361. “Evidently that’s not what happened; the tip ages don’t prevent the node ages from moving around.”

    Evidently yes. I would put it differently:
    (1) having two changes instead of one results in a scaling, but not in a distortion of proportions of the tree.
    (2) calibration of two trees with identical proportions must give identical dates. By definition of calibration.

    So if two changes instead of one affect the tree ANYHOW it must be tree topology or distribution of changes.
    Which is very strange.
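
    Points (1) and (2) can be checked with toy arithmetic. This is only a sketch under a strict clock with one branch of known duration, which is an assumption made for illustration and not the tip-dated relaxed-clock setup of the paper; the function name and all numbers are hypothetical.

```python
def date_split(changes_unknown, changes_calibrated, years_calibrated):
    """Date a branch from a rate learned on a branch of known duration.

    rate = changes_calibrated / years_calibrated (changes per year), so
    the estimated age is changes_unknown / rate.
    """
    return changes_unknown * years_calibrated / changes_calibrated


# Single counting: 20 changes over a known 2000 years, 30 on the unknown branch.
single = date_split(30, 20, 2000)
# Double counting: every change is counted twice on both branches.
double = date_split(60, 40, 2000)

print(single, double)  # 3000.0 3000.0
```

    The factor of two cancels because it scales the calibrated and the uncalibrated branch alike. Any difference in dates therefore has to come from somewhere else, for example from branches on which the counts are not scaled uniformly.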

  362. David Marjanović says

    That’s why I asked for phonological changes. I’d expect phonology to be less responsive to shifts in the environment than lexicon. Makeover or collapse of a morphological system could be driven by external forces like massive language replacement or bilingualism. Arguably it’s lexical replacement on steroids.

    But likewise, phonological systems can be partially or entirely replaced by those of a substrate.

    The complete lack of affricates (except the special case /kʂ/) and voiced fricatives, together with the presence of palatal plosives, in Sanskrit looks awfully Dravidian.

    The loss of consonant length and replacement of the fortis-lenis contrast by a voice contrast in Carinthian German is obviously Slovene.

    Wouldn’t it be better to let the model weight characters based on calculated individual rates of replacement?

    Actually, I have to check if that’s done. Molecular datasets are routinely divided into rate categories, and both which characters go into which category and what the rate (for each branch!) of each category is is calculated from the data.

  363. David Marjanović says

    (1) having two changes instead of one results in a scaling, but not in a distortion of proportions of the tree.

    Correct.

    (2) calibration of two trees with identical proportions must give identical dates. By definition of calibration.

    Then tip-dating isn’t calibration. 😐 Only the attested languages are anchored in time in this paper; the nodes are not, and the root is only given a distant maximum of 10,000 B2K which is not reached.

    That’s what’s always bothered me about glottochronology.

    In molecular dating the assumption of a single constant rate was abandoned in the 1990s. Separate rates are calculated from the data for each branch and for each rate category of characters as I just mentioned.

    That’s why more than one calibration date is required!

  364. Trond Engen says

    David M.: But here, the only calibration date that isn’t a tip date is a maximum age of 10,000 B2K for everything.

    How do they avoid assuming the conclusion? The suggested age is suspiciously close to that. Say for simplicity that the model makes the most rugged path from one tip of a branch to the common root exactly 10 000 years long, and every time two branches are merged, the tree is shortened a little.

  365. Trond Engen says

    David M.: In molecular dating the assumption of a single constant rate was abandoned in the 1990s. Separate rates are calculated from the data for each branch and for each rate category of characters as I just mentioned.

    Yeah. But is it feasible with reasonable error bars for lexical data sets?

  366. “That’s what’s always bothered me about glottochronology.”

    @Trond, @LH, no.

    I really mean it. It is not so wild an assumption.

    Moreover, “relaxed clock” is also based on some assumption. Namely something like a random walk.
    It is a model too, and possibly even worse – or better. No way to know.

    Theoretically: of course some cross-linguistical average exists:-)

    I could write something long and philosophical, but imagine that linguists are betting on how many lexemes will change in some (obscure) language in the next 200 years. They WILL come up with some probability distribution. It WILL have some mean, median, maximum (perhaps more than one) etc.

    The question is only how wide it will be!

  367. Trond Engen says

    Everything (well, everything but a Cauchy distribution) has a mean. But for lexical replacement the mean itself will be defined by a stochastic process, with the auto-covariance defined by another stochastic process, and different lexical sets will have different means and variances, and covariances inside and outside the set, which all will be defined by stochastic processes. Even the definition of sets will be a stochastic process. These processes may be co-dependent, but to a degree that is itself a stochastic variable. All this may well be within the reach of some Markov type model, but how useful will it be? I guess another way to express my pessimism is that I don’t think it will converge to a stationary process for the intervals we want to use it for.

  368. @Trond, sorry, I was sleepy and did not read your whole paragraph, so I thought that, like the authors, you contrast their method with glottochronology (here I disagree).

    It seems you don’t and have reservations about both.

  369. David Marjanović says

    How do they avoid assuming the conclusion? The suggested age is suspiciously close to that.

    I had indeed overlooked that, in the publicly accessible figure ( = fig. 2 of the paywalled paper), the posterior distribution of the root-node age “ends” precisely at 10k B2k. However, the authors had not:

    7.3 VARYING THE PARAMETERIZATION OF THE TREE PRIOR: CONDITIONING ON THE ORIGIN (SA3) We further tested the robustness of our results by varying aspects of the parameterization of the tree prior that are in principle open to alternative approaches, as covered here in §7.3, and in §7.4 below.

    In the main analysis we conditioned on the origin (the beginning of the root branch), as also in previous published analyses (12). This entails that we had to set an upper bound for the maximum age of Indo-European, which in our main analyses was set to 10,000 years. To assess whether this upper bound played a role in the date estimates, we applied a different parameterization without the origin boundary. In this sensitivity analysis SA3, we conditioned on the root (the first branching event) instead, and the result was an older average root age, with more uncertainty: 8900 BP (6750-11700 BP).

    (7.4 is about the tree prior itself; messing with it changes very little.)

    In short, you have a good point.

    It’s not quite as bad as it could be, because two of the dated languages are found to have a probability > 0.5 to be ancestors, so their ages are effectively maximum ages for the clades formed by their descendants (post-Bronze-Age Greek and modern Armenian). But still.

    I’ve actually published on the importance of maximum ages in molecular dating… if you don’t have enough, lots of nodes will come out too old.

    In molecular dating the assumption of a single constant rate was abandoned in the 1990s. Separate rates are calculated from the data for each branch and for each rate category of characters as I just mentioned.

    Yeah. But is it feasible with reasonable error bars for lexical data sets?

    Why would it be any harder than for molecular ones?

    Moreover, “relaxed clock” is also based on some assumption. Namely something like a random walk.

    A strict clock is a random walk. A relaxed clock is more like a random walk of random walks.
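
    A small simulation can make the difference concrete. This is a sketch under stated assumptions: replacements on a branch are modelled as a Poisson process, and the relaxed clock draws an independent lognormal rate multiplier for each branch (one common choice, not necessarily the one used in the paper). The function name, the rates and the branch lengths are hypothetical.

```python
import random


def simulate_branch_changes(branch_years, base_rate, relaxed, rng, sigma=0.5):
    """Simulate the number of lexical replacements on each branch."""
    counts = []
    for years in branch_years:
        if relaxed:
            # Assumption: each branch draws its own rate; the multiplier is
            # lognormal with median 1, so the median rate is the base rate.
            rate = base_rate * rng.lognormvariate(0.0, sigma)
        else:
            # Strict clock: every branch shares the same rate.
            rate = base_rate
        # Poisson process: accumulate exponential waiting times.
        elapsed, changes = rng.expovariate(rate), 0
        while elapsed < years:
            changes += 1
            elapsed += rng.expovariate(rate)
        counts.append(changes)
    return counts


rng = random.Random(1)
branches = [1000, 1000, 1000]  # three branches of equal duration, in years
print("strict: ", simulate_branch_changes(branches, 0.01, False, rng))
print("relaxed:", simulate_branch_changes(branches, 0.01, True, rng))
```

    Under the strict clock the counts scatter only through Poisson noise around rate × time; under the relaxed clock the per-branch rate adds a second layer of randomness on top.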

    I guess another way to express my pessimism is that I don’t think it will converge to a stationary process for the intervals we want to use it for.

    Oh, the programs for Bayesian inference let you check if convergence has been reached. If it hasn’t, you run your analysis for longer. (Or, in very rare cases, you eventually give up and try a different dataset.)
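
    As an illustration of what such a check can look like, here is a sketch of the Gelman-Rubin statistic, one widely used diagnostic that compares several independent MCMC chains sampling the same parameter. Whether the software used for the paper reports this particular statistic is an assumption not confirmed here; the function name and the sample values are hypothetical.

```python
from statistics import mean, variance


def gelman_rubin(chains):
    """Potential scale reduction factor for equally long chains.

    Values close to 1 suggest the chains agree; values well above 1
    suggest the analysis should be run for longer.
    """
    n = len(chains[0])
    chain_means = [mean(chain) for chain in chains]
    within = mean(variance(chain) for chain in chains)  # average within-chain variance
    between = n * variance(chain_means)                 # between-chain variance
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


# Hypothetical root-age samples (years BP) from two chains.
agreeing = [[8100, 7900, 8300, 8000], [8200, 7800, 8100, 8050]]
disagreeing = [[8100, 7900, 8300, 8000], [6100, 5900, 6300, 6000]]
print(gelman_rubin(agreeing))
print(gelman_rubin(disagreeing))
```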

  370. @DM, but what does “fixed” mean?

    What you have is a list of related languages that have accumulated different numbers of innovations.
    How exactly you apply fixed clock here?

  371. Let it be two languages. One innovated 5 words since their split, the other innovated 1.
    Our actions?

    Or are we speaking about algorithms that do reconstructions (and which can decide that both languages innovated three words each)?

  372. These are two different tasks.

    One situation is when we are given data on divergence between languages in a family. You can use this data to build a phylogeny and a reconstruction (of meaning-cognate pairs in the proto-language).
    And in this case you can assume a fixed clock within the family.

    The other situation is when you have phylogeny and language A innovated 5 words and B innovated 1 word, period. You can use information about other families to calibrate the date of split. E.g. just the average. But this mean (alongside with other characteristics of distribution) is not “fixed clock”.

    Moreover, it exists objectively.

  373. David Marjanović says

    @DM, but what does “fixed” mean?

    The word “fixed” only appears twice in this thread before your comment. The first is a quote from the paper about “fixed calibration points”, the second is from me and means “repaired”, “set right”.

    So what do you mean?

    How exactly you apply fixed clock here?

    You don’t. That’s what the “relaxed clock” stuff is all about. It’s also what the angry statement in section 1.3 of the supplement – that Bayesian tip-dating “is not lexicostatistics, and it is not glottochronology” and “has nothing to do with those outdated and widely discredited techniques” – is about.

  374. Trond Engen says

    drasvi: It seems you don’t and have reservations about both.

    Yes, but maybe not exactly the same and to the same degree. And I do think it’s a worthwhile effort. We just have to keep in mind that the results won’t be helpful until the methods are good enough (if they ever will be). I’ll maintain that what’s going on now is development of methods with IE as a test case. It’s just too bad that it has to be framed as solutions to IE problems.

    David M.: I’ve actually published on the importance of maximum ages in molecular dating… if you don’t have enough, lots of nodes will come out too old.

    That sounds … avoidable. Wouldn’t it be possible to calibrate unknowns after knowns?

    Why would it be any harder than for molecular ones?

    Short timespans and small data sets. But I really don’t know what I’m talking about.

    Oh, the programs for Bayesian inference let you check if convergence has been reached. If it hasn’t, you run your analysis for longer. (Or, in very rare cases, you eventually give up and try a different dataset.)

    I’m not sure if we’re talking about the same convergence. I’m not even sure that I’m talking about the same convergence. First, I imagine that a hypothesis for a stochastic process that is itself the result of interlocked stochastic processes would be complex, and that it would take lots of data over long timespans for the inference to converge. Or maybe rather that the number of prior hypotheses is so big that it will take lots and lots of data to converge on one. Second, I imagine that if the calculations eventually converge on a stationary process and an expectation value (and higher order moments) for lexical replacement can be calculated, it will be variable on too long timescales to be used for dating of branches. Measuring the ripples and forgetting the tides.

  375. @DM, it was your strict clock, and I MEANT to type so. I don’t know how it mutated into “fixed”:(

    (There is a term ‘fixed local clock’, but I doubt my subconscious meant it)

    You don’t.

    Agree!
    Because it is physically impossible.

  376. It’s also what the angry statement in section 1.3 of the supplement

    This statement is what people do when they face a stigma. “I’m not a whore, Masha is a whore, I’m temporary wife!”

    But it is nonsense, and they confuse two unrelated things:

    Strict clock (an alternative to relaxed clock) and using cross-linguistical observations to calibrate a tree. The latter is complementary to relaxed clock.

    For IE you could choose between calibrating based on “cross-linguistical” observations and calibrating based on dates within the tree…. But when your “cross-linguistical” observations are observations about Indo-European – the two approaches are strictly identical.

    For most other families, you simply can’t feed your relaxed clock with “dates within the tree” because no dates. So you will feed the distribution from IE, Semitic etc.

    They are complementary.

  377. David Marjanović says

    Wouldn’t it be possible to calibrate unknowns after knowns?

    Which knowns? In the fossil record, absence of evidence isn’t very often evidence of absence. (Sometimes it is good enough, and I’ve published on that.) With languages, there are fewer cases because writing is such a young technology.

    Agree!
    Because it is physically impossible.

    By no means is it physically impossible. Doing it will just give you wrong results in most cases.

    This statement is what people do when they face a stigma.

    That says nothing about whether the accusation is factually true if we ignore the stigma. The methods used in this paper are really not similar to lexicostatistics or glottochronology.

    But it is nonsense, and they confuse two unrelated things:

    They don’t. You sound confused. I’m not sure what you’re confused about.

    using cross-linguistical observations to calibrate a tree

    A rate is not a calibration; you use calibrations to get rates. If you think a single rate fits the entire tree and you think you know this rate already, you don’t need any further calibrations.

    For most other families, you simply can’t feed your relaxed clock with “dates within the tree” because no dates. So you will feed the distribution from IE, Semitic etc.

    Of course not. If all tip ages are 0, you have to calibrate some nodes with very high confidence; and if you can’t do that, well, garbage in, garbage out. Wake me up when somebody tries…

  378. “By no means is it physically impossible”

    Again, there are two languages (currently spoken). One disagrees with the common proto-language in one word, the other in five.

    Use strict clock to obtain wrong results. I don’t see how this is possible logically, technically, anyhow. And saying that the flaw of some “glottochronology”, whatever they mean by this (apparently not the same thing as I do – for I understand “glottochronology” as dating languages based on statistics, especially based on retentions/innovations in basic vocabulary, but this is just a particularity of glottochronological methods common today. This paper absolutely is glottochronology – as I understand and always understood the word) is that they are preoccupied with something which is logically impossible to be preoccupied with is…. Crazy.

    Once you have phylogeny and reconstruction – and this is a common situation – your data contains both conservative and innovative languages.

    No matter what method you are using to date them, you have to face the difference.

  379. What you can do is:
    (1) move nodes up and down in relative time units.
    This won’t eliminate the variation, but it can minimise some function of variation.
    (2) equate your relative time units to a certain amount of actual years.

    Only in step 2 you can (if you like) use some mean obtained from other languages.

    In step 1 you can also use (if you like) other characteristics of the distribution obtained from other languages.
    IF you have real dates for more than one node in the tree, you also can use their relative difference during the step 1.

  380. “The methods used in this paper are really not similar to lexicostatistics or glottochronology.”

    It is directly based on it!

    You accurately described the situation with lexicostatistics above. Indeed, methods used by linguists are very simple.

    G. Starostin compiles a list of cognates for an African family, derives the % of shared words between pairs and… thinks. Other considerations are linguistical rather than statistical.
    G. Starostin actually compared the ratio of shared cognates in the upper part of the list (the most basic and stable words) and in the lower part (less stable) – the expectation is that if similarity is due to contact, then the ratio will be smaller. But he did not team up with his brother (a programmer) to devise something super-duper complex and make a program. Partly I suppose just because a lot of preparatory linguistical work must be done before complex algorithms become reliable.

    Biologists meanwhile use something more sophisticated for phylogeny. Nevertheless these sophisticated methods are improvement of simplified methods rather than “correct” idea replacing “wrong” idea.

    This paper is even directly based on Swadesh’s. They could independently come up with idea of applying statistical methods to infer phylogeny (lexicostatistics) or dating (glottochronology).

    But not to the idea of measuring exactly innovations/retentions in “basic vocabulary”. Here they follow Swadesh’s paper that they name as representative of “discredited” g.
    Besides they use Swadesh lists (they united 207 words from Swadesh’s lists with 100 items for L-J, obtained 235, eliminated those that don’t suit their purposes, got 170. Their shared property they say is “stable”. Swadesh in that publication has 165, again, “stable” and present in questionnaires).

    So respect your predecessors, say thank you, and say you can now do it better, because you have a computer. No, oops. “widely discredited”.

  381. David Marjanović says

    It is directly based on it!

    No!

    Emil Zuckerkandl had no idea of lexicostatistics in general or glottochronology in particular when he invented molecular dating in 1962! Even though he used a strict clock!

    This is a matter of actual history. Lexicostatistics & glottochronology was safely compartmented in “the humanities”, and molecular dating in “the sciences”, and both lived in complete ignorance of each other until 2012 when two molécularistes applied molecular dating to languages and all hell broke loose.

    Use strict clock to obtain wrong results. I don’t see how this is possible logically, technically, anyhow.

    The protolanguage will simply be reconstructed wrongly in the process.

    This paper absolutely is glottochronology – as I understand and always understood the word

    You misunderstand the word.

    No matter what method you are using to date them, you have to face the difference.

    Again: the program can, and does, infer a separate rate of evolution for each and every branch, including both the terminal branches and the internal branches ( = internodes).

    No, oops. “widely discredited”.

    Glottochronology means using a strict clock – a single average rate of evolution for the entire tree. Lexicostatistics, including glottochronology, means the data matrix is used only to calculate a distance matrix, and then the topology and (in glottochronology) the branch lengths are calculated from the distances, not directly from the characters. These are two very big differences. They lead to dramatically different results in many cases. As far as 21st-century molecular dating is concerned, glottochronology is a separate invention of the square wheel.

    Did you read the supplement? It’s in open access.
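
    To make the “distance matrix” step concrete, here is a minimal sketch of how a lexicostatistic analysis collapses a character matrix into pairwise distances before any tree is built. The language names, the cognate codes, and the helper names `shared_cognate_fraction` and `distance_matrix` are all invented for illustration; nothing here is taken from the paper or from any published dataset.

```python
from itertools import combinations

# Hypothetical toy data: for each language, the cognate class of each meaning.
# None marks a meaning with no attested word. All codes are invented.
cognate_classes = {
    "A": [1, 1, 1, 1, 1],
    "B": [1, 1, 2, 1, 1],
    "C": [1, 2, 3, 2, None],
}


def shared_cognate_fraction(x, y):
    """Fraction of meanings attested in both lists that share a cognate class."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if not pairs:
        raise ValueError("no meanings attested in both languages")
    return sum(a == b for a, b in pairs) / len(pairs)


def distance_matrix(data):
    """Pairwise distances, defined here as 1 minus the shared cognate fraction."""
    return {
        (p, q): 1.0 - shared_cognate_fraction(data[p], data[q])
        for p, q in combinations(sorted(data), 2)
    }


distances = distance_matrix(cognate_classes)
for pair, d in sorted(distances.items()):
    print(pair, round(d, 3))
```

    Once the matrix is computed, the information about which meanings differ is gone: A and B both end up equally distant from C, although they differ from it in different ways. A character-based method keeps the full table and so can still use that information.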

  382. a separate rate of evolution for each and every branch

    which doesn’t, it seems to me, solve any part of the problem, which is that these rates of change are – not hypothetically, but in observed fact – variable over time within any single language*, and variable within any branch of any proposed tree.

    i may be misunderstanding mathematics that are, i’ll admit, well beyond my pay-grade, but part of what seems to be happening here (and in glottochronology, to my eye) is an assumption that if you use large enough datasets you can act as if the underlying processes are random and make accurate models by proceeding based on that fiction. and that assumption seems to keep running head-on into the fact that language change is just not a random process, and breaking its nose.

    .
    * i bet a statistically-minded grad student could get a paper or two out of calculating rates of change in english by various methods and seeing what happens when you exclude changes first documented between 1590 and 1630 – and a few more if they managed to turn it into either a pro- or anti-stratfordian argument.

  383. David Marjanović says

    which doesn’t, it seems to me, solve any part of the problem, which is that these rates of change are – not hypothetically, but in observed fact – variable over time within any single language*, and variable within any branch of any proposed tree.

    That’s a separate problem, but you’re right that it isn’t necessarily solved this way.

    Similarly, in biology, it’s been proposed that natural selection is relaxed after certain kinds of mass extinction events and rates of evolution spike, leading to inferences of inflated ages unless there are enough calibrations that are precise enough. That would certainly explain a few obvious mismatches with the fossil record.

    and a few more if they managed to turn it into either a pro- or anti-stratfordian argument

    Heh.

  384. @DM,

    Claim 1: what glottochronologists speak about is not strict clock.

    Claim 2: strict clock is used in biology.

    Are you ready to say that strict clock in biology
    (a) “discredited”
    (b) “just doesn’t work” (your words which you use to gloss “discredited”. though I happen to see difference)

    (c) not a crude model which has evolved/been improved to become a family of more complex (still crude) models, including relaxed clock – not by biologists proper but by programmers, but because of the enthusiasm of biological community about dating, even with programs based on crudest models (unlike linguistic community which grumbles “imprecise” and discourages further work) – but literally “has nothing to do” with relaxed clock?

    If not, then why the difference?

    I understand that crowds are different. But in my view science is a space of ideas and questions (and people united by collaborative effort – and not just teams competing for grants, and interested in discrediting each other) and I understand a field (like glottochronology) based on its problems – not based on who’s looking for answer.

    ____
    Regarding claim 1:
    There is of course a certain range of variation. It is just a fact.

    The question is how wide is it? If we knew the range, we could say that “95% of languages in the world that have 20% of replacements in basic vocabulary got them in no more than N and no less than M years”, and that would be just true.

    Would this N to M range be narrow enough to be informative (historically, linguistically, etc.)?
    If not, can further study of factors affecting variation help us narrow it down to ?

    All of this is simply not “strict clock”.

    Strict clock… it seems it is when you impose a restriction that “the clock is the same for all organisms in this set” (NOT corresponding to the average clock for all organisms”) and let your software recalculate either phylogeny or numbers of substitutions based on this artificial restriction.
    The software may introduce additional substitutions just in order to make some branches longer.

    Or when you exclude some lineage from your sample, because it is too weird.

    Linguists don’t do that.

  385. David Marjanović says

    Claim 2: strict clock is used in biology.

    Not in decades, except in very special cases (e.g. very short timeframes).

    Are you ready to say that strict clock in biology
    (a) “discredited”
    (b) “just doesn’t work”

    Yes and yes, except in very special cases.

    not by biologists proper but by programmers

    There’s a lot more overlap than you’d think.

    but literally “has nothing to do” with relaxed clock?

    They’re both “clocks”, but that’s it. Also, strict clocks have never been used in phylogenetic analyses in biology; that’s limited to glottochronology.

    Strict clock… it seems it is when you impose a restriction that “the clock is the same for all organisms in this set” (NOT corresponding to the average clock for all organisms”) and let your software recalculate either phylogeny or numbers of substitutions based on this artificial restriction.

    “Strict clock” means the model contains a single rate parameter that is applied to all branches. “Relaxed clock”, as used here, means a separate rate parameter is inferred for each branch (the terminal and the internal ones). That’s something that would take billions of years with pen & paper.

    If we knew the range

    We don’t need to assume we know it. We can let the program fit the model, including all the rate parameters in it, to the data.

    Really, almost every aspect of the method is an order or two more complex and computation-intensive than any glottochronologist, including the most careful and sophisticated ones like G. Starostin, seems to have ever imagined. That doesn’t make it immune to long-branch attraction or wrong approaches to coding or simple mistakes in the dataset, but it does take care of more issues than, again, any glottochronologist seems to have ever imagined.

    That’s no surprise. The community of people who have ever published on glottochronology, or lexicostatistics in general, is orders of magnitude smaller than the community of people who have ever published on molecular evolution or phylogenetics with molecular data. Of course the latter community is going to have more ideas.
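
    A toy calculation may help with the “one language innovated 5 words, the other 1” example from earlier in the thread. The sketch below assumes, purely for illustration, that innovations on a branch follow a Poisson distribution, and compares a single shared rate with one free rate per branch. This is not the model in the paper: Bayesian relaxed-clock programs draw branch rates from a shared distribution rather than fitting each one freely, and the names here are hypothetical.

```python
import math


def poisson_log_likelihood(count, expected):
    """Log-probability of seeing `count` changes when `expected` are expected."""
    return count * math.log(expected) - expected - math.lgamma(count + 1)


# Two sister languages: both branches span the same (unknown) time,
# so time is measured in units of that shared branch duration.
innovations = [5, 1]
duration = 1.0

# Strict clock: one rate for both branches. Its maximum-likelihood value
# is the mean number of innovations per unit of time.
strict_rate = sum(innovations) / (len(innovations) * duration)
strict_loglik = sum(
    poisson_log_likelihood(k, strict_rate * duration) for k in innovations
)

# One free rate per branch: each branch gets its own maximum-likelihood rate.
branch_rates = [k / duration for k in innovations]
relaxed_loglik = sum(
    poisson_log_likelihood(k, r * duration)
    for k, r in zip(innovations, branch_rates)
)

print("strict rate:", strict_rate)
print("log-likelihood, strict clock:", strict_loglik)
print("log-likelihood, per-branch rates:", relaxed_loglik)
```

    Under these assumptions the single-rate model is not impossible to apply; it simply treats 5 and 1 as noise around a shared expectation of 3 and fits the data worse than the per-branch version.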

  386. If we knew the range

    We don’t need to assume we know it.

    But we want to know. “How much variation there exists in the world” with respect to some parameter is absolutely normal question:) This is enough to write all those formulas.

    It’s just an attempt to understand language change, statistically.

    And then if the range is narrow enough (to be useful for, say, a historian) then the historian will be happier.
    If no, then all right, but at least we learn more about language change, which is not bad for a linguist.

  387. The main issues here are:

    – HOW ON EARTH we know it (but of course we have a good reason to study vocabulary replacement in languages where it can be studied)
    – if it is wide,

    (a) can we make it more narrow by identifying and modelling various factors that contribute to it (thus including in the formula some f(language contact), f(community size) etc.) – as narrow as if we were tossing a coin, which itself would give some dispersion – or
    (b) there are so many factors that we can’t make it narrow in principle.

  388. The formula itself is solid (if we don’t count some obvious improvements like accounting for different stability of different words) if:

    – some distribution exists (it exists)
    – time is not a factor that affects stability of vocabulary.

    This second reservation is interesting. There was a proposal that words can wear with time (stability of a word depends on how recent it is) – and this will modify the formula.
    If this is true, it is a more exotic option than if it is false.

    But it’s a formula for the mean.

  389. David Marjanović says

    The formula itself is solid

    What formula? That’s not how this method works.

  390. Incidentally I was just pointed to a facebook thread where this question is discussed by Starostin, Kassian and Militarev. Feel free to chime in if needed. They didn’t really drill it down to details, more like dismissed it out of hand as Atkinson v.2
    https://www.facebook.com/borinskaya/posts/pfbid0zsYv13kL9oXkHcvpsvNZK3RAkSJX79xutUe8W5fEJRvsnhDNshXaM24x46KCTttCl

  391. David Marjanović says

    Thanks, soon I’ll have to visit Farcebork for the first time in over a year anyway. 🙂 Atkinson 2.0 isn’t far from what it is!

  392. @DM, the initial glottochrono-formula. I think it is what people mostly know about G.

    There are other formulas, but this one is the most famous and I think it confuses people.
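
    For reference, the classic formula is usually written t = ln(c) / (2 ln(r)), where c is the fraction of cognates shared by two sister languages and r is the assumed retention rate per millennium. Below is a small sketch of it; the function name is hypothetical, and the default r = 0.86 is only the value commonly quoted for the 100-word list, used here as an assumption.

```python
import math


def divergence_time_millennia(shared_fraction, retention_per_millennium=0.86):
    """Classic estimate t = ln(c) / (2 ln(r)) for two sister languages."""
    if not 0.0 < shared_fraction <= 1.0:
        raise ValueError("shared_fraction must be in (0, 1]")
    if not 0.0 < retention_per_millennium < 1.0:
        raise ValueError("retention_per_millennium must be in (0, 1)")
    return math.log(shared_fraction) / (2.0 * math.log(retention_per_millennium))


# Hypothetical pair sharing 94 of 100 basic-vocabulary items.
estimate = divergence_time_millennia(0.94)
print(f"estimated split: {estimate:.2f} millennia ago")
```

    The formula only sees the shared fraction c. If the innovations fall on different meanings, it returns the same date whether they are split 5 and 1 or 3 and 3 between the two languages.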

    @DP, oh, that’s funny.
    I (to no avail) was trying to understand how glottochronologists apply the formula when they date families (I believe it can’t be ‘strict clock’, but wanted a confirmation) and was looking for various publications. So I tried to browse those about Afro-Asiatic and when you posted your link I was looking for Alexander Militarev, “Towards the chronology of Afrasian (Afroasiatic) and its daughter families” in Renfrew, C., McMahon, A., & L. Trask, Eds. Time Depth in Historical Linguistics (to no avail again).

    In this FB thread he names a more recent paper (and a draft), but again, does not explain the method.

  393. G. Starostin calls the formula he is using (in one of his methods, the other is called “Bayesian MCMC” whatever it means) “strict clock”.
    The clock he is using runs faster with time. :-/
    Mad.

  394. Trond Engen says

    That may well be justified by theory. You might want to adjust for higher frequency of replacement with wider and faster networks for technological and cultural exchange. You might also want your model to reflect that a thinly documented (stage of a) language will have had undocumented lexical change. But “runs faster with time” sounds like a very crude approximation.

  395. Trond, no, I don’t object to the formula (see WP, Glottochronology#Modifications).

    But I’m trying to understand what people mean by “strict clock”.
    The phrase appears here:
    https://www.degruyter.com/document/doi/10.1515/ling-2020-0060/html

  396. David Marjanović says

    @DM, the initial glottochrono-formula. I think it is what people mostly know about G.

    There are other formulas, but this one is the most famous and I think it confuses people.

    Ah, OK.

    But I’m trying to understand what people mean by “strict clock”.

    The opposite of “relaxed clock”. 🙂 The Wikipedia article “Molecular clock” seems good to me, from looking at it very briefly.

    “Bayesian MCMC” whatever it means

    Just that in mathematical terms it’s a Markov-chain Monte Carlo simulation.

    (Even a Metropolis-coupled Markov-chain Monte Carlo simulation, or MCMCMC or MC³.)

  397. The Metropolis algorithm doesn’t model change over time though. Its purpose (and, indeed, the purpose of the vast majority of Markov chain Monte Carlo calculations) is to locate the temporally stable distribution of the Markov chain and the fluctuations around that stable distribution. In the Metropolis algorithm, this is also a way of calculating a canonical partition function.
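
    A minimal sketch of that idea, assuming a one-dimensional toy target (a standard normal density) rather than anything resembling a space of trees: the chain’s steps are not a model of change over time, they are only a device for drawing samples from the stable distribution. The function names and all settings below are arbitrary choices for the illustration.

```python
import math
import random


def metropolis_samples(log_density, start, steps, step_size, rng):
    """Random-walk Metropolis sampler for a one-dimensional target density."""
    x = start
    log_p = log_density(x)
    samples = []
    for _ in range(steps):
        proposal = x + rng.uniform(-step_size, step_size)
        log_p_new = log_density(proposal)
        # Accept with probability min(1, p_new / p_old); otherwise stay put.
        if rng.random() < math.exp(min(0.0, log_p_new - log_p)):
            x, log_p = proposal, log_p_new
        samples.append(x)
    return samples


def log_standard_normal(x):
    """Log-density of a standard normal, up to an additive constant."""
    return -0.5 * x * x


rng = random.Random(0)
# Start far from the bulk of the target to show that the start is forgotten.
chain = metropolis_samples(
    log_standard_normal, start=10.0, steps=20000, step_size=2.0, rng=rng
)
kept = chain[2000:]  # discard the early part of the chain as burn-in
mean = sum(kept) / len(kept)
variance = sum((v - mean) ** 2 for v in kept) / len(kept)
print("sample mean:", mean)
print("sample variance:", variance)
```

    After the burn-in the samples should scatter around 0 with a variance near 1, whatever the starting point was. Checking whether a real analysis has reached that stage is the convergence diagnostic mentioned earlier in the thread.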

  398. David Marjanović says

    Yes; the distribution of trees and their branch lengths is what’s calculated this way.

  399. In linguistics the discipline that calculates it is “comparative linguistics”, not “glottochronology”.

  400. January First-of-May says

    Even a Metropolis-coupled Markov-chain Monte Carlo simulation, or MCMCMC or MC³.

    So is there such a thing as a Metropolis-coupled Markov-chain Monte Carlo molecular clock, to make MC⁴?

    (I’m sure I’m absolutely mixing up the notations here but the idea is hilarious)

  401. Stu Clayton says

    Someone has to keep the show running smoothly: the Master of Ceremonies for a Metropolis-coupled Markov-chain Monte Carlo molecular clock simulation, or MC + MC⁴. Also known as a Master Control Program.

  402. David Marjanović says

    🙂

  403. “You shouldn’t have come back, Clayton.”

  404. Yes, my yesterday’s unfinished answer to DM was

    @DM, yes, I know what MCMC means. MC Bayes and DJ Markov. Though i did not know the longer name:) I ironise because I find the name very inconvenient. Of course if your software calculates probabilities it can be Bayesian, and if it is you will mention that fact when describing what it does.

    “Did you read the supplement?” – Partly. Now I’m reading publications by molecular biologists.

    “Yes and yes, except in very special cases.” – thank you!

    …etc. [the unfinished part]

  405. Trond Engen says

    Connecting the dots? Filling out the picture? Adding background texture? I don’t know yet. These things come faster than I can read.

    Tiina M. Mattila et al: Genetic continuity, isolation, and gene flow in Stone Age Central and Eastern Europe, Communications Biology (2023):

    Abstract
    The genomic landscape of Stone Age Europe was shaped by multiple migratory waves and population replacements, but different regions do not all show similar patterns. To refine our understanding of the population dynamics before and after the dawn of the Neolithic, we generated and analyzed genomic sequence data from human remains of 56 individuals from the Mesolithic, Neolithic, and Eneolithic across Central and Eastern Europe. We found that Mesolithic European populations formed a geographically widespread isolation-by-distance zone ranging from Central Europe to Siberia, which was already established 10,000 years ago. We found contrasting patterns of population continuity during the Neolithic transition: people around the lower Dnipro Valley region, Ukraine, showed continuity over 4000 years, from the Mesolithic to the end of the Neolithic, in contrast to almost all other parts of Europe where population turnover drove this cultural change, including vast areas of Central Europe and around the Danube River.

    […]

    Conclusions
    In this study, we have investigated the genetic landscape of Central and Eastern Europe before and after the European Neolithic expansion. One of the most striking findings was that before the dawn of the European Neolithic, Central and Eastern Europe was inhabited by a population that descends from a gradient admixture population between genetically distinct West European and Siberian hunter-gatherer groups. Such a pattern suggests long distance population genetic connectivity, likely via a stepping-stone admixture model. The genetic descendants of these Mesolithic populations were in many areas assimilated or replaced by incoming farmers during the Neolithic, and the genetic group common during the late Mesolithic remained dominant only in the East and Northeast European frontier and some geographical regions in Southern Scandinavia. In the lower Dnipro Valley region in Ukraine, the direct descendants of the Mesolithic population continued being the dominant group for thousands of years after the start of the European Neolithization, and the end of this continuity was associated with the Eneolithic/Bronze Age migration wave from the East. Hence, we conclude that the Dnipro Valley region’s Neolithic cultural innovations, such as adoption of pottery (further from pointed-bottom vessels to flat bottomed ones), pioneer animal husbandry (cattle, pig, sheep & goat, agriculture e.g., barley)48 and the changes from contracted to extended supine burials were not associated with gene flow from Anatolia. This is opposite to the pattern observed in regions further to the west, where Neolithic transition was associated with large-scale migration.

    Our analysis of close genetic relatedness, on the one hand, revealed the role of genetic relatedness in burial practices in cultures across Mesolithic, Neolithic, and Eneolithic Europe. One the other hand, the results also pointed to a possibility of non-genetic connections such as in the Neolithic Late Lengyel culture Kruza Zamkowa case exemplified here. These observations, together with previous investigations of close kin relations in the Stone Age, suggest a variety of different views and practices of biological and potentially non-biological kin relations.

    Maciej Chyleński et al: Patrilocality and hunter-gatherer-related ancestry of populations in East-Central Europe during the Middle Bronze Age, Nature Communications (2023):

    Abstract
    The demographic history of East-Central Europe after the Neolithic period remains poorly explored, despite this region being on the confluence of various ecological zones and cultural entities. Here, the descendants of societies associated with steppe pastoralists form Early Bronze Age were followed by Middle Bronze Age populations displaying unique characteristics. Particularly, the predominance of collective burials, the scale of which, was previously seen only in the Neolithic. The extent to which this re-emergence of older traditions is a result of genetic shift or social changes in the MBA is a subject of debate. Here by analysing 91 newly generated genomes from Bronze Age individuals from present Poland and Ukraine, we discovered that Middle Bronze Age populations were formed by an additional admixture event involving a population with relatively high proportions of genetic component associated with European hunter-gatherers and that their social structure was based on, primarily patrilocal, multigenerational kin-groups.

    Apart from being published almost simultaneously, the two studies look pretty parallel at first glance, and both have a last named & corresponding author from Human Evolution, Department of Organismal Biology, Uppsala University.

  406. David Marjanović says

    Awesome.

  407. Dmitry Pruss says

    Sredni Stog as a fusion population makes total sense. I didn’t yet figure out what archaeological culture is called Dnipro Neolithic in the paper. Too busy days at work to keep up with the flow of interesting research.

  408. John Cowan says

    Dnipro Neolithic

    … a Ukrainian soldier who has lost his rifle.

  409. The archaeological context is in the 1st supplement, starting from page 21
    https://static-content.springer.com/esm/art%3A10.1038%2Fs42003-023-05131-3/MediaObjects/42003_2023_5131_MOESM2_ESM.pdf

    The intro describes the Neolithic culture of the Dnieper area as Tripolye (which makes sense because Tripolye town is on the Dnieper riverbank. The very name is loaded for me. It is a part of the family lore of my mother’s kin, the Gonikbergs, one of whom was conscripted in WWI, spent years in a Hungarian POW camp, returned to an economically devastated and pogromed shtetl, escaped to Kiev only to be drafted again into the Red troops, and was killed in Tripolye in a mass murder of the Red POWs, who were bayonetted at a steep riverbank and thrown into the Dnieper to die).

    Several samples were already sequenced and reported in Mathieson 2018 ( https://www.nature.com/articles/nature25778 )
    Most are new. They are from river bluff cemeteries along the Dnieper. The digs date back to the 1950s and 1960s and the documentation is somewhat limited. The relevant references are below, with the umbrella publication covering all sites at #123

    120 Telegin, D. Ya. Deriivka. A Settlement and Cemetery of Copper Age Horse Keepers on the Middle Dnieper, BAR International Series 287. (BAR, 1986).
    121 Danilenko, V. N. Neolit Ukrainy: glavy drevnej istorii ûgo-vostočnoj Evropy (Naukova Dumka, 1969).
    122 Zalizniak, L. L. Mesolithic Origins of the First Indo-European Cultures in Europe According to the Archaeological Data. Ukrainian Archaeology: selected papers from Ukrainian journal Arkheolohiia 2016, 26-42 (2017).
    123 Телегин, Д. Я. Неолитические могильники мариупольского типа. (Наукова Думка, 1991).
    124 Telegin, D. Ya. & Potekhina, I. D. Neolithic cemeteries and populations in the Dnieper Basin. BAR International Series 383 (BAR, 1987).
    125 Potekhina, I.D. The prehistoric populations of Ukraine: Population dynamics and group composition. In: Lillie, M. and Potekhina, I.D (Eds.) Prehistoric Ukraine. From the First Hunters to the first Farmers. 155-186 (Oxbow Books, 2020).
    126 Потехина, И. Д. К антропологической характеристике Дереивского неолитического могильника. in Использование методов естественных наук в археологии (ed. Генинг, В. Ф.) 109–128 (Наукова Думка, 1978).
    127 Зиневич, Г. П. Очерки о палеоантропологии Украины. (Наукова Думка, 1967).
    128 Lillie, M. C. The Dnieper Rapids Region of Ukraine: A Consideration of Chronology, Dental Pathology and Diet at the Mesolithic-Neolithic Transition (Sheffield University, 1998).

    BTW comments which don’t reflect the values of this community are kind of like eyesores…

  410. Going through Telegin’s 1991 book.
    https://www.academia.edu/31335267/%D0%94_%D0%AF_%D0%A2%D0%B5%D0%BB%D0%B5%D0%B3i%D0%BD_%D0%9D%D0%B5%D0%BE%D0%BB%D0%B8%D1%82%D0%B8%D1%87%D0%B5%D1%81%D0%BA%D0%B8%D0%B5_%D0%BC%D0%BE%D0%B3%D0%B8%D0%BB%D1%8C%D0%BD%D0%B8%D0%BA%D0%B8_%D0%BC%D0%B0%D1%80%D0%B8%D1%83%D0%BF%D0%BE%D0%BB%D1%8C%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D1%82%D0%B8%D0%BF%D0%B0_Neolithic_burial_Mariupol_type_Neolitick%C3%A1_poh%C5%99ebi%C5%A1t%C4%9B_Maripolsk%C3%A9ho_typu

    Mariupol Style Burial Culture is a subset of Dnieper-Donets Culture & appears to be distinct from the classic Cucuteni-Tripolye (and very distinct from LBK). It’s absent West of Lower Dnieper but present in Crimea and Azov Sea shores. Its ocher-rich burials with the bodies laid flat on their backs are linked to the Mesolithic Baltic, while necklaces of flat rings of black jet stone and elk teeth brooches are among the influences from the Caucasus.

    Mathieson 2018 already demonstrated that the Lower Dnieper Neolithics were genetically on the grand hunter-gatherer cline, and extremely different from the Cucuteni-Tripolye (represented in the paper by the 4 remains sets from Verteba Cave in Western Ukraine, and harboring ~80% Anatolian Farmer ancestry).

    The broader Dnieper-Donets culture is linked by Telegin to the Mesolithic sites in Belarus.

  411. BTW comments which don’t reflect the values of this community are kind of like eyesores…

    I’m sure JC was just making a spur-of-the-moment joke and didn’t mean any harm by it. I agree it wasn’t especially well timed.

  412. Trond Engen says

    No, this is background:

    Allentoft et al: Population Genomics of Stone Age Eurasia, bioRxiv preprint (2022):

    Summary
    Several major migrations and population turnover events during the later Stone Age (after c. 11,000 cal. BP) are believed to have shaped the contemporary population genetic diversity in Eurasia. While the genetic impacts of these migrations have been investigated on regional scales, a detailed understanding of their spatiotemporal dynamics both within and between major geographic regions across Northern Eurasia remains largely elusive. Here, we present the largest shotgun-sequenced genomic dataset from the Stone Age to date, representing 317 primarily Mesolithic and Neolithic individuals from across Eurasia, with associated radiocarbon dates, stable isotope data, and pollen records. Using recent advances, we imputed >1,600 ancient genomes to obtain accurate diploid genotypes, enabling previously unachievable fine-grained population structure inferences. We show that 1) Eurasian Mesolithic hunter-gatherers were more genetically diverse than previously known, and deeply divergent between the west and the east; 2) Hitherto genetically undescribed hunter-gatherers from the Middle Don region contributed significant ancestry to the later Yamnaya steppe pastoralists; 3) The genetic impact of the transition from Mesolithic hunter-gatherers to Neolithic farmers was highly distinct, east and west of a “Great Divide” boundary zone extending from the Black Sea to the Baltic, with large-scale shifts in genetic ancestry to the west. These include an almost complete replacement of hunter-gatherers in Denmark, but no substantial shifts during the same period further to the east; 4) Within-group relatedness changes substantially during the Neolithic transition in the west, where clusters of Neolithic farmer-associated individuals show overall reduced relatedness, while genetic relatedness remains high until ~4,000 BP in the east, consistent with a much longer persistence of smaller localised hunter-gatherer groups; 5) A fast-paced second major genetic transformation beginning around 5,000 BP, with Steppe-related ancestry reaching most parts of Europe within a 1,000-year span. Local Neolithic farmers admixed with incoming pastoralists in most parts of Europe, whereas Scandinavia experienced another near-complete population replacement, with similar dramatic turnover-patterns also evident in western Siberia; 6) Extensive regional differences in the ancestry components related to these early events remain visible to this day, even within countries (research conducted using the UK Biobank resource). Neolithic farmer ancestry is highest in southern and eastern England while Steppe-related ancestry is highest in the Celtic populations of Scotland, Wales, and Cornwall. Overall, our findings show that although the Stone-Age migrations have been important in shaping contemporary genetic diversity in Eurasia, their dynamics and impact were geographically highly heterogeneous.

    Caveat: It’s a preprint from last year, and it doesn’t seem to have been published yet. That may mean something. OTOH, pretty much every name attached to the Copenhagen lab is on it, so it ought to be solid enough.

    Where do I start? Methods keep improving. They now have tools to work statistically with much smaller samples of ancient DNA than before, meaning that a lot more material, esp. old material, can be analysed. And so they do.

    One of the questions I’ve been having is why the Western Eurasian Paleolithic samples have been clustering in just a small region in the center of the PCA plots. If I understand it correctly, they now see that this is an artifact of small samples and plotting on to a map of modern populations. The large paleolithic genetic variation is along other axes than those that are important for more recent populations.
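    A toy sketch of that projection effect, not the preprint’s method: all coordinates below are invented, and the helper names `leading_axis` and `project` are hypothetical. The “map” is the leading principal axis of a modern panel; ancient groups that differ mainly along another axis land close together once projected onto it.

```python
# Toy illustration with made-up 2-D "genetic" coordinates (not real data).
from statistics import mean

def leading_axis(points, iterations=200):
    """Centre and leading principal axis of 2-D points (power iteration)."""
    cx = mean(p[0] for p in points)
    cy = mean(p[1] for p in points)
    centred = [(x - cx, y - cy) for x, y in points]
    n = len(centred)
    # Entries of the 2x2 covariance matrix.
    sxx = sum(x * x for x, _ in centred) / n
    sxy = sum(x * y for x, y in centred) / n
    syy = sum(y * y for _, y in centred) / n
    vx, vy = 1.0, 1.0
    for _ in range(iterations):
        vx, vy = sxx * vx + sxy * vy, sxy * vx + syy * vy
        norm = (vx * vx + vy * vy) ** 0.5
        vx, vy = vx / norm, vy / norm
    return (cx, cy), (vx, vy)

def project(point, centre, axis):
    """Coordinate of a point along the given axis."""
    return (point[0] - centre[0]) * axis[0] + (point[1] - centre[1]) * axis[1]

# Modern panel: varies almost only along the first coordinate.
modern = [(-3.0, 0.1), (-1.0, -0.1), (1.0, -0.1), (3.0, 0.1)]
# Ancient groups: far apart, but along the second coordinate.
ancient = [(0.0, -4.0), (0.2, 0.0), (-0.2, 4.0)]

centre, axis = leading_axis(modern)
modern_on_map = [project(p, centre, axis) for p in modern]
ancient_on_map = [project(p, centre, axis) for p in ancient]

modern_spread = max(modern_on_map) - min(modern_on_map)
ancient_spread = max(ancient_on_map) - min(ancient_on_map)
print(f"modern spread on map:  {modern_spread:.2f}")
print(f"ancient spread on map: {ancient_spread:.2f}")
```

    The ancient points are much further apart in the full space than the modern ones, yet on the modern-derived axis they bunch up in the middle.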

    They identify a “HG Ukraine” element that is a major element in Mesolithic populations in the Baltic and peninsular Scandinavia. It’s also significant in Iberia, mainly in the far north and west. It contributed much less to Mesolithic Denmark and France, and not at all (or insignificantly) to Italy and the British Isles. One explanation is that this is the people that first reached and made use of the Atlantic marine resources, and then spread south along the coast, while the “Italians” settled the inland and in some regions eventually replaced it completely. I remember a weird mt-DNA haplotype that has its maxima in Senegal, Iberia and the Sami. Could that be connected? Anyhow, while the Mesolithic comes out largely as a time of continuity and cultural diffusion in Central/Western Europe, its initial phase must have been much more complex. Scandinavia and the Baltic seem to show something very different, with full or partial replacements during the Mesolithic. Or are those patterns really regional variation that shows up as temporal change because of skewed sampling?

    From the start of the Neolithic, several full or near full population replacements can be shown. Here I’ll point to the really cool graphs, especially F3 and F4. Look at the clear breaks in time and space when the Anatolian Farmers arrive. Look at how local ancestry resurges (or not) after a while. Look how it pops up in single samples long after its apparent replacement. Look how the farmers are replaced by people with Steppe ancestry.

    They identify a Mesolithic population at the middle Don that “contributed significantly to Yamnaya”. Not only significantly, methinks. It could be the source population. Or at least: When this population is used as a source element, it shows up in later populations as the major part of the Steppe component. The other elements are Caucasus HG and Anatolian Farmers.

    But look how the Steppe mix changes in time and space. The first wave, with more Caucasus and nearly without AF ancestry, is seen in Siberia, as should be expected, but also in the Baltic. The Baltic is interesting. After a couple of centuries this population is changed, and it’s the Caucasus component that is reduced. It seems to me that it’s been admixed about 50/50 with a closely related population without the Caucasus element (or replaced by such a mix), resulting in what’s essentially the Corded Ware people.

    But while Corded Ware spread all over, in the Baltic it’s replaced again by what seems to be its CHG-free source. I wonder where that came from. I would say Globular Amphora, but it can’t be that, or at least not all of that, since it’s not in Poland before the Yamnaya expansion. I think it’s hiding in the Forest Steppe, or maybe in Belarus. It might be a more northern wave of early Yamnaya pastoralists, lurking unsampled on the outskirts of GAC until its southern cousins arrive. I also think it’s the population that brings R1-a to Corded Ware and beyond.

    Scandinavia also had several population waves after the arrival of Steppe ancestry, dominated by different male lineages. The first is R1-a Corded Ware, then almost full replacement with Central/Western European R1-b, and then I1, which I think must come from the Pitted Ware people, those stubborn marine harvesters of the Baltic. We could actually hypothesize that Germanic is what happened when PWC peoples adopted Indo-European from their neighbors and trade in amber and other northern goods for bronze made them rich and territorially expansive.

    This feeds into this and this.

    There’s more, but I have to stop somewhere.

  413. Thanks for that very clear analysis! (“Clear” = I can understand most of it.)

  414. Stu Clayton says

    There’s more, but I have to stop somewhere.

    [When the second volume of that work appeared, it was quite in order that it should be presented to His Royal Highness in like manner. The prince received the author with much good nature and affability, saying to him, as he laid the quarto on the table:]

    Another d-mn’d, thick, square comment ! Always scribble, scribble, scribble! Eh! Mr Engen ?

    #
    . . . I have presumed to mark the moment of conception: I shall now commemorate the hour of my final deliverance. It was on the day, or rather night, of the 27th of June, 1787, between the hours of eleven and twelve, that I wrote the last lines of the last page, in a summer-house in my garden [At Lausanne]. After laying down my pen, I took several turns in a berceau, or covered walk of acacias, which commands a prospect of the country, the lake, and the mountains. The air was temperate, the sky was serene, the silver orb of the moon was reflected from the waters, and all nature was silent. I will not dissemble the first emotion of joy on recovery of my freedom, and, perhaps, the establishment of my fame.
    — Autobiography of Edward Gibbon (World’s Classics edn.,1907), pp. 155-9,160,205.
    #

    Anecdotes about Edward Gibbon

  415. John Cowan says

    All this talk of “complete replacement”, much less multiple complete replacements, seems to me to be going far beyond what evidence we could possibly have. What kind of sense does it make to look at a couple of dozen people in one place at time t₀ and another couple of dozen people in another place at time t₁ who don’t have the same haplotypes, and conclude that the population of Scandinavia (or whatever large geographic area) was totally replaced between t₀ and t₁? If we compare the language of Lancaster County, PA in the early 18C with the language of South Philadelphia today, can we conclude that the “language of Pennsylvania” was formerly Pennsylvania German, but that this has been totally replaced by AAVE?

  416. Trond Engen says

    A dozen people here and a dozen there at another time say nothing. But an increasingly fine-grained net in time and space says more. But what does it say? Nothing directly about language, but much about populations, and populations are made of people, and people have language. People learn that language (or those languages) from people around them, often their parents. They also get their genes from people around them, even more often their parents, so transmission of language correlates strongly with transmission of genes. It’s easier for a person to acquire or lose language than genes, so the correlation isn’t perfect, and there’s reason to believe that it weakens with time. What does that mean? Abrupt changes in the genetic makeup of a geographical area are likely to mean change in language, just as they mean change in archaeological culture. Slow changes have less to tell, but since language change often follows economic opportunities or social prestige, some scenarios are more likely than others.

    If the genetic traces of a population disappear from the archaeological record without ever reappearing, that’s a population replacement. There’s very little chance that the language of the disappeared aboriginals would have been taken up by the newcomers, so unless they already had been in contact for a long time, that would seem to mean language shift. Full or near full replacement really seems to have happened in Scandinavia at least a couple of times. Nothing is known for certain about the mechanisms, but we can imagine that the hunter-gatherers were flexible enough to be pushed out to the outskirts by the first farmers. When the first farmers in turn were replaced by Corded Ware-related people from the continent, depopulation from the plague or other epidemics is suspected to have been important.

    When a population suddenly disappears from the record and then reappears, maybe gradually as a minority element in a later population, it’s not a complete replacement, but a temporary marginalization. Even here it’s hard to see how that would transfer language from the aboriginals to the newcomers. Maybe the marginalized hunter–gatherers of Scandinavia would have made a gradual rebound if the farmers had gotten more than a few centuries, but we don’t know. Or actually we do, because we do carry a proportion of their genes today, but we don’t speak their language.

    When what you see is a new population that merges native and newcomer ancestry, that’s admixture for the geneticists. Which language (or languages) ends up being spoken is a hard guess. Proportions of ancestry, economic advantage and cultural prestige are at least indicators, but they don’t always point in the same direction, and even if they do it’s not determinism. In the Iberian peninsula the male line of the First Farmers was almost completely replaced by Yamnaya in a couple of generations. That’s not a full replacement, since most of the total gene pool is from the local females, but it also tells of social prestige and economic advantage. A lot of advantage and prestige, because it’s incredibly fast. Still, the Basques speak Basque. Was that the language of the mothers? Dmitry once suggested that Basque came in a first wave from the Steppe and actually was the language that replaced that of the First Farmers. If so, it didn’t replace the native languages in all Iberia, or other languages came later and replaced it in much of the peninsula without the same sort of massive event.

    In Scandinavia, after the Corded Ware wave, there don’t seem to have been any full population replacements, but there are huge changes in the male line along with smaller changes in the overall genetic makeup, probably meaning that new people came in and gained a social advantage also here. We can’t know which of all these population waves carried language, much less which language they carried, or exactly which wave brought (Pre-)Proto-Germanic. At least not yet, and maybe never.

    And finally yes, there is a clear risk of over-interpreting small samples. That’s why I end the paragraph on Mesolithic replacements in Scandinavia and the Baltic with a question.

  417. Dmitry Pruss says

    I worry less about the sample sizes of the ancient genomes (they are becoming unassailable) than about systematic biases of sampling. If a culture cremated its dead, or buried them in acidic soil (commonplace in wooded Northern Europe) or in scattered and inconspicuous places, then it will tend to get lost from sight.
    Living fossils have been discovered time and time again in biology.
    Languages, cultures, and populations can and do survive in small, undesirable habitats (although it’s a lot harder for civilizations relying on large masses of people and specialized labor than for humble foragers).
    So an observed replacement may have been far less complete than it seemed, at the time frames in the analysis. Western hunter-gatherer ancestry made a quite spectacular comeback in Late Neolithic, so it wasn’t gone for good …

  418. Trond Engen says

    Sure. The fact that a skeleton is preserved makes it special enough that it shouldn’t be taken as completely representative — and more so the more rare it is.

    I also agree that complete population replacement is an extreme event, but it does happen. Newfoundland and Tasmania are recent examples of marginalization of the native population leading to quick extinction. Other, less isolated populations may be experiencing rebound.

  419. Tasmania … examples of marginalization of the native population leading to quick extinction.

    Hmm. In Tasmania that needed firearms.

    Wouldn’t an invader enslave the local population? Yes, that’d maybe lead to no preservation of their skeletons; but it might well give some language transfer. At least vocab for unfamiliar (to the invaders) stuff.

  420. Trond Engen says

    Firearms, and disease, and marginalization.

    Enslavement is one option. But maybe you just want Canaan and not the Canaanites.

  421. Trond Engen says

    Back to Allentoft et al’s F4.

    I like that there are timelines of isotope levels. The sudden change in diet away from marine sources with the arrival of the farmers is very visible in the δ13C and δ15N plots. The Strontium plot may show that the farmers were more mobile during their lifetime, or that they had access to (or had to resort to) more varied food sources.

    They even plot a timeline of vegetation cover inferred from pollen analysis of lake sediments. My first thought was that something dramatic must have happened to the Neolithic economy around 5000 BP, and even more just before the Battle Axe transition, but apparently not. This timeline is from a single lake in Zealand, and the changes probably reflect the local slash-and-burn cycles. The end of the last clearing episode, around 4850 BP, could be different though, since it’s followed by primary forest regrowth. This lasts for a century until the first people with Steppe ancestry arrive. The neolithic chamber grave in Gökhem/Falköping in Sweden where Yersinia pestis infection was found in several of the skeletons is dated to ~4900 BP.

    The plots of physical type are neat too, although I’d have thought it would be even more interesting to look for and plot genetic adaptations to diet and disease. And for the actual diseases, obviously. I imagined testing for germs and antibodies was standard by now.

  422. Trond Engen says

    Also, to tie this up with the plague theme upthread: There’s evidence that riverine hunter-gatherers caught Y. pseudotuberculosis from beavers. Allentoft et al have identified a possible source population for the Yamnaya, living as hunter-gatherers at the middle Don. It’s just the type of population you’d expect to develop genetic resistance to Y. pseudotuberculosis and presumably even early strains of Y. pestis.

  423. David Marjanović says

    (I’m disappointed to find there’s no Gökhem in Turkey.)

  424. @Trond Engen: Your mention of Canaan actually made me think about how hard it probably is to have wholesale population replacement without either modern military and transportation technologies, or previous isolation that made one people particularly vulnerable to diseases carried by the other. In spite of the lore that the Canaanites were either slain or driven out of the large swaths of territory by the Hebrews (who later retconned their history to make this kind of genocidal behavior not merely necessary, but an affirmative goal), there is little evidence of significant genetic changes among the Afro-Asiatic people of the Levant in the second millennium B. C. E.

  425. Trond Engen says

    True enough. It goes to show how extreme a full replacement is. Or how extreme it is if it’s achieved by manual action. In the case of Scandinavia, it’s not actually a complete replacement in the whole peninsula, but in the archaeological record from the southern/central parts, and it wasn’t necessarily violent (but it certainly may have been).

    The first shift might have been voluntary, in the sense that the hunter-gatherers just moved away from the newcomers. They were probably used to moving when resources became sparse or the neighborhood unpleasant. If there were cases of armed resistance, the aboriginals were too few and far between to win in the long run. Some moved far away, others would have stayed on the outskirts and established relations with the farming economy, but may have been slowly wiped out by new diseases (and alcohol?). The SHG genes did eventually rebound, but probably by way of the Pitted Ware Culture*.

    The second shift may have been repopulation after a devastating pandemic. Maybe there were so few of the Funnel Beaker farmers left that their genes were crowded out in a couple of generations without any trace in the record. If Y. pestis was endemic, they would have had an evolutionary disadvantage, and if it was noticed that children of mixed marriages were at high risk, that disadvantage might have been compounded by stigma.

    * The PWC was something else. It was based on harvesting economy and neolithic technology and took up genes from both Funnel Beaker and Corded Ware. I believe that its establishment might have been a deliberate political response to encroaching farmers and that it may have put up a common organization and even some sort of territorial defense. If there was a pandemic in 4900 BP, it weathered it better than the farmers and expanded into former Funnel Beaker realms. After the Yamnaya intrusion it stayed “independent” for centuries, until it eventually became a founding member of the Nordic Bronze Age culture. I don’t know who took over who in that transaction. Or that’s what I’ve been thinking. It bothers me that there’s no visible Pitted Ware period in the Scandinavian timeline.

  426. John Cowan says

    Tasmania … examples of marginalization of the native population leading to quick extinction.

    There are no full-blood Native Tasmanians now, but there are rather more N.T.s than there were in 1803 when British colonization began, so they are not extinct in a genetic sense. Their languages are extinct, it’s true.

  427. Trond Engen says

    Me: (and alcohol?)

    No, forget that. Alcohol can’t have been very available as a traded commodity.

    But I do wonder if and when there’s been selection for genetic resistance to alcoholic diseases in populations that actually used and produced alcohol. If there is such a thing at all. Maybe cheap liquor just hasn’t been around long enough for selection to work.

  428. Trond Engen says

    @John C.: You’re right. I was wrong. It seems that modern examples of replacements with no genetic legacy are very hard to find.

  429. Christopher Culver says

    … may have been slowly wiped out by new diseases (and alcohol?)

    Is there any case of an indigenous population being severely impacted by alcohol prior to the invention of distillation? It seems hard to believe that just beer/wine/kumis could have the same detrimental effects that we saw with Native Americans and real firewater.

  430. just beer

    In early settler New Zealand, the only way to keep ‘water’ safe to drink was via fermentation. From early accounts, it seems whalers/sealers/gold panners/lumberers (if that’s the word) spent most of their time slightly sozzled.

    OTOH as fast as those ne’er-do-wells got here, the priests were hot on their heels (having moved through the Pacific Islands already). Their temperance got in amongst the natives pretty thoroughly.

    So here it was the guns and disease formula.

  431. Trond Engen says

    Yeah. I slept on the idea and took it back. What I mean is that it would be interesting to see if there’s been selection on genes associated with risk of alcoholism.

  432. Lars Mathiesen (he/him/his) says

    A relatively distant relation was adopted from SE Asia, and as a young adult they developed an alcohol dependency. At the time the family was living in the US, and the “system” told them that there was a genetic predisposition in the adoptee’s ethnicity. (The parents used it as a “not our fault” card, outside observers found it prudent to keep the peace).

    (This was in the 90s, and I doubt there was any scientific backing that we’d accept today).

  433. Back to the topic of new publications about the Bronze Age Eastern Europe.
    A paper on Fatyanovo and Abashevo was published in Russian in May but only just came to my attention:
    https://www.researchgate.net/publication/370414570_Ancient_DNA_of_the_Bearers_of_the_Fatyanovo_and_Abashevo_Cultures_Concerning_Migrations_of_the_Bronze_Age_people_in_the_Forest_Belt_on_the_Russian_Plain

    It appears to disprove the notion that Fatyanovo was a direct predecessor of Abashevo. The signature Abashevo burial of men killed in a battle at Pepkino Mound in today’s Mari-El (Middle Volga) is united by burial customs but turns out to be a bipartite amalgamated society, with relatives buried near each other.
    One end of the burial trench is occupied by bodies of young men who, as per isotopic composition, grew up a few hundred miles to the West in Oka Basin. They frequently harbor mtDNA haplogroups dating back to Neolithic Europeans, and typical for Corded Ware, while their Y chromosomes are shared with Fatyanovo and some later East Steppe peoples. Their grave goods include various bone implements.
    In contrast, the other side of the grave includes a man buried with bronze metallurgy tools, and hailing from the Ural mountains according to the isotope composition. He and his neighbors share a different branch of R1a Y-chromosomes, common with Sintashta and several other Steppe peoples.
    The metal of Abashevo has been reported to come from the Southern Ural mines, while their pottery and jewelry styles are decidedly Corded Ware (and distinct from Fatyanovo).

    So the conclusion appears to be that after Fatyanovo, there were other waves of migrations from ~ Germany to the Russian plain, and that Abashevo peoples made the migration over the course of two or more generations, aided by a tribal merger with a local bronze-making people.

    (Such amalgamations of professional castes are not unheard of; just think of my and Trond’s fav Seima-Turbino phenomenon.)

  434. David Marjanović says

    LBK? Not Corded Ware?

  435. Dmitry Pruss says

    Right, my mistake but it’s too late to edit it

  436. I’m happy to fix it, but should it be changed both times?

  437. Dmitry Pruss says

    Thank you LH, yes

  438. @Trond Engen: The rates of alcoholism among American Indians (about six times the national average in the United States, last time I checked) seem pretty unlikely to be a chance effect. Distilled liquors are only a few centuries old, so that probably is not what people in the Old World adapted to. However, drinks with moderate alcohol content like wine are commonplace in the Old World, in a way they probably never were in the pre-Columbian Americas.

  439. The rates of alcoholism among American Indians (about six times the national average in the United States, last time I checked)

    This has 8.0% of American Indians over 18 with reported heavy alcohol use in the past month, as compared with 7.2% of Whites. And that does not go further into factors such as unemployment and isolation. I think the case for a genetic factor is weak.

  440. Dmitry Pruss says

    There is a lot of past talk about “different alcohol genetics” but it’s a substantially misconstrued legacy of a pre-genomic era. Decades ago, it was already commonplace knowledge that Asians and Native Americans tend to have a less productive version of alcohol dehydrogenase (ADH), an enzyme which removes alcohol from circulation. And it’s heritable. So in the early days of human genetics ADH became the poster child of the notion that there are biochemical differences between “races”. Some people made far reaching superiority arguments out of this observation, some tried to make far-fetched “ethnic genetic weapons” based on it (in the days of the USSR’s Sinophobia, Soviet military biologists ran some pilot ADH weaponization projects using Yakuts in lieu of Chinese), and most people just kind of assumed that, aha, this must be why Native Americans were so susceptible to alcoholism. But the ADH link is more complicated. It matters a lot for the rate of the body’s natural alcohol detox, true … but the genetic component of alcohol dependence comprises a great many other factors. Incidentally a study in the UK Biobank showed that genetic predisposition to alcohol dependence doesn’t change from generation to generation, contrary to the old eugenicists’ wild guess that the drunkards out-proliferate them…

  441. Trond Engen says

    There’s a lot of contradictory numbers out there, not least because the problem is hard to define and quantify. It’s also a fraught issue in a contemporary setting, for good reasons. But perhaps it’s easier to handle historically, and as just one set of many genetic variables correlated with health.

  442. David Marjanović says

    Also, the product that alcohol dehydrogenase makes is a toxic aldehyde; if you don’t make enough of the next enzyme in the chain, aldehyde dehydrogenase, you have a problem…

    The version of the story I heard was actually that East Asians’ alcohol dehydrogenase is so fast that the aldehyde dehydrogenase can’t keep up.

  443. Trond Engen says

    Dmitry: But the ADH link is more complicated. It matters a lot for the rate of body’s natural alcohol detox, true … but the genetic component of alcohol dependence is comprised of great many other factors.

    No doubt. I was careful not to name ADH. I saw a study not long ago that identified a number of genes associated with alcohol consumption, most of them probably subtly regulatory in complex interaction, if I remember correctly. It’s not at all clear that evolution would have had enough time to work on that, or if there were any benefits in changing it. But all the more interesting to see if anything did happen in our recent genetic history.

    Incidentally a study in the UK Biobank showed that genetic predisposition to alcohol dependence doesn’t change from generation to generation, contrary to the old eugenicists’ wild guess that the drunkards out-proliferate them…

    So evolutionary neutral, then?

  444. Trond Engen says

    To clarify: My reason to mention alcohol in the context of marginalized hunter-gatherers was its compounding effect on social factors related to marginalization, but I realized that this was probably less of an issue before the invention of the still, and that what I’m actually interested in is if there’s been genetic adaptation in societies with regular access to alcohol.

  445. the problem is hard to define and quantify
    Plenty of good recent papers did quantify it with strong results. The problem is with cultural confounding rather than with quantification per se. People of different ethnic / regional / religious backgrounds can drink more or drink less for social and cultural reasons, and it can be misconstrued as a genetic effect.
    For example, in a classic 10-year-old study, the “Asian / Native American” ADH variant was found to be strongly protective against alcoholism (some 3-fold difference in odds) in White and Black Americans, but one might have argued that it’s because we are observing effects of an Asian admixture which is also cultural, and causes people to drink less for non-genetic reasons.
    Luckily it’s also known to reduce the odds of alcohol abuse by about 4-fold in the Asian populations as well, in multiple studies, so in this case one comes to the conclusion that the effect is indeed biological rather than cultural.
    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3252425/
    Anyway, with the hindsight of the genomic era it’s pretty clear that the age-old tale of “Asian alcohol genetics” got it backwards.

    Did you see my actual ancient DNA post, BTW, Trond? It’s easy to get it lost in an off-track discussion

  446. Corded Ware problem fixed!

  447. Trond Engen says

    @Dmitry: Thanks, I’ve seen it! I just wanted to get to my computer before following the link. It’s hard enough reading Russian through automated translation without also having to do it on a small screen.

    Before trying to read the paper I wonder:
    – if the multi-ethnic joint venture operation that was (early) Sintashta started already here.
    – if there’s operational continuity between Fatyanovo and the western branch of Seima-Turbino.

  448. Trond Engen says

    Still not really reading, but:

    This study addresses a fundamental question of the origins and migration patterns of paleopopulations of the Fatyanovo and the Middle Volga Abashevo archaeological cultures. It is for the first time that we report a paleogenetic analysis of 14 Abashevo individuals (Pepkino and Starshy Nikitinsky sites). Besides, we analysed ancient DNA samples of 25 Fatyanovo individuals.

    Specifically, we performed analyses of STR marker and haplogroups of the Y chromosome, which revealed the distinct R1a (Z93) haplogroup in Fatyanovo samples. It indicates the influence of the founder effect and gene drift, confirming the hypothesis of their migrant origin. In contrast, the Abashevo culture samples are heterogenous, as we discovered 2 groups with different origins on the paternal line. To be more specific, three men from Pepkino mound are haplogroup R1b (Z2103) carriers, while seven other individuals have haplogroup R1a (Z93>Z94). In addition, close relatives with identical STR haplotypes of the Y-chromosome were identified in both Fatyanovo and Abashevo groups.

    The comparative analyses of autosomal markers from 19 samples and previously published data uncovered similarities between Abashevo men from Pepkino mound (the haplogroup R1a (Z93>Z94)) with the Fatyanovo people, as well as with some representatives of the Unetice culture. These results are suggestive of the genetic continuity in the Russian Plain. Yet, less recent ancestors of Abashevo group interred in Pepkino mound could have migrated from the same region as the Fatyanovo predecessors.

    Únětice? Maybe I should have known, but this blew my mind. Carpathian bronzeworkers trekking east towards the mines of Ural? The cultures were contemporary. Showing that individual people must have been much more mobile than we often imagine. Should we think that a transcontinental clan system (or some such) was in operation, granting passage and accommodation?

    Also: This from 2020 on the Fatyanovo and related cultures. Untangling the multiethnic makeup of Sintashta and the Seima-Turbino system is important for understanding the forming phases of Indo-Iranian, Uralic and maybe Turkic. Sintashta may have inherited the multiethnic profile from Abashevo, and Abashevo may well have it from Fatyanovo and Volosovo.

    Also also: The fast route between Abashevo and Northern and Western Europe might have been Upper Volga and the Baltic Sea.

  449. Dmitry Pruss says

    Únětice
    That’s Fig. 4 and it’s kind of the other way around, if you trust the analysis (and you shouldn’t).

    In their so-called autosomal data, two Únětice individuals appear to be similar to the main Fatyanovo and Abashevo cluster, as if they had travelled back West. One has to realize that the autosomal data in this analysis is incredibly limited (100 loci selected for a chance to make guesses about the ancients’ appearance) and therefore subject to high noise. Not sure how much to trust it. It’s my never-ending gripe that researchers in the less technologically advanced countries and centers waste ancient DNA on cheaper but limited-utility applications. One actually needs less DNA to read the whole genome than to read these 100 locations!

    To further underscore how poor their autosomal DNA analysis is, they show in the same figure that two of their Fatyanovo individuals, and one from Saag’s earlier publication, landed in a “cluster” made up of completely assorted peoples of different ages and areas.

  450. John Cowan says

    It seems that modern examples of replacements with no genetic legacy are very hard to find.

    Which suggests to me that if guns, germs, and gas chambers[*] can’t do it, it can’t be done at all.

    [*] Amazingly, some 35% of Canadian Jews (about 330,000 people, 1.5% of all Canadians) are Holocaust survivors or their descendants. The comparable figure for the U.S. is 8%, out of about 8 million Jews, 2.5% of the U.S. population.

  451. The genomic history of the indigenous people of the Canary Islands may be of interest. Looks like overwhelmingly Berber Y-haplogroups. The islands divide neatly into two clusters, “with all western islands (El Hierro, La Palma, La Gomera, and Tenerife) placed closer to Upper Paleolithic and Early Neolithic North Africans, and all eastern islands (Gran Canaria, Fuerteventura, and Lanzarote) cluster closer to ancient and modern Europeans”.

    Data from the CIP [Canary indigenous people], who colonized the archipelago around the first centuries CE, indicate that this North African population was composed of four main ancestral components. First, as observed for Late Neolithic Moroccans [5], the Canarian indigenous people have both North African Paleolithic/Early Neolithic and European Early Neolithic components. However, the contribution of the North African Paleolithic/Early Neolithic component is greater in the indigenous people than in the Kef El Baroud individuals, confirming that the impact of European Neolithic migrations was not homogenous in the North African region. In addition, the indigenous people show the presence of a steppe component, most probably associated with the migration of North Mediterranean populations into North Africa during the Bronze or the Iron Ages. Finally, we detect a small sub-Saharan African component implying the existence of trans-Saharan migrations in North Africa already before the first centuries CE, predating such gene flow inferred from modern DNA data [6,42].

  452. Trond Engen says

    That sounds like two distinct migrations. The first colonized the whole Canary, the second only the eastern islands. Should the substrate hunters look for two distinct substrates too?

  453. PlasticPaddy says

    @Lameen
    Re sub-Saharan component, at the risk of seeming politically incorrect, are we talking about trade goods (i.e., slaves) rather than “trans-Saharan migrations”?

  454. David Eddyshaw says

    I am delighted to discover that my son therefore seems to have married into ancient Berber stock.

  455. Trond: It would be interesting to re-examine the scanty attestations of Guanche and see if they can better be accounted for in terms of two languages rather than one.

    PlasticPaddy: It’s possible – at a rather more easterly location in modern Libya, Herodotus attests to slave raids southwards, and maybe that was already happening in Mauritania. But it can’t be taken for granted. Perhaps this reflects small preexisting Saharan populations (“Melanogaetulians”?), or occasional arrivals by pirogue from the mouth of the Senegal, or something else. In historic times, the Guanches were regular targets for European slave raids, but not as far as I know significant purchasers.

    David: A clear sign of his good taste. Wishing you many happy Macaronesian grandchildren!

  456. Quoth Lameen:

    As for Guanche, as Galand [here] reminded us recently, we don’t even know for sure that it is Berber; all we know is that it contains a number of obviously Berber words, alongside a number of very basic words that look nothing like Berber. That could reflect common ancestry, or it could reflect later contact.

  457. Trond Engen says

    I thought I remembered reading something like that.

  458. Another possible paleogenomic hint at the Indo-Iranian origins comes from the analysis of a 3-generation, post-Sintashta Alakul Culture clan burial complex at kurgan-1 of the Nepluyevsky necropolis, just 50 km N of Sintashta but a few centuries younger. These people are linked by distant relationships with the ancient graves from Altay and Tian Shan. The core of the patrilocal / exogamous family was a group of brothers sharing, surprisingly, Y-haplogroup Q-L275 (Q1b), of Siberian origins and present at lower levels in today’s Iranians and Indians (and in Iron Age Xinjiang), although 2 very distant male relatives with the more classic Y-haplogroup R1a were also buried near the top of their mound. One of the two wives of the clan’s leader also had substantial local Siberian / Central Asian ancestry, although her mtDNA was purely Corded Ware-like. The clan’s stopover at this location was short-lived, at most a few decades before they moved on.

    So here we have population mixing at the forest-steppe interface of Western Siberia, and autosomal and Y-DNA links to the locations much further East and South in the centuries that followed.

    https://www.pnas.org/doi/10.1073/pnas.2303574120

  459. Trond Engen says

    Oh, interesting! Is this intermarriage between ethnic groups in the Sintashta alliance, or could it be long-distance alliance-building?

  460. Dmitry Pruss says

    There is no Strontium isotope mobility analysis in the Neplyuevka paper. But it’s dated to Late Alakul when the distant migrations across the Eastern Steppe were already a fait accompli.

  461. John Cowan says

    I ran into yet another unfortunate rake today, attributed to Doris, Dee, Rose, and Bonnie Beetem:

    As I walked along on the sands of Arrakis
    As I walked along on Arrakis at noon,
    I saw a young Fremen all dressed in a stillsuit
    A-ridin’ a sandworm ‘way out on a dune.

    I see by your stillsuit that you are a Fremen,
    You see by my stillsuit that I’m a Fremen too.
    We see by our stillsuits that we are both Fremen.
    Climb down from your worm, I’ll share water with you.

    He was a Fedaykin, a Fremen commander.
    He was a Fedaykin, though only a lad.
    We’ll crush the cruel Baron and win for Atreides.
    The green and black banner will rule the jihad.

    I see in a vision this young Fremen dying.
    I see that he dies in a Sietch Tabr cave.
    He gives me his life and the tribe gets his water;
    Thirty-three liters–the rest to the grave.

  462. The second line surely ought to be: “Walked without rhythm on Arrakis at noon.”

  463. Some comments on Heggarty et al. by Andrew Garrett, a co-author of Chang et al., here. He makes several points which didn’t come up in this discussion.

    He also refers to a paper by Goldstein et al., Divergence-time estimation in Indo-European: The case of Latin, which uses similar methods to look at the diversification of Romance. Here the ground truth is much better understood.

  464. Trond Engen says

    Dmitry: There is no Strontium isotope mobility analysis in the Neplyuevka paper. But it’s dated to Late Alakul when the distant migrations across the Eastern Steppe were already a fait accompli.

    True, but intermarriage within the multi-ethnic Sintashta communities would be expected, so it’s interesting if for some reason it didn’t happen. It’s also interesting if Andronovo clans established long-distance relationships before their expansion.

  465. Trond Engen says

    @Y: Thanks!

    “I’ll maintain that what’s going on now is development of methods with IE as a test case. It’s just too bad that it has to be framed as solutions to IE problems.”

  466. I guess this is the best place to put this:
    New Indo-European language discovered.
    Well, it’s a new minor Anatolian language from the Boghazköy archives, so nothing too surprising, but I guess it’s worth notifying the Hattery.

  467. As I just wrote to Hat in an email:

    It seems that the Hittites collected the rituals (or centralized the cults) of the conquered peoples making up the empire. If so, more languages can surely be expected to come.

    The news about this specific language is a little premature, maybe, since all they can say so far is that it was spoken in a region called Kalasma, that it’s Anatolian, and that within Anatolian it seems to share features with Luwian.

  468. Hans commented as I was putting together a post on it, which is now up.

  469. Thanks. I took my comment over there. Might be fun trying to discuss a subject in the designated thread for a change.

  470. David Marjanović says

    It seems that the Hittites collected the rituals (or centralized the cults) of the conquered peoples making up the empire.

    Like the Romans, the Hittites believed that all deities anyone worshipped were real, so they had to worship them too or risk pissing them off.

    Unlike the Romans, the Hittites believed the worship had to be done in the original language. Quite a few languages are known today exclusively from prayers in a Hittite context.

  471. Like the Romans, the Hittites believed that all deities anyone worshipped were real, so they had to worship them too or risk pissing them off.

    The explanation I heard (from my middle school history teacher) was that the Pantheon, in particular, was meant to acknowledge and draw a wide spectrum of worshippers, but in centralizing them, make them subject to the same political power.

  472. BTW comments which don’t reflect the values of this community are kind of like eyesores…

    Indeed. I apologize and withdraw the remark.

    What is this yok of which you speak?

    I’m still wondering.

    JC: So perhaps it may be the case that our presently split population will never materialize into an actual division between merged and unmerged speakers?

    DM: Yes, but that is astronomically unlikely because it would require an improbably low rate of evolution along both of these branches.

    To restate: Linguist A believes that the population has split between KIT-FLEECE mergers and KIT-FLEECE non-mergers and this will be detectable a century from now. Linguist B says there is no evidence for any such split. Linguist A agrees that there is no evidence, but says the absence of a split would imply something improbable happened along both branches.

    What stops Linguist B from saying that there is no reason to believe that there are two branches at all, given the lack of empirical evidence as of now?

    (Theologian A says that it is wildly unlikely that every single person will be saved from Hell, and therefore universalism is almost certainly not true. Theologian B says there is no empirical evidence that Hell exists at all. How is Theologian A to refute this?)

  473. January First-of-May says

    What is this yok of which you speak?

    Presumably this: https://en.wiktionary.org/wiki/yok#Turkish

    (or probably rather some of its Central Asian cognates, but AFAICT those are usually spelled differently)

  474. How is Theologian A to refute this?

    By reminding B that theology is not linguistics.

  475. David Marjanović says

    So perhaps it may be the case that our presently split population will never materialize into an actual division between merged and unmerged speakers?

    DM: Yes, but that is astronomically unlikely because it would require an improbably low rate of evolution along both of these branches.

    […] What stops Linguist B from saying that there is no reason to believe that there are two branches at all, given the lack of empirical evidence as of now?

    Here’s the context again:

    The last 2 of the 3 paragraphs of 5.3 (“Estimating Chronology”) read like an angry reply to a reviewer:

    The chronological estimation is based on these two parameters c and σ, and the set of known date calibrations of all languages in the data set (§4). Note that this approach has nothing to do with the early and now discredited technique of ‘glottochronology’ in linguistics (75). That was founded on an assumed constant rate r of change/retention in cognacy. In the relaxed clock model used here, there is nothing — no parameter setting — that corresponds to any ‘glottochronological constant’, certainly not the lineage ‘birth rate’ and ‘death rate’ parameters mentioned in §5.2 above. Those refer to the birth and death of lineages, i.e. splits and extinctions respectively, in the phylogeny. They do not refer to changes or ‘mutations’ in the state of a data character switching between being absent (0) and present (1) along branches in the tree.

    This distinction is best understood also by clarifying the relationship between branching events in the phylogeny, and change events in language data. Strictly, branching events occur independently of, and are not defined by, actual changes in the language data. Once a split has arisen, however, the conditions are set in which different changes can then arise along the different branches. Similarly in actual language histories, a speaker population may split into two (by some long-distance migration, for example), but that need not automatically create language changes. It does, however, establish newly separated speaker populations whose language lineages may thereafter undergo different changes, and thus progressively diverge from each other.

    So you’re looking at this from the wrong side. None of this is meant to predict the future from the present. This part of the model simulates tree growth: Every once in a while, a split happens, or a branch stops growing. Given various parameters, classes of trees of different shapes and sizes will result.

    (I can’t find this on Wikipedia. Google Scholar finds plenty for fossilized birth-death model, though.)

    Some of these can’t fit the evolution of the data: if a character state changed along a branch, the branch had to exist first, so any tree that has that branch appearing too late is out. Splits must happen at the same time as or earlier than any changes to character states in the resulting branches.
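
    The constraint in the last paragraph can be put as a small filter. This is only a toy sketch, not the model from the paper: the class `CandidateTree`, the function `is_compatible`, and all the dates are made up for illustration. Ages are in years before present, so "earlier" means a larger number.

```python
from dataclasses import dataclass


@dataclass
class CandidateTree:
    """Hypothetical stand-in for one simulated tree."""
    name: str
    # Age (years before present) at which each branch comes into existence.
    branch_origin: dict


def is_compatible(tree, changes):
    """Keep a tree only if every change sits on a branch that already existed.

    `changes` is a list of (branch, age_of_change) pairs. A split must be at
    least as old as any change along the branch it creates.
    """
    return all(tree.branch_origin[branch] >= age for branch, age in changes)


# Invented example: two changes, one on each daughter branch.
changes = [("A", 3000), ("B", 2500)]
early_split = CandidateTree("early", {"A": 4000, "B": 4000})
late_split = CandidateTree("late", {"A": 2000, "B": 2000})

surviving = [t.name for t in (early_split, late_split) if is_compatible(t, changes)]
```

    The tree with the late split is thrown out because its branches would appear after the changes they are supposed to carry.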

  476. Similarly in actual language histories, a speaker population may split into two (by some long-distance migration, for example), but that need not automatically create language changes.

    Ah, so this is the key to it: split means a division of the population, and not (ut est eorum mos) a division of the language they speak. Now it makes sense.

  477. A new paper, Genome-wide variation in the Angolan Namib Desert reveals unique pre-Bantu ancestry, by Oliveira, Fehn, Amorim, Stoneking, and Rocha, also discussed at phys.org. The genetic news is that there is a very old genetic component which shows up in people of the Angolan Namib Desert, distinct from that of other Khoisan speakers (but apparently showing traces in Sandawe-speaking and other East African populations). The linguistic news is that there are still two speakers of Kwadi, thought to have been extinct since the 1960s.

  478. David Eddyshaw says

    Güldemann (IIRC) reckons that Khoe-Kwadi arrived in Southern Africa comparatively recently, and is possibly related to Sandawe, but not to the Tuu or Kxʼa families.

    https://en.wikipedia.org/wiki/Khoe_languages#History

  479. Forgive the elementary question, but where does the stress go in Sandawe?

  480. ” two speakers of Kwadi”

    Wow.

  481. LH: Sandawe is tonal, so Sàndàwé.

  482. Güldemann (IIRC) reckons that Khoe-Kwadi arrived in Southern Africa comparatively recently

    I was looking into the distribution of Y-haplotype A the other day (Y-haplotypes are transmitted through the male line only). The only populations in the world where it seems to account for a majority are the Dinka and Shilluk up in South Sudan, on the one hand, and the Nama and Tsumkwe San in Namibia, on the other. I have to guess that it must have had a reasonably high frequency among the first cattle herders in eastern Africa…

    ” two speakers of Kwadi”

    I second drasvi’s “wow”. Hope someone’s working with them!

  483. The linguist among the authors, Anne-Maria Fehn, appears to have done a great deal of fieldwork in the area, and on Khoe-Kwadi languages in particular. Surely she is working with them. “Despite the Kwepe’s recent shift to the Bantu language Kuvale, we identified the descendants of the Kwadi speakers recorded in 1965 by the linguist Ernst Westphal and found two women who still remembered a considerable amount of the now extinct language’s lexicon and grammar.”

  484. LH: Sandawe is tonal, so Sàndàwé.

    Thanks! I guess my question then becomes “How do people generally say it in English”; I just thought to check the OED, which turns out to include it as an entry and says /sanˈdɑːweɪ/.

  485. David Marjanović says

    split means a division of the population

    In effect, yes.

    Güldemann (IIRC) reckons that Khoe-Kwadi arrived in Southern Africa comparatively recently, and is possibly related to Sandawe, but not to the Tuu or Kxʼa families.

    Starostin, too, opposes Central Khoisan (Khoe-Kwadi) to Peripheral Khoisan (Tuu/South + Kx’a/North).

    Wow.

    Thirded, even if they’re “last hearers” more than “last speakers”.

    I was looking into the distribution of Y-haplotype A the other day

    As the article makes clear, Haplogroup A is not a clade – it’s paraphyletic with respect to all other Y-haplogroups (a clade collectively called BT).

  486. As the article makes clear, Haplogroup A is not a clade

    Thanks – I should have looked more closely: “The subclade A1b1b2b (M13; formerly A3b2) is primarily distributed among Nilotic populations in East Africa and northern Cameroon. It is different from the A subclades that are found in the Khoisan samples and only remotely related to them”. (Though nothing cited in the article seems to support “northern Cameroon”.)

  487. “A1b1b2b (A-M13)
    The subclade A1b1b2b (M13; formerly A3b2)”

    This is absolutely confusing.

    Anyway, A1b1b2 and A1b1 seem to be found among both Dinkas and Khoisan :

    Haplogroup A3b2-M13 is common among the Southern Sudanese (53%),[22] especially the Dinka Sudanese (61.5%).[30] Haplogroup A3b2-M13 also has been observed in another sample of a South Sudanese population at a frequency of 45% (18/40), including 1/40 A3b2a-M171….

    ….

    A1b1a1a (A-M6)
    The subclade A1b1a1a (M6; formerly A2 and A1b1a1a-M6) is typically found among Khoisan peoples. The authors of one study have reported finding haplogroup A-M6(xA-P28) in 28% (8/29) of a sample of Tsumkwe San…

    A1b1b2a (A-M51)
    The subclade A1b1b2a (M51; formerly A3b1) occurs most frequently among Khoisan peoples (6/11 = 55% Nama,[20] 11/39 = 28% Khoisan,[23] 7/32 = 22% !Kung/Sekele,[20] 6/29 = 21% Tsumkwe San,[20] 1/18 = 6% Dama[20]). …

  488. David Marjanović says

    This is absolutely confusing.

    Oh yes. I’ve often wondered how Y-haplogroup nomenclature works – though never quite enough to look it up.

  489. Wow indeed!

    Anne-Maria Fehn

    I recognized the name and thought that this must tie into another paper on the Namib desert we discussed recently, but it turns out it’s the same one. Discussed here.

  490. I think you mean: Discussed here.

  491. Yes, thanks..

  492. “I’ve often wondered how Y-haplogroup nomenclature works – though never quite enough to look it up.”

    Names like “a1b1..” look simple, I suppose they represent branching in a binary tree.
    Then a3b2 also makes sense.
    I suppose they did not know which of a1, a2, a3 is a child of which.

    But as a result the same group is referred to by multiple names :-( I see two ways to sort it out: assign unique names to mutations, or add a date to a3b2…

    “The subclade A1b1b2b (M13; formerly A3b2) is primarily distributed among Nilotic populations in East Africa and northern Cameroon. It is different from the A subclades that are found in the Khoisan samples and only remotely related to them”.

    The closest relative of Dinka
    A1b1b2b must be Khoisan
    A1b1b2a
    The question is how many kya. But it does imply a migration.
    And this migration can be arbitrarily more recent than these two groups (what if they initially populated two valleys?).
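
    The reading of the names suggested above (each extra letter or digit is one more step down the tree) can be sketched like this. It is only an illustration of that guess, assuming every step is a single character; the function name `deepest_shared_node` is made up, and real haplogroup labels need not follow the rule so neatly.

```python
def deepest_shared_node(label_a, label_b):
    """Return the longest common prefix of two hierarchical labels.

    Under the assumption that each character is one branching step,
    this prefix names the last node the two lineages have in common.
    """
    shared = ""
    for x, y in zip(label_a, label_b):
        if x != y:
            break
        shared += x
    return shared


dinka_vs_nama = deepest_shared_node("A1b1b2b", "A1b1b2a")
dinka_vs_tsumkwe = deepest_shared_node("A1b1b2b", "A1b1a1a")
```

    On this reading A1b1b2b and A1b1b2a part ways at A1b1b2, one step up, while A1b1a1a already branched off at A1b1.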

    Hm. It just occurred to me that one of the two must always be paraphyletic…

  493. I don’t get that last point.

  494. I discussed the Namibia genetics/ linguistics paper when it first appeared as a preprint in winter
    https://www.biorxiv.org/content/10.1101/2023.02.16.528838v1
    My discussion is in Russian which you may dare to auto-translate with the network’s own stupidest translation
    https://www.facebook.com/dmitry.pruss/posts/pfbid0j4sY57R84wAAm5aNx93rK46BL7qjLWXoQ4DQXyNKJfbRWKTmTuZTW3oXhh7QqNzLl
    but the huge surprise about the rediscovery of Kwadi was shared in my February post, too.

    As to the Y-haplotype nomenclature, one may need to remember that DNA sequencing used to be very expensive, and whenever possible, the researchers used variation in DNA fragment lengths as a cheap proxy for the variation in their nucleotide content (sequence). In those bygone years, the markers of choice were STRs (short tandem repeats, where the same string of 3, 4, or sometimes 2 letters is repeated within a DNA fragment, and the number of such repeats, and therefore the length of the fragment, differs from person to person). Actually, although STRs long ago fell out of use in research, law enforcement still uses this ancient tool.

    Anyway, an STR marker doesn’t come in just 2 or 3 versions. There may be dozens of distinct repeat numbers (and sizes). Because of such abundance of sizes, in those days, Y-chromosome researchers would sometimes discover, and name, multiple forks in their phylogenetic trees. To make matters worse, the STR marker sizes change a little from generation to generation, making the naming-by-STR-length science inherently inexact. As new genotypes poured in, the haplotype definitions were from time to time reconsidered.
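
    To make the STR idea concrete, here is a toy sketch of counting repeats. The sequences, the GATA motif and the function name `longest_repeat_run` are invented for illustration; real STR typing measures fragment lengths in the lab rather than scanning strings.

```python
import re


def longest_repeat_run(sequence, motif):
    """Count how many times `motif` repeats back to back in its longest run."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence)]
    return max(runs, default=0)


# Two made-up individuals with the same flanking sequence
# but a different number of GATA repeats.
person_a = "TTC" + "GATA" * 9 + "CCG"
person_b = "TTC" + "GATA" * 11 + "CCG"

repeats_a = longest_repeat_run(person_a, "GATA")
repeats_b = longest_repeat_run(person_b, "GATA")
length_difference = len(person_b) - len(person_a)
```

    The two fragments differ in length by exactly two motif lengths, which is the size difference that the cheap old methods could detect without reading the sequence.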

    In modern technological settings, sequencing rules, and Y-haplotype phylogenies nowadays generally come only with bifurcations. The multiple-branch tree concepts of the past are gone. The contested changes in the tree’s very shape are gone too, because in comparison to the shape-shifting STRs, single-nucleotide sequence variations are pretty much immutable. But the conflicting naming systems are still with us, largely because there is a huge commercial dimension in Y-genotyping, not just a research one.

    The best free site to try to make sense of Y-DNA diversity is probably yfull. For example, at https://www.yfull.com/tree/A-M13/ you can immediately grasp that A-M13 differs from nearby branches by a whopping 367 DNA changes, and that it took about 30,000 years to accumulate all these changes.
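
    As a back-of-the-envelope check on those two numbers, one can divide them. This is only a naive average using the figures quoted above, not a description of how yfull actually dates its branches.

```python
# Figures as quoted in the comment above.
snp_count = 367          # DNA changes separating A-M13 from nearby branches
elapsed_years = 30_000   # approximate time taken to accumulate them

# Naive average: assumes the changes accumulated at a steady pace.
years_per_snp = elapsed_years / snp_count
```

    That comes out at roughly one change every 80-odd years along this branch.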

  495. Dmitry, any idea why they didn’t include the usual tree diagram, with divergence time and admixture estimates?

  496. I think that their goal is to display all contemporary and ancient samples which went into each tree branch model, and it’s probably too rich to display graphically (and wouldn’t be searchable to boot). For example, A-M13 branch is based on 141 samples. Most of them are from Arab countries, because males there volunteer to share expensive full sequences at a greater rate than Sub-Saharan Africans. But Kenya is also represented (as well as Sudan and Chad).

    Admixture is not an issue with Y which doesn’t recombine. Divergence times are given as branch-emergence dates (in this case between 41 and 48 thousand years). There is also a smaller TMRCA estimate, which is when all known sub-branches converge.

    If you want to move up a notch or a few, just click on the tabs above the tree. There are 6 of them, corresponding to 6 bifurcations known earlier in history. Say, going one branchpoint deeper to https://www.yfull.com/tree/A-YP4735/ gives you emergence ~55,000 years ago and the first known split ~44,000 years ago. Sister branches to A-M13 are also mostly attested from the Arabic countries, but also from Ethiopia. And so on…

    Say, continuing deeper to https://www.yfull.com/tree/A-M32/ (age 125,000 years) you start seeing South African and Namibian samples.

  497. Two ancient DNA preprints on the genesis of Uralic (and Yeniseian) languages, and my and Trond’s long-time favorites, the itinerant bronze-makers of Seima-Turbino and Ymyyakhtakh cultures (the latter simplified in spelling as Ymyakhtakh, and then simplified even further as Yakutia Late Neolithic / Bronze Age aka Yakutia_LNBA)
    https://www.biorxiv.org/content/10.1101/2023.10.01.560195v1.full.pdf
    https://www.biorxiv.org/content/10.1101/2023.10.01.560332v1.full.pdf

    Ymyyakhtakh is born in a sex-based displacement/admixture of Yakutia’s local Syalakh-Belkachi peoples and intrusive populations from Transbaikalia. The mixture is about 50:50. The timing is about 4.5 thousand years ago. The Transbaikalia males contribute the N-haplogroup Y-chromosomes which will later serve as a marker of the Uralic peoples. The autosomal DNA of the Yakutia_LNBA (Ymyyakhtakh) mixed population turns out to be another marker of the genesis and spread of the Uralic peoples (who turn out to have received their Eastern genetic components solely from Ymyyakhtakh).

    On the other, Western side of Lake Baikal, about 5.1 thousand years ago, a different admixed population takes shape, its DNA comprised of ancient Paleo-Siberian and Inland North Asian components. Y-haplotype Q-YP1691 is widespread among them, and both this Y-chromosome and, especially, autosomal DNA serve as a marker of the past spread of the Yeniseian languages (which used to be located further up the Yenisei river than their remnant, the Ket). Today they are mostly dissolved in the South Siberian Turkic peoples who moved in and assimilated the Yeniseian peoples (but some of their DNA is found further North due to Samoyedic interaction and to the Yakut and Dolgan’s migrations far out of Southern Siberia).

    Both preprints concur that Seima-Turbino was a multiethnic / multicultural corporation (but we were certain about it from the archaeological data already, since the ST combine Sintashta-derived bronze metallurgy with Forest Siberian bone / horn and pottery technologies). But we didn’t know if the “Steppe” and “Taiga” segments of the ST existed alongside each other with familial mixing / were fused together into one family / remained an unequal agglomeration of masters and slaves… Now it looks like the ST of Rostovka and several nearby sites were mostly of recently three-way mixed stock (Ymyyakhtakh + Sintashta + local taiga hunter-gatherers), extremely diverse, albeit a few of them were “pure Siberians” and “pure Sintashta”. Further East and closer to Yenisei, an ST burial of Tatarka Hill yielded only genetically Yakutia_LNBA individuals. At the Western-most site at Satyga-16 (Middle Ural Mountains), the mixing is the most uniform. The particular subgroup of Yakutia_LNBA populations present at Tatarka Hill, and getting admixed at Rostovka and further West, is specifically enriched in the Uralic Y-haplogroup N-L1026 > Z1936. Both preprints assume, therefore, that Seima-Turbino was a link between Yakutia and the later Uralic populations (and further speculate that proto-Iranian influences in Uralic were mediated not so much by the Sintashta contact zone, but more by the shared enterprise of Seima-Turbino).

    In addition to the aforementioned 3-way admixture, a couple Seima-Turbino individuals came from the presumed proto-Yeniseian Cisbaikalic population, and one from as far West as the Baltics.

  498. David Marjanović says

    Slowly the fog clears…

  499. Trond Engen says

    I just knew the Ymyyakhtakh were up to something, but I didn’t suspect this.

    A few quick notes as a start:

    The Cis-Baikalians are a surprise to me. Not as ancestors of the Yeniseians, especially, but as close kin of the ancestors of Ancestral North Americans. The probable homeland of American language families turning up west of the homeland of Uralic is bonkers.

    We already knew that Ymyyakhtakh spread north and then widely both west and east through Northern Eurasia. What I think they show here is that one group, let’s call it the southwestern branch, came in touch with Sintashta and local HG peoples and decided to join forces. This soon spread almost continent-wide in the forest/forest steppe belt. The authors wonder about the cultural mechanisms behind that, and so do I.

    I don’t think the Kola Peninsula specimens are Seima-Turbino. I suspect that their Ymyyakhtakh genetic heritage (which we already suspected) came through the northern route.

  500. David Marjanović says

    The probable homeland of American language families turning up west of the homeland of Uralic is bonkers.

    It does make it easier to connect Uralic and Yukagir in some way… and Yeniseian with Burushaski or, uh, Hattic.

  501. Dmitry Pruss says

    The Cis-Baikalians are a surprise to me. Not as ancestors of the Yeniseians, especially, but as close kin of the ancestors of Ancestral North Americans. The probable homeland of American language families turning up west of the homeland of Uralic is bonkers.

    It isn’t so much about the homeland of American peoples, but about the similar principal ancestral streams which gave rise to both the Cisbaikalians and some early Americans. But the sources are much further East. What they call “Inland NE Asians” is exemplified by Yumin (8.4 kya from Inner Mongolia, only about 400 km from Yellow Sea). I assume that some Yumin-like peoples traveled West and others, North-East.
    There is a set of useful maps at page 39 of the PDF.

    I don’t think the Kola Peninsula specimens are Seima-Turbino

    Most definitely not. The BOO sounds Halloween-like but just stands for Bol’shoy Oleniy Ostrov, Big Reindeer Island, and the first DNA samples from there have already been known for years. In some superficial way they look similar to Seima-Turbino, being an East-West mix of somewhat similar components, but there are clear differences. For one thing, BOO lack the European Neolithic farmer ancestry, and, therefore, can’t be descended from Abashevo / Sintashta. They are also rather uniformly mixed, indicating an earlier population fusion.

    one group, let’s call it the southwestern branch, came in touch with Sintashta and local HG peoples and decided to join forces

    Exactly. And culturally mysterious it may seem, but it wasn’t as innovative as it looks. Similar professional amalgamations of different tribes were already a thing at the Abashevo and Sintashta stages. The classic Volga riverbank burial of the Abashevo war dead clearly tells the same “technological joint venture” story some centuries earlier.

  502. Trond Engen says

    @Dmitry: I got short on time, and the quick notes got too quick. Thanks for straightening it out.

    It isn’t so much about the homeland of American peoples, but about the similar principal ancestral streams which gave rise to both the Cisbaikalians and some early Americans. But the sources are much further East. What they call “Inland NE Asians” is exemplified by Yumin (8.4 kya from Inner Mongolia, only about 400 km from Yellow Sea). I assume that some Yumin-like peoples traveled West and others, North-East.
    There is a set of useful maps at page 39 of the PDF.

    I obviously hadn’t read the discussion of the Cisbaikalians well enough. I read it as suspecting a close but unsampled population would be a good fit. The Yumin people could probably be the very same that went north and admixed in the paper on ancient admixture in North Asian populations. I’ll have to dig up that again.

    I agree that the map is good. The illustrations are excellent all over. If I had a wall above my desk, the chart of cultures and technological phases at page 41 would be there. It’s not beautiful, but it’s very useful.

    The BOO sounds Halloween-like but just stands for Bol’shoy Oleniy Ostrov, Big Reindeer Island, and the first DNA samples from there have already been known for years. In some superficial way they look similar to Seima-Turbino, being an East-West mix of somewhat similar components, but there are clear differences. For one thing, BOO lack the European Neolithic farmer ancestry, and, therefore, can’t be descended from Abashevo / Sintashta. They are also rather uniformly mixed, indicating an earlier population fusion.

    Yes, that’s what I meant to say.

    And culturally mysterious it may seem, but it wasn’t as innovative as it looks. Similar professional amalgamations of different tribes were already a thing at the Abashevo and Sintashta stages. The classic Volga riverbank burial of the Abashevo war dead clearly tells the same “technological joint venture” story some centuries earlier.

    Exactly. There was an era of several centuries when a long belt in the middle of the continent was dominated by apparently internally peaceful, multi-ethnic, amalgamating communities of local residents and long-distance travelers from different cultures. That’s very different from the population turnover seen on the western edge.

    Or is that really how we should see the emergence of Corded Ware as well? The ethnogenesis of the Yamnaya somewhere near the lower Volga (which must have happened just after the Samara people started to make copper)?

  503. Dmitry Pruss says

    “Internally peaceful” is a good choice of words given that much of the technological power of these groups was intended for warfare. But I always wondered if some of the Seima-Turbino craftsmen might have been slaves or slave-like captives. Now it appears increasingly likely that they were all family, from how the gene pools mixed and the divergent paternal and maternal DNA lines remained. Maybe that’s the explanation for the short-lived nature of the ST phenomenon, some sort of a problem of lack of “blood cohesion” / “blood affiliation with a larger tribal structure”??

    Two more observations on economic niches.

    Cisbaikalian_LNBA Serovo culture and related Angara river peoples left behind numerous petroglyphs of fish, perhaps indicating continuity with the present-day Ket whose livelihood was still based on fishing below the enormous Siberian rapids.

    Yakutia_LNBA Ymyyakhtakh are reported to have produced tin bronzes, and perhaps it was their access to tin which gave them the competitive edge early on. Rich cassiterite deposits are out there in Eastern Yakutia, in Yana and Indigirka Basins. I wonder if their bronzes have been analyzed by metal admixture and isotope content to pinpoint the origin of their materials…

  504. Trond Engen says

    Access to tin trade seems like an obvious advantage.

    I’m thinking about the system for discovering metal resources and making them available. There are special skills involved: Discovery of resources, extraction technology, metalworking, distribution. One thing is random discoveries, but the Eurasian continent was pretty much sieved through in a few centuries. I wonder how much of this was planned efforts involving visiting specialists or young people traveling out to learn. Maybe an era of traveling metal specialists established systems of hospitality and cooperation that could be extended to multiethnic societies of metal workers.

  505. Are there any medieval professional societies that cross language borders?

  506. David Eddyshaw says

    “The Barber-Surgeons of Dublin was the first medical corporation in Ireland or Britain, having been incorporated in 1446 (by Royal Decree of Henry VI).”

    https://en.wikipedia.org/wiki/Royal_College_of_Surgeons_of_Edinburgh

    Prior to the end of the Roman Empire in 1453, so mediaeval. Got in under the wire.

  507. Well, I think by “society” I mean anything from an actual organisation with hierarchy and everything to itinerant craftsmen (unless we treat the latter as an ethnicity or caste – say, based on endogamy).

    As for language borders, the most extreme case would be something like Jewry and other communities with diasporas, just on professional rather than ethnic (I don’t know how to apply religion here – what if these professionals have a dedicated god in polytheistic society?) basis.

  508. Dmitry Pruss says

    I wonder how much of this was planned efforts involving visiting specialists or young people traveling out to learn
    I only read reports of a much later era, from the early days of Russian colonial exploitation of today’s Yakutia in the 1600s in the transcripts of surviving Siberian Prikaz paperwork. With the sable nearing extinction, bands of Cossacks roamed the rivers and tributaries in search of something, anything, to plunder, including escapees from the amanat system (whole tribal bands took off to the wilderness when it became impossible for them to pay yasak tribute), walrus tusks and any reports of unusual veins of rock. The Yakuts were better organized and quickly realized that by delivering fabulous gifts to the Czar himself, they could arrange better terms for their yasak, like using second-rate quality furs in lieu of sables. In contrast, many tribes around them were driven to starvation and extinction by impossible tributes. But the point is, the roaming Cossacks didn’t know much but reported everything unusual they saw or heard about minerals. Only then, the more knowledgeable people would sometimes investigate. The idea wasn’t to go there to exploit some natural riches. It was to take hostages from the locals so their kinsmen pay yasak in whatever riches were there. Perhaps a similar economic model could have been in use in the Bronze age already? Use superior arms to defeat faraway locals and make them pay tribute in necessary raw materials…

    The archeologists who uncovered Allakakh copper smelters and bronze metallurgy camps of the 1200s BC (a Northwestern-most offshoot of Ymyyakhkath at the outskirts of today’s Norilsk) assumed that the artisans relied on river transportation by canoes (primarily because they uncovered large quantities of birch tar which was used to seal birch-bark canoes, but also because raindeer weren’t domesticated yet). But perhaps dog-sleds were used too?

  509. David Marjanović says

    Yakutia_LNBA Ymyyakhtakh are reported to have produced tin bronzes, and perhaps it was their access to tin which gave them the competitive edge early on. Rich cassiterite deposits are out there in Eastern Yakutia, in Yana and Indigirka Basins.

    Oh.

    at the outskirts of today’s Norilsk

    That’s insane. Our whole idea of the Bronze Age was fundamentally wrong from sea to frozen sea.

  510. J.W. Brewer says

    I’m not sure what “professional society” would mean in a medieval context but would e.g. the Knights Templar qualify? (The sources that say the ones who weren’t L1 French-speakers learned enough French to manage it as an intra-Templar lingua franca seem offhand like the most plausible ones to me.) See also this re how their longer-lived rivals broke down on ethnolinguistic lines: https://en.wikipedia.org/wiki/Langue_(Knights_Hospitaller). And yet when members from all over with different L1’s congregated in Rhodes or Malta or whatever (depending on the century) they must have had some sort of common L2 as a working language.

  511. @drasvi:

    i’m not sure whether we can talk about meaningful, much less clear, lines between craft/profession and ethnicity before the early modern period – and i tend to think that even there it’s a stretch. the tracking of particular lineage groups to particular crafts is very strong almost everywhere until very recently, to a degree where it’s hard to say which one defines the field of possibility of the other. it’s now much less rigid, and persists across generations less, but in a particular place and time it’s still a very strong dynamic: bodega owners in much of nyc as almost axiomatically yemeni and palestinian, for example, or the concentration of patels* as hotel operators in the u.s. southeast (very true in the 1990s, not sure how much so now).


    internally peaceful, multi-ethnic, amalgamating communities of local residents and long-distance travelers from different cultures

    this seems like a prototypical description of the kinds of non-state societies that have had such a long (if fluctuating) history in roughly that geography!

    .
    * itself a caste (hereditary and occupation-linked) label turned surname.

  512. J.W. Brewer says

    Within living memory (meaning my own, and I didn’t move to the NYC area until I was 27 years old), NYC bodega owners were stereotypically Korean-immigrant. But the strategy from the get-go was for their American-born kids to pursue higher-status economic niches rather than take over the bodega and pass it on in turn to their own kids. Thus the occupation rapidly transitioning to a newly-arrived “lineage group” in search of a niche.

  513. absolutely! i don’t expect the kids of the teenage afterschool-shift workers at my bodega to have the same jobs when they’re the same age. as i understand it, the rough (and overlapping*) sequence has been:

    – greek / yiddish jewish (in the candy store/corner grocery form, and the “deli” version with a sandwich counter);
    – puerto rican / dominican (marking the emergence of the “bodega” as we know it);
    – korean (mostly in the versions closer, in my worlds’ vernacular taxonomy, to a “greengrocer” than a classic bodega);
    – yemeni / palestinian.

    i’m very interested to see who inherits the role next!

    .
    * my first spot in brooklyn, in 2007, had an arab-run bodega on the corner, a korean-run bodega/greengrocer a block north, and a dominican-run bodega/deli a block south, plus a palestinian-run grocery store within a block.

  514. In San Francisco, Palestinians have been the stereotypical corner store (= ‘bodega’) owners for decades, except of course in Latino and Asian neighborhoods. The cheap coffeehouse renaissance, starting in the 1980s/1990s, was also largely due to Palestinians. Donut shops were almost all Cambodian, but these days there are some Chinese ones.

  515. >But perhaps dog-sleds were used too?

    All branches of Indo-European share cognate terms for the key element of a dog-pulled travois. That’s why theorists believe the family must have diverged shortly after the inventions of dog traction and spun pottery (whence our word wheel.). At least I think that’s the story.

  516. “Generally, however, it is not difficult to detect the clan allegiance of a website”
    (link)

    I have no idea if the article is interesting, I just found the phrase hilarious.

  517. I remember there was some… Austronesian? New Guinean? society where occupations were associated with ancestry [both paternal and maternal] but due to exogamy their situation was described by some anthropologist as: every member belongs to [nearly] all castes at once.
    I can’t remember what it was:(

    (possible objections to applying “caste” here are obvious – perhaps I just read it in WP “caste” but can’t find it there now)

    rozele, well, ethnicity itself is a fluid notion.

    I was thinking about modern examples of what Trond was speaking about, “multiethnic societies of metal workers.” Perhaps he simply referred to multi-ethnic societies of metal workers in the sense “possessed knowledge”, “there were some metal workers among them” – but I read it as a professional group that includes people of different ethnicities (and presumably crosses borders).

    Jews are an obvious example of a group who could cross (for example) the Mediterranean Christian-Muslim rift* (also there were e.g. fishermen – but I don’t know whether they succeeded in crossing military borders) but even though there were and are “Jewish professions”, what makes a person a Jew is not her profession.

    Perhaps
    (1) we could consider a group that practices marriage outside of the group less “ethnic”.
    (2) apprenticeship can be a way of recruiting members from outside the group (or also slavery)
    (3) there are men’s and women’s professions.
    Perhaps many of the latter are hereditary, but I’m not sure if marriage restrictions (like “a daughter of a [female] healer must marry a son of a [female] healer and [male] blacksmith”) were common for those

    * I’m frequently irritated with how the south and north coasts are seen as two different worlds with independent histories, where one of the two “histories” once spanned both shores but withdrew from the south with Muslim conquest…

  518. January First-of-May says

    but also because raindeer weren’t domesticated yet

    Reindeer. But apparently “raindeer” is an actual old spelling; and indeed Google Books finds many 18th and early 19th century examples.
    Apparently not related to either rain or (probably) reins. Previously on LH.

    I was surprised that the domestication of reindeer was this late; but of course 1200 BC is not exactly “late”. Apparently the usual estimates date it to the 1st millennium BC, so indeed slightly after that particular period – though even after domestication the idea of using them as draft animals might have been nontrivial.
    (OTOH AFAICT the “usual estimates” refer to regions well to the west of Taymyr; there were apparently two or three independent domestication events.)

  519. David Eddyshaw says

    Ethnicity is often associated with particular occupations in West Africa. Among Songhay, blacksmiths are often Tuareg, and griots are often Fulɓe, for example. Aid workers are traditionally French …

    Kusaasi don’t usually do cattle-farming; locally, that’s generally the preserve of Fulɓe or Mossi. However, the cultivator versus pastoralist thing is a whole other issue. (And the Fulfulde word for “cow” seems to have been borrowed by practically everybody in the savanna zone, unless you can manage to believe that it’s a Niger-Congo common inheritance.)

  520. Reindeer in Spein, dear… (not sure if English speakers are as familiar with the line as Russians)

    @DE, not only West Africa. Cf. https://en.wikipedia.org/wiki/Caste_systems_in_Africa

    _____
    Regarding the table in Madhiban#Cognate_castes_in_East_Africa (also in the link above in #Somali. “Cognate”?):

    I remember, when reading about al-Akhdam/Muhamashīn in Yemen I found:

    Similarly, in correspondence sent to the Research Directorate, the President of LDDH [1] stated that [translation] “[t]he Al-Akhdam minority living in Djibouti is an Arab ethnic group from Yemen” (President 28 Dec. 2014). The Joshua Project, a research project that gathers information about ethnic groups around the world to support Christian missions (Joshua Project n.d.a), also states that the Al-Akhdam are Yemeni Arabs (ibid. n.d.b). Other sources note the presence of the Al-Akhdam minority in Yemen (IDSN 3 July 2013; UN 1 Nov. 2005).

    A former member of the Djibouti Republican Guard, who now lives in Belgium and who comes from a Yemeni Arab family, wrote to the President of ARDHD that, in addition to Yemen and Djibouti, the Al-Akhdam are present [translation] “in countries on the Gulf [of Aden], in Somalia and in Ethiopia, where they are called Midgans” (ARDHD 15 Dec. 2014b). Corroborating information could not be found among the sources consulted by the Research Directorate within the time constraints of this Response.

    I don’t know if the speaker implied merely functional equivalence of unrelated populations, or shared ancestry or what – some confusion is happening here. I guess Indian castes also are not “groups of people that share ancestry”, merely groups of groups who share.

  521. John Cowan says

    The OED found the OE cognate of hreinn, the ancestor of the first morph in reindeer exactly once:

    Compare Old English hrān ‘reindeer’, attested only in the late 9th-cent. account of Norway obtained by Ælfred from a Norwegian called Ohthere and interpolated in the Old English translation of Orosius:

    He [sc. Ohthere] wæs swyðe spedig man on þæm æhtum þe heora speda on beoð, þæt is on wildrum. He hæfde..tamra deora unbebohtra syx hund. Þa deor hi hatað hranas; þara wæron syx stælhranas, ða beoð swyðe dyre mid Finnum, for ðæm hy foð þa wildan hranas mid.

    So “these animals he called hranas”. It’s not even clear if this is a legit OE word that survived only here, or an etymological nativization of what Ohthere said, or what.

  522. Good point — that’s a very shaky basis on which to postulate an OE word.

  523. Trond Engen says

    Caste systems. I had already been thinking about how the three-or-so-part joint ventures of Sintashta and Seima-Turbino might have served as a prototype for the organization of society when the Proto-Indic peoples migrated south through Central Asia. It may all start as a mutual understanding. “If we keep doing this and you keep doing that, there will be room for both of us, and we’ll all be better off. And your sons marry our daughters, and your daughters marry our sons, and we will live happily ever after.”

    Sometimes such systems will break down after a few generations, without anyone missing them, as families intermarry and young people learn new trades. Other times they become institutionalized. In India they became — after an initial phase of intermarriage — endogamous sub-societies with inherent social status and hereditary access to positions and trades, enforced through millennia by strict moral codes and taboos.

    However common they are globally, I think the Early Bronze Age is the first time we really see them rise out of nothing and be clearly visible in archaeology as apparently stable local societies and regional systems organized around multi-ethnic alliances. But there’s a predecessor of everything, so maybe there will be more, and older, cases once we start looking. Sintashta had its predecessor in Abashevo-Fatyanovo. Corded Ware and Globular Amphora co-existed for many generations. Does it mean that they tried something similar? If they did, how did it play out? Did the Caspian harvesters and the Samara hunter-gatherers/copper suppliers form an alliance?

    What does the availability of this type of arrangement mean for the interpretation of some very different outcomes in the same era, like the near-full replacements of the male line in Iberia, Britain, and now far northeast Yakutia?

  524. These “joint ventures” sound very much like the kind of social setups you find in parts of the western Sahel – Fulani do the herding, Bozo do the fishing, Gabibi do the farming, etc.

  525. Dmitry Pruss says

    In the era of states, even in antiquity, there was nothing unusual about ethnic specializations within the militaries (and, we can be assured, among the military suppliers). In the pre-state societies, of course there were ethnic specializations in both artisanship and ways of warfare, too, but we generally expect these specialties to be spatially isolated, and only combined by trade and by alliance-building. We generally don’t expect the pre-state artisans of different ethnic stripes to live, work, intermarry, and migrate together. That’s what made the “business models” of the Abashevo / Sintashta, and especially Seima-Turbino, so unusual. Not merely specialization in complicated crafts.

    And, yes, I noticed my misfortunately spelled reindeer, and SEVERAL misspellings of the spelling-defying Ymyyakhtakh, but it didn’t really impact the meaning and I was like, so be it. Surprised that nobody weighed in on the Yeniseian story BTW.

  526. Trond Engen says

    Dmitry: We generally don’t expect the artisans of different ethnic stripes to live, work, intermarry, and migrate together. That’s what made the “business models” of the Abashevo / Sintashta, and especially Seima-Turbino, so unusual. Not merely specialization in complicated crafts.

    Thanks. I struggled to formulate that.

    Surprised that nobody weighed in on the Yeniseian story BTW.

    I will. I just haven’t gotten there yet!

  527. There is also the system/division nomads-farmers-people of cities – discussed by Ryan (as a bipartite division, though) in the context of the Great Steppe, also in the context of Arabia and matching the traditional division of Arabic dialectology.

  528. Trond Engen says

    Me: But there’s a predecessor of everything

    E.g., Dmitry introduced the idea of a possible Baikal genetic element in Uralic in 2020. I was sceptical. I was wrong.

    That discussion also mentions the Taymyr copper ore. There were actually two contemporary mining enterprises, one at the western side near the mouth of the Yenisei and one at the eastern side. They were operated by different archaeological cultures. The eastern one is Ymyyakhtakh, the western is some kind of Central Siberian. I have had two speculations about that. One is that the Yeniseians monopolized the river between the ore and the coppersmiths of the Altai (and an origin of the Yeniseians in the taiga west of the Baikal would be compatible with this). The other is that Proto-Samoyeds supplied Sintashta or Seima-Turbino through the Taiga belt. The two are not mutually exclusive.

    I also suggested Bronze Age Taymyr as the Uralic-Yukaghir contact zone. We’re now in the situation that both a deep common origin (in early Ymyyakhtakh) and extensive contact may be true.

  529. John Cowan says

    endogamous sub-societies with inherent social status and hereditary access to positions and trades, enforced through millennia by strict moral codes and taboos

    In other news, Bihar state in India has just held the first full caste census since 1931 (i.e. since independence) and the results were a jolt. When you add up the members of “backward castes” (socially and economically), “extremely backward castes”, “scheduled castes” (i.e. “untouchables”), and “scheduled tribes” (i.e. aboriginals) you get to 84% of the population, leaving only 16% as members of “forward castes”. The result is that even though 50% of political representation is reserved for these four caste groups, they are still under-represented. Now pressure is spreading to have a nationwide caste census and adjust many things accordingly.

    (Some have asked why representation should be proportional to the size of caste groups. The answer in general is that it is for the same reason that representation in a federal system should be proportional to the size of federal subjects: a lack of common interest.)

  530. Some have asked why representation should be proportional to the size of caste groups.

    We have similar bones of contention in NZ — and there’s a general election right now, so you can imagine the undercurrents from Parties of a certain colour.

    Half the Parliament is elected according to a national Proportional Representation by Party — so reserving a small number of the other half, geographic-based seats for Māori ends up having no imbalancing effect at all. If a Party were to get a disproportionally high number of seats from Māori-allocated geographical seats, they’d just get no Party-list (proportional-based) members; if that still doesn’t meet proportionality, _other_ Parties’ allocations of seats would get increased.
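
    For what it’s worth, the arithmetic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual electoral rules: the function names, the vote figures and the 10-seat house are all made up, the Sainte-Laguë divisor method is used to work out each Party’s proportional entitlement, thresholds are ignored, and an overhang is assumed simply to enlarge the house.

```python
def sainte_lague(votes, seats):
    """Share out `seats` proportionally using Sainte-Laguë divisors 1, 3, 5, ..."""
    alloc = {party: 0 for party in votes}
    for _ in range(seats):
        # The next seat goes to the party with the highest current quotient.
        winner = max(votes, key=lambda p: votes[p] / (2 * alloc[p] + 1))
        alloc[winner] += 1
    return alloc


def mmp(votes, electorates, house):
    """Hypothetical top-up: list seats fill the gap between entitlement and electorates won."""
    entitled = sainte_lague(votes, house)
    result = {}
    for party in votes:
        won = electorates.get(party, 0)
        # A party that already holds more electorates than its entitlement
        # gets no list seats; it keeps the extra electorates (an overhang).
        list_seats = max(0, entitled[party] - won)
        result[party] = {"electorate": won, "list": list_seats, "total": won + list_seats}
    return result


# Invented figures, for illustration only.
votes = {"A": 500, "B": 300, "C": 200}
electorates = {"A": 2, "C": 3}
result = mmp(votes, electorates, house=10)
house_size = sum(r["total"] for r in result.values())
```

    With these invented numbers Party C wins three electorates against an entitlement of two, so it gets no list seats and the house grows from 10 to 11.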

    Of course this sounds ‘dodgy’ to people who are disinclined to trust Democratic Institutions and/or arithmetic — which demographic aligns rather closely with supporters of the aforementioned “Parties of a certain colour”.

    a lack of common interest

    The whole awkward arrangement reflects the embarrassment of our Colonial set-up: a (deliberately) misleading translation of our founding Treaty never ceded ‘Sovereignty’ from Māori to The Crown, so the non-Māori continue to be squatters/illegal immigrants of a sort.

  531. Lars Mathiesen (he/him/his) says

    FWIW, the Danish system of proportional representation adds a small number of seats to each region according to their area, to make sure the Danes (Zealanders) don’t outbreed the Jutes. This can in theory break the arithmetic for extra allocations per party, in which case more members of Parliament are added like in NZ.

    (This hasn’t happened yet, but somebody got to spend X time implementing the rules, probably in ’58 COBOL).

    There used to be more edge cases involving FPTP single-candidate districts, but those are now amalgamated into 2-5 candidate local districts making it not impossible, but very unlikely, for that to happen. (If the numerically second-largest party were to get N-1 seats from a lot of those local districts, they might have more seats before proportionality than they were “supposed” to and we’d end up with an embiggened Parliament. There are multiple multi-page documents on the home page of the Ministry of the Interior setting out the rules in all their horrorglory, with worked numerical examples).

  532. David Marjanović says
  533. Trond Engen says

    I’m still digesting the papers.

    The Cis-Baikal population could just as reasonably be called the Angara people. If the similarity to a hypothetic source population of Native Americans means that they are related to people from further east, from before the formation of Ymyyakhtakh, their ancestors might have spread up the Lena and across the ridge to the greater river. That would allow for a Dene-Yeniseian homeland in central Yakutia. But I don’t know enough Siberian archaeology to know if this makes sense.

    As for knowing too little, I learned from the maps that there’s something called Samus-Kizhirovo that is contemporary with Seima-Turbino. The authors seem confident that the two can be joined without further explanation.

    Finally, I’m not sure the dates are clear enough to tell which came first, Seima-Turbino or Sintashta, or if the two really are two sides of the same coin, maybe with slightly different specializations.

  534. David Marjanović says

    Dene-Yeniseian homeland

    This article consists of a paper by G. Starostin being very skeptical of Dene and Yeniseian being each other’s closest relatives, followed by a reply from Vajda that pretty much agrees.

  535. David Eddyshaw says

    Interesting paper.

    That said, the typological argument on its own hardly means anything from the genetic point of view if the actual morphemes that occupy the morphological slots cannot be shown to share a common etymological origin in sound and meaning.

    Preach it, brother! (In particular, preach it to the “Niger-Kordofanian” fans.)

    I like Vajda’s conclusion (in which, it seems to me, he implicitly acknowledges that patient descriptive and bottom-up work is the only really secure foundation of comparative studies):

    “Maybe the best way to demonstrate a language family is not to try so hard.”

  536. Stu Clayton says

    “Maybe the best way to demonstrate a language family is not to try so hard.”

    Hm. Not trying hard at all is a prominent feature of the Chomskian approach, is it not ? No describing, just top-down aerobics.

  537. On Facebook, people mentioned V. Napolskikh of Tatarstan Institute of Archaeology as their go-to popular Uralic antiquity expert. At a first glance his 2015 book primarily capitalizes on the research by others, but may still be interesting, so here is a link to its pdf:
    https://core.ac.uk/download/pdf/235149815.pdf

  538. Samus-Kizhirovo can be searched for in Russian as самусь-кижировская культура. My first impression is that the term came into circulation relatively recently, but Wikipedia states that Самусь IV (Samus-IV) on the Tom’ river, 20 km downstream from Tomsk, was excavated in the 1950s (and published by the late 1960s) and again in the 1990s. Kizhirov Island is on the Tom’ river by the confluence with Samus’ka river.
    https://ru.wikipedia.org/wiki/%D0%A1%D0%B0%D0%BC%D1%83%D1%81%D1%8C_(%D0%BF%D0%BE%D1%81%D1%91%D0%BB%D0%BE%D0%BA)
    Samus’ represents the earliest Bronze Age on the Tom’ (XVII-XIV c. BCE), with intrusive technologically skilled populations supplementing local hunter-gatherers. The discoverers of the site concluded that the cultural influences were spreading to the North, that the ores were procured in the Altai mountains, probably traded for furs. The ornaments and sacred symbols of the Samus’ are quite famous locally in Tomsk, but probably not widely popularized elsewhere. See
    https://core.ac.uk/download/pdf/287437407.pdf
    and quite a lot of images online

  539. Trond Engen says

    David M.: This article consists of a paper by G. Starostin being very skeptical of Dene and Yeniseian being each other’s closest relatives, followed by a reply from Vajda that pretty much agrees.

    I remember the Starostin paper — we’ve probably discussed it here before — and I’ve been waiting for new developments ever since. IIRC, Starostin points out real weaknesses, which are acknowledged by Vajda, but I didn’t find his argument for a larger Dene-Caucasian family very strong. With no other language families included in the hypothesis, as in Vajda’s work, I think it’s perfectly fine to name the hypothetical macro-family Dene-Yeniseian. If more families are included eventually, a new name can be assigned.

    I agree that Na-Dene and Yeniseian are very distant relatives (if related at all) — they must be, at that time depth, regardless of whether any other relatives exist or ever existed — but their common ancestor would still have to have been located somewhere at some time and spoken by somebody. With the linguistic evidence inconclusive, a lack of clear archaeological and genetic paths between them weakens the hypothesis, while, OTOH, a plausible path would mean that the hypothesis can’t be ruled out on archaeological grounds alone. And even if Dene-Yeniseian is a mirage, there were several waves of migration into America, and you have to pull in whatever thread you can to untangle the origins. But I should have said “That would allow for a hypothetical Dene-Yeniseian homeland in central Yakutia.”

    @Dmitry: Thanks. I’ll try my best to read them.

  540. David Marjanović says

    Me in July:

    the Black Sea.

    Most of it has actual ocean floor and is some 3000 m deep, IIRC.

    Correction: 2000 m. Map in Romanian.

    I didn’t find his argument for a larger Dene-Caucasian family very strong

    Most of that work is his father’s, accessible in semi-published form here along with Bengtson’s paper on the comparative grammar that we’ve also discussed before.

  541. “Crimeea” looks very funny.

  542. David Marjanović says

    These are people who write ydeea, so I was prepared…

  543. Dmitry Pruss says

    The “Dene-Yeniseian” observation in the genomic paper is very tentative, but may be worth mentioning. They tried modeling various ancient Native American genomes with ancient NE Asian sources and the models never required any proto-Yeniseian / Cisbaikalian sources to fit, with one major exception. Ancient Athabaskan DNA wouldn’t come out right if Cisbaikalian wasn’t added to the mix. The authors don’t construe it as anything on the level of “proof” of Dene-Yeniseian, but note that at the current state of genomic knowledge of ancient DNA, the hypothesis can’t be ruled out and merits more investigation.

  544. Dmitry Pruss says

    The significance of Samus’ seems to be two-fold.
    – Tatarka-Hill is considered to be a Samus’ site, although typical Samus’ sites slightly postdate typical Seima-Turbino locations (and the paper posits that the migration, and mixing, of Seima-Turbino started from the earliest Eastern sites like Tatarka-Hill and continued until the Western-most studied sites like Satyga).
    – Samus’ technologies are also associated with later Bronze Age cultures like Cherkaskul just East of the Urals (probably already discussed due to possible connections to proto-Finno-Ugric) and Iron Age cultures such as Anan’ino / Ananyino. If these connections are true, then it is likely that Seima-Turbino was an early one of two or more successive waves of Uralic migrations out of the Ob-Yenisei interfluve, but not THE migration wave which succeeded in creating proto-Finno-Ugrics (that would have to wait until another spurt of similar migrations). So Samus’ would be close to the root but Seima-Turbino more of an exotic side branch…

  545. Trond Engen says

    Thanks again.

    I’m impressed by the graphics, but I didn’t figure out before yesterday how the “Figure Legends” and “Extended Data Figure Legends” relate to the graphics… And it didn’t strike me until this morning that I should look up the supplementary materials. Unfocused, I am.

    Anyway, here’s the doi link to the paper:
    Tian Chen Zeng et al: Postglacial genomes from foragers across Northern Eurasia reveal prehistoric mobility associated with the spread of the Uralic and Yeniseian languages, preprint
    There’s plenty of supplementary data, and I see that there’s already a new version of the pdf since Dmitry’s link. On first glance there’s not much difference, except that the graphics are in a different order.

    A few comments based on the first version and before diving into the supplementaries:

    On page 39 they highlight Cis-Baikalian and Yakutia_LNBA ancestry in modern-day populations with orange and grey stars over the admixture plots, but there’s no visible correspondence to anything in the plots. The map is beautiful, but I would also wish for color coding of the dots for linguistic affiliation.

    I really can’t find support for “the Dene-Yeniseian observation”. That doesn’t mean it isn’t there, but it’s not shown in a way I can understand.

    The doi to the other paper:
    Ainash Childebayeva et al: Bronze Age Northern Eurasian Genetics in the Context of Development of Metallurgy and Siberian Ancestry, preprint

  546. Dmitry Pruss says

    Verbatim:

    Yeniseian languages are related to the Na-Dene languages in North America under the Dene-Yeniseian hypothesis [76]. We investigated this connection by using qpAdm to distinguish sources of APS ancestry in ancient (<4kya) Siberian and American Arctic groups that have been connected to present-day Yukaghirs, Chukotko-Kamchatkans, Eskimo-Aleuts, and Athabaskans (SI Section VI.B). We find strong evidence that all such ancient groups show at least partial descent from Paleo-Eskimo-related populations (represented by Greenland_Saqqaq.SG), and by extension Syalakh-Belkachi and other “Route 2” populations, except Athabaskans from ~1.1kya (SI VI.B.iiii-iv), consistent with some previous inferences [19,30,31], and contradicting other work, including from our own team [77]. We instead find weak evidence that ancient Athabaskans may require a small quantity of ancestry from a population related to Cisbaikal_LNBA—genetic evidence for the linguistic hypothesis of a distinctive link between Yeniseian and Na-Dene languages. This suggestive result awaits corroboration with further sampling and more sensitive analytic methods.

  547. Trond Engen says

    Yes, I did see that, and I believe them.

    We instead find weak evidence that ancient Athabaskans may require a small quantity of ancestry from a population related to Cisbaikal_LNBA—genetic evidence for the linguistic hypothesis of a distinctive link between Yeniseian and Na-Dene languages.

    … is what made me write:

    The Cis-Baikalians are a surprise to me. Not as ancestors of the Yeniseians, especially, but as close kin of the ancestors of Ancestral North Americans.

    Upon which you replied:

    It isn’t so much about the homeland of American peoples, but about the similar principal ancestral streams which gave rise to both the Cisbaikalians and some early Americans. But the sources are much further East. What they call “Inland NE Asians” is exemplified by Yumin (8.4 kya from Inner Mongolia, only about 400 km from Yellow Sea). I assume that some Yumin-like peoples traveled West and others, North-East.

    If the source specific for Athabaskans is an ancestral stream related to the Yumin people and not especially close to Cisbaikalians, I would have expected the paper to say so, so I assume it’s more complicated. But how complicated? If the source of the shared stream were somewhere around the Altai or Yenisei, it could get to America along the coast, but if it’s eastern, I don’t understand how it got into Cisbaikalian and Athabaskan without also showing up in Yakutia LNBA. I wish I could see this hint of an ancestral stream into Cisbaikalian and Athabaskan for myself, i.e. digested and presented in colorful graphics. As it is, I’m just confused about what it means.

  548. Trond Engen says

    And of course, this is not a gripe with you or your explanations, just me being frustrated with not being able to find support or substantiation within the paper. I guess I’m complaining that even more effort hasn’t been put into making an obviously extremely important paper accessible. But in fairness, this is a preprint, and we’ve seen before that the presentation is improved in the final version.

  549. I’ve been waiting for new development [in Dene-Yeniseian/Caucasian] ever since.

    Good timing: there’s a new exploratory lexicostatistic paper from an A. Kozintsev out just a few days ago. Yeniseian and Burushaski end up notably close, as already in the earlier discussion. There’s also some discussion about Basque appearing particularly close to Nakh, or the possibility of breaking up Na-Dene, but surely those have got to be noise…

    My main takeaway so far is however that, even if we accept proponents’ own claims about cognacy, Sino-Tibetan is the most tenuous member of the full Dene-Caucasian and looks to be connected more thru specifically Sinitic (+ Tujia, interestingly) than organically as a whole. What I’ve seen of the base data suggests a similar issue there too. ST is big enough that it’s easy to cherrypick a few forms that happen to resemble the Moscow School Northeast Caucasian reconstruction (the linchpin of the theory), and then assume that they’re more archaic in their shape and other, more different ST cognates aren’t. Basque, Burushaski and Yeniseian are small enough that this cannot be done and hence there’s more reason to think their resemblances with North(east) Caucasian cannot be just coincidence. Conveniently, Sino-Tibetan is also a family that already has other competing macro-relationship hypotheses, e.g. Sino-Austronesian, which could some day be pitted against Sino-Dene to see which works better.

    (I don’t really have any opinion so far myself on how well Na-Dene or Dene-Yeniseian fits in this overall picture — if it pulls Yeniseian away or helps push it closer; should develop that understanding some day though.)

  550. @JP: I don’t see any value in Kozintsev’s paper. How can you do lexicostatistics without demonstrated cognates? He quotes Starostin et al.’s database, but that only shows cognates in accepted families. I can’t tell how he derives any of his cognates, and he gives no examples. Right now even a single historically-related word shared between any of these groups would be news.

    Moreover, and to be unnecessarily nit-picky, distance methods are a crude approach to such delicate work. The NeighborNet diagrams, in particular, look like what you’d expect from a bunch of languages in internally clear but externally unrelated families.

  551. He’s following along, I think, with the Dene-Caucasian cognates already asserted by Starostin & co. (not in the business of asserting new ones at least), though yes the sourcing on that fact looks vague: they have a separate database for them from the one for Swadesh lists.

    NeighborNets of a family dividing into clear smaller units often look about the same too. Distance methods are crude but they might still give hints about what should be prioritized for a closer look, one hopes.
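
    For readers unfamiliar with what a "distance method" does in this context, here is a minimal sketch of how a lexicostatistical distance matrix is built from cognate judgments. It is not Kozintsev's procedure, only the generic idea; the language names, meanings and cognate-class numbers are invented toy data, and the function name is hypothetical. Note that the cognate classes are an input that someone has to assert first, which is the objection raised in the comments above.

```python
from itertools import combinations

# Toy data (invented for illustration): for each language,
# meaning -> cognate class label. Same label = judged cognate.
cognate_sets = {
    "LangA": {"hand": 1, "water": 1, "stone": 1, "fire": 1},
    "LangB": {"hand": 1, "water": 1, "stone": 2, "fire": 2},
    "LangC": {"hand": 3, "water": 4, "stone": 5, "fire": 6},
}


def lexicostat_distance(a, b):
    """Return 1 minus the share of jointly attested meanings judged cognate."""
    shared_meanings = [m for m in a if m in b]
    if not shared_meanings:
        raise ValueError("no meanings attested in both lists")
    matches = sum(1 for m in shared_meanings if a[m] == b[m])
    return 1.0 - matches / len(shared_meanings)


# Pairwise distance matrix, stored as a dict keyed by language pair.
distances = {
    (x, y): lexicostat_distance(cognate_sets[x], cognate_sets[y])
    for x, y in combinations(sorted(cognate_sets), 2)
}

for pair, d in distances.items():
    print(pair, round(d, 2))
```

    A matrix like this is what NeighborNet or a tree-building method takes as input, so the resulting diagram can be no better than the cognate judgments behind it.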

  552. Dmitry Pruss says

    At the ASHG today: the ancient Turks in their Altai cradle were largely composed of the local Bulan-Koby horsemen and blacksmiths, while the legendary Ashina clan, exiled to Altai after losing a fight with the Rouran in 460 CE, contributed only marginally to the early Turk genomes (which include the Avars). Many contemporary Taiga Turkic groups (such as the Tubalar, the Shor, and the Zabolotye Tatars) still carry overwhelming fractions of Bulan-Koby DNA, while Steppe Turkic peoples picked up a variety of other components over time.
    The Bulan-Koby culture seems to be that of the legendary Tele people (who, together with the first Turks, finally had their revenge on the Rouran in 551 CE), but much of Tele’s homeland was further South-East in today’s Mongolia and the question remains somewhat open.

    Bulan-Koby (II-V c. CE) immediately postdate the Pazyryk “Scythians” after the defeat of the latter by the Huns and/or their auxiliaries, and fall into the spectrum of “Sarmato-Hunnic” cultures, but their material culture has a huge imprint of Northern Xiongnu (and, a little bit, of Iranic Kushan-Yuezhi) and relatively little continuity with the Pazyryk, making some archaeologists conclude that Bulan-Koby were merely a branch of Xiongnu. But now, their DNA is predominantly Pazyryk, with an added Cisbaikalian Late Neolithic / Bronze Age layer. So the defeated Pazyryk may have surrendered much of its culture but contributed a lot of DNA to Bulan-Koby, and, in turn, to the Turks.

    1. https://twitter.com/DanTabin/status/1719933938355834981
    2. https://www.cambridge.org/…/5720C1A241C0646FB05C3895AA1…
    3. https://www.sciencedirect.com/…/abs/pii/S2352226718300047

  553. Dmitry Pruss says
  554. David Marjanović says

    the early Turk genomes (which include the Avars)

    Oh, interesting.

  555. Dmitry Pruss says

    @DM – rereading the text going with Fig. 2, I see that I oversimplified it about the Avars. By broader similarities of DNA fragments, Bulan-Koby formed an interconnected group with Turks and Mongols. Stronger similarities were observed in a tighter interconnected group which consisted of Bulan-Koby, early Turkic peoples, and the Avars.

    But the genomes of the latter didn’t “consist” of similar DNA (as was my first incorrect impression), they just possessed large amounts of similar DNA fragments.

    Probably the Eastern ancestors of the Bulan-Koby were also among the ancestors of the Avars.

  556. Trond Engen says

    … and suddenly the geneticists have become able to locate exact source populations for historical ethno-linguistic groups. A few weeks ago it was the diagnostic male line in the spread of Uralic. Here we have the source of Western ancestry in early Turks.

    The variation between Saka groups is also interesting, a very similar Sintashta-BMAC mix and different local substrates. We knew that already, I guess, but now I’m going to expect that each ancestral element soon will be figured out with a location, a date, and a gender profile.

  557. David Eddyshaw says

    And a favourite colour.

  558. David Marjanović says

    All of that becomes harder in cultures that burn their dead.

    But not impossible…

  559. Warm climates also make it harder. And so do contemporary traditional proscriptions of the destruction of the bones. But the preprint of the Judah / Israel ancient DNA from the period of the destruction of the 1st Temple shows that sometimes, the difficulties can be overcome.
    The looted burial cave in Kiryat Yearim, by the time the salvage archaeologists descended on it, contained just a tiny corner of the cave still saved from destruction, and a heap of displaced bones. No direct signs of Jewish identity, so the ancient DNA extraction was greenlighted. But the pottery is typical Iron IIB / some Iron IIC, the late First Temple period of Judah. And the carbon dates, while imprecise, are in the 500s-600s BCE.
    The authors hoped to unveil the first data at a special event in Israel, but the war intervened. The first preprint on mtDNA just came out; they also reported one Y-chromosome, and are pushing ahead with the whole genomes.
    The initial impression is that the Judah people had a strong continuity with the earlier Canaanites.
    https://www.researchgate.net/profile/Arie-Shaus/publication/375004626_Ancient_Mitochondrial_DNA_Analysis_of_an_Iron_II_Burial_Cave_on_the_Slope_of_Tel_Kiriath-Yearim/links/653b5b7424bbe32d9a72187b/Ancient-Mitochondrial-DNA-Analysis-of-an-Iron-II-Burial-Cave-on-the-Slope-of-Tel-Kiriath-Yearim.pdf
    https://www.haaretz.com/archaeology/2023-10-09/ty-article/in-first-archaeologists-extract-dna-of-ancient-israelites/0000018b-138a-d2fc-a59f-d39b21fd0000

    As to the pre-Christian Slavs and their cremation burials, I hope that we’ll get answers from the occasional tradition-defying burials (such as mercenaries killed in foreign battles) and from the mixed-ancestry neighbors. Proto-Slavic connections have already been drawn from the dead of the Himera battle on Sicily, and from the Gothic (but likely part-proto-Slavic) cemetery in today’s Eastern Poland.

  560. And there is another investigation relying on sharing of short DNA segments to reveal the roots of ancient peoples. Nothing but an abstract at this moment, presented at an archaeology symposium in Poland.

    “Genetic identification of Slavs in Migration Period Europe using an IBD sharing graph”

    Concludes that the earliest Slavs were very close to Late Bronze / early Iron Age Baltics, but quickly picked up additional East Germanic and Scythian admixtures during their formative period.
    https://www.archeologia.uw.edu.pl/archeologia-i-numizmatyka-europy-wschodniej-2/
    The talk apparently mentioned “Scythian Farmers” as the implied source of the gene flow, which may be something like Belsk, Ukraine?
    https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0245996

  561. Trond Engen says

    Nice graphics. I should learn how to read these new clustering graphs.

  562. Dmitry Pruss says

    @Trond, there are unique strengths and weaknesses of the IBD methods (identity-by-descent, where segments of DNA are the same in two individuals because of a shared ancestor a long time ago).
    Segments of DNA, containing multiple variable sites, are much more diverse than individual variable positions, and therefore can pinpoint the origin of the ancestors with a vastly higher precision than one can tell from the same set of variations but considered separately, without taking into account their linkage to one another within a segment. That’s how a modern consumer DNA test may tell you which province, district, and even town your ancestors were from – and that’s in Europe, which is so remarkably homogeneous by DNA that few DNA variants appreciably vary in frequency even between the continent’s far-flung corners. Or sometimes the lack of geographic precision is also telling, like about 6 to 8% of my own DNA is marked as “Finnish” by its spectrum of individual SNPs, but doesn’t match ANY region of Finland by IBD (of course I know very well that my great grandmother Pelagea wasn’t from Finland, but rather, from Russia’s North, where the Russian-speaking population was predominantly of Finno-Ugric ancestry by their DNA).
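
    To put a rough number on why a segment is so much more informative than a single site, here is a toy calculation. It is a sketch only, not anything from the papers or from any testing company's pipeline. It assumes, unrealistically, that the variable sites in a segment are independent of one another and that the allele frequencies are known; the function names and the 20-site example are made up for illustration.

```python
from math import prod


def chance_match_single_site(p):
    """Probability that two random chromosomes carry the same allele at one
    biallelic site whose first allele has frequency p."""
    return p * p + (1.0 - p) * (1.0 - p)


def chance_match_segment(freqs):
    """Probability that two random chromosomes agree at every site of a
    segment, under the simplifying assumption that sites are independent."""
    return prod(chance_match_single_site(p) for p in freqs)


# Hypothetical segment: 20 variable sites, each with allele frequency 0.5.
segment_freqs = [0.5] * 20

single = chance_match_single_site(0.5)
segment = chance_match_segment(segment_freqs)

print("one site:", single)
print("whole segment:", segment)
```

    Two unrelated people agree at any one such site half the time, so a single match says almost nothing; agreeing across the whole 20-site segment by chance is about a one-in-a-million event under these assumptions, which is why a shared segment points to a shared ancestor.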
    But IBD is also remarkably non-quantitative. Only a fraction of the DNA segments are highly informative about their origins, and it differs from location to location. On an individual descendant level, only a fraction of this fraction is actually inherited, and it’s highly variable too. When one compares a large population of descendants with a large population of ancestors, then these variabilities can be averaged and controlle