The Tocharian Trek.

August 14, 2023 by languagehat 226 Comments

As I said a few years ago, Tocharian is one of the Indo-European languages I’ve found most intriguing; now Ali Jones at Phys.org writes about a very promising project called TheTocharianTrek:

The research is helping to pin down where the Tocharians were located in the period between 3,500 BC, when they may have left their ancestral home, and their first written history in 400 AD. In sum, the initiative is mapping the migration route from the PIE homeland all the way to China.

Through the journey, the Tocharians brought their dialect of PIE into contact with people speaking different languages. This influenced and changed the way the Tocharians spoke until finally their recorded languages evolved. Archaeological and genetic evidence suggests that the Tocharians first moved to southern Siberia.

[Professor Michaël] Peyrot and his research colleagues have sought to provide a linguistic assessment of this route. Their work reveals that, indeed, some of the quirkiest features of the language fit very well with tongues spoken in southern Siberia.

“Languages preserve precious information about their prehistory through the effects of language contact,” said Peyrot. “Observing the effects of language contact, such as borrowed words, enables us to draw conclusions about the proximity of the speakers of different languages and at which point in time the contact took place.” As an example of a borrowed word, he cited a term for sword in a language strand known as Tocharian B: “kertte” was taken from “karta” in Old Iranian.

The research team has concluded that the Tocharians arrived in the Tarim Basin in around 1,000 BC—later than was previously thought. As a result, their window of influence in the Tarim Basin has narrowed and the Tocharians are being assigned a more muted role in the prehistory of the area than they have traditionally been given.

Instead, the project has found a strengthened role for Iranian languages and peoples in the area, especially Khotanese, its relative Tumshuqese and Niya Prakrit. All influenced Tocharian.

The project is also piecing together which languages left the PIE community first and when. As their work enters its final phase, the researchers agree with the theory that the Tocharians may well have left the PIE family second and certainly well after the Anatolians, a group of ancient languages once spoken in present-day Turkey.

The piece goes on to discuss weather terminology and says:

The ultimate goal is to create an atlas that maps where the words were used and when. The completed atlas is due to be available on the university’s website beginning in late 2023.

Exciting stuff — thanks, Dmitry!

Comments

Y says

August 14, 2023 at 5:32 pm

This study by Peyrot, from 2019, argues for contact between Proto-Tocharian and Proto-Uralic, on typological grounds. I would be curious to know what Uralists think about it.
J Pystynen says

August 14, 2023 at 7:45 pm

The general gist of Uralic influence seems to hold up; I haven’t seen anyone outright question it, though many details remain debated.

Tocharian loanwords in Samoyedic remain probably underresearched; there’s been a decent number of recent proposals from TTT member Abel Warries and I’ve got a handful of unpublished finds myself around too. So far no finds of loanwords in the opposite direction though, interestingly enough! but maybe more lexicographic work will help with that.

Something getting from Tocharian even into Ugric (or into Hungarian, Mansi, Khanty separately) could be possible, has really not been checked systematically by anyone; there seems to be nothing already into Proto-Uralic though.
Trond Engen says

August 15, 2023 at 6:08 am

@Y: Thanks!

The uralicist hath spoken. I’ll just for the sake of discussion say that it’s interesting, but it’s based on internal reconstruction and should be backed up by other evidence. Another thought is that this looks like a Sprachbund with Yukaghir as a more peripheral member. One objection would be that it’s hard to imagine especially Yeniseian going through a Sprachbund phase without clear traces of syntactic borrowing or morphological leveling.
Christopher Culver says

August 15, 2023 at 7:21 am

I vaguely recall some scholarship which suggests that Yeniseian was spoken considerably further to the south in prehistory. In that case, there would be no need for it to be immediately adjacent to pre-Tocharian and pre-Samoyedic during their time of contact.
David Marjanović says

August 15, 2023 at 10:29 am

argues for contact between Proto-Tocharian and Proto-Uralic

No, between Pre-Tocharian and “an early form of Samoyedic”. (Open access! I read the paper a few years ago and recommend it.)

One objection would be that it’s hard to imagine especially Yeniseian going through a Sprachbund phase without clear traces of syntactic borrowing or morphological leveling.

When the grammar is just too different, such things may not happen. Basque and the surrounding Romance languages have basically been spending the last 2000 years approaching their sound systems to each other’s. Grammar? Nope. OK, Basque has been emphasizing “and” (ta) over the comitative case in the last 3 or so centuries, but if that’s all…

I vaguely recall some scholarship which suggests that Yeniseian was spoken considerably further to the south in prehistory.

I haven’t seen that outside of attempts to tie to, specifically, Burushaski and often also to the archaeological Karasuk culture.
Trond Engen says

August 15, 2023 at 12:21 pm

David M.: When the grammar is just too different, such things may not happen. Basque and the surrounding Romance languages have basically been spending the last 2000 years approaching their sound systems to each other’s. Grammar? Nope.

Point. It also occurred to me that if Yeniseian instead is a common substrate, carrying the sound system over would be expected, bringing morphology along well nigh impossible.
Trond Engen says

August 16, 2023 at 4:39 pm

More on Afanasievo and Tocharians, starting with the comments:

Indo-European and the Yamnaya
They Perished Like Avars (a comment I meant to leave in another thread)
Son of Yamnaya (obviously)
Trond Engen says

August 16, 2023 at 5:48 pm

A study from last year that I hadn’t seen:

Kumar et al: Bronze and Iron Age population movements underlie Xinjiang population history, Science (2022)

Abstract
The Xinjiang region in northwest China is a historically important geographical passage between East and West Eurasia. By sequencing 201 ancient genomes from 39 archaeological sites, we clarify the complex demographic history of this region. Bronze Age Xinjiang populations are characterized by four major ancestries related to Early Bronze Age cultures from the central and eastern Steppe, Central Asian, and Tarim Basin regions. Admixtures between Middle and Late Bronze Age Steppe cultures continued during the Late Bronze and Iron Ages, along with an inflow of East and Central Asian ancestry. Historical era populations show similar admixed and diverse ancestries as those of present-day Xinjiang populations. These results document the influence that East and West Eurasian populations have had over time in the different regions of Xinjiang.

[…]

Discussion
Beginning with ancestral sources traced to indigenous ANE-derived and BA western and eastern Steppe populations, Xinjiang population structure can be characterized by waves of incoming gene flow and admixture from surrounding populations adding to the extant ancestry. The BA Xinjiang region contained four major ancestries, which included Tarim_EMBA1 (Xinj_BA1_TMBA1) (16), Afanasievo (Xinj_BA5_oAfan), Northeast Asian (Xinj_BA7_oEA), and BMAC (Xinj_BA6_aBMAC), with Tarim_EMBA1 possibly being indigenous to the region given its presence among diverse BA individuals (Fig. 3B). The Mongolian Chemurchek inhabitants near the Altai region were further linked to Xinjiang through the Chemurchek culture of BA northern Xinjiang, demonstrating the BA movement of people across the Altai region. Thus, we not only find support for both the Steppe and Bactrian oasis hypotheses (5, 13–15), but the identification of additional ancestries suggests further complexity of the EBA populations in Xinjiang. Additional sampling of pre-BA and EBA populations will be necessary to further characterize the succession of ancestries established in Xinjiang during this time period.

Later, in the LBA, ancestry present in the Central Asian BMAC populations becomes more pronounced, which was likely to have entered Xinjiang through the IAMC route along with the Steppe_MLBA populations, such as the Andronovo, Sintashta, and Dali (Botai related ancestry) populations (Fig. 3B). The entrance of Steppe MLBA (~3900 B.P.) into Xinjiang correlates with the arrival of the eastern Fedorovo subculture of the Andronovo (~3750 to 3500 B.P.) from the Tianshan Mountains (39). The IA is marked by an increase in movement and admixture of Steppe, Central Asian, and East Asian people into the Xinjiang region. The IA oversaw a continuation of Steppe_MLBA ancestry with greater genetic affinity to Central Asian populations containing BMAC ancestry. We also observed ancestries derived from South Asian Hunter Gatherer (Onge) in the LBA and IA, which suggests the movement of populations either from Central Asia already carrying this ancestry or from the Indus periphery region through the Pamir mountain regions into southern Xinjiang (11). Concurrently, an IA influx of East Asian ancestry from the eastern Steppe of present-day Mongolia is also observed, which may be tied to the westward expansion of the Pazyryk Xiongnu into Xinjiang. These admixed ancestries related to Steppe, East Asian, and Central Asian people established in the IA have been maintained since that time and are still prevalent in both the HE and present-day Xinjiang, linking the past with present-day populations. Whereas aspects of this reconstructed population history find support in the archaeological record, several insights can be gained by comparing the newly generated genomic data with previous archaeological and historical evidence.

First, although the diffusion of culture is not always accompanied by population movements (32), we observed an overall concordance between the two in Xinjiang populations. For example, the major genetic influences present at the earliest settlements of north and west Xinjiang populations can be related to the coexistence of people with different cultural backgrounds—e.g., Afanasievo, Chemurchek, and Okunevo (Fig. 3A and supplementary text). Also, the shift in population ancestries can be associated with proposed population movements. Specifically, the Afanasievo-related ancestry in BA individuals is consistent with the concurrent appearance of the Yamnaya culture in the Altai-Sayan region, and the Steppe_MLBA ancestry in LBA individuals can be attributed to the expansion of the Steppe_MLBA culture into Xinjiang (see the supplementary text, which provides archaeological backgrounds). In the IA, genetic affinities with nomadic groups such as the Saka reveal the widespread presence of these groups across the entire Steppe region. Overall, we detect Tarim_EMBA ancestry surviving into the IA and HE, which implies the cohabitation of populations descended from Steppe EMBA, BMAC, and local Tarim_EMBA people in Xinjiang (Fig. 3B). These findings also give genomic evidence for the broad demographic processes underlying the spread of numerous languages in Xinjiang that have survived into historical times, such as the introduction of Tocharian languages by populations associated with the Afanasievo culture. An increasing mobility and movement of the Sakas in the IA and the establishments of the Saka states lasting into the HE aided in the expansion of Indo-Iranian languages, such as Khotanese, in Xinjiang as well.

Despite the widespread population movements documented here, the degree of genetic continuity that has been maintained in Xinjiang over the past 5000 years is noteworthy. Although genetic continuity has been observed in isolated environments or regions with relatively high cultural homogeneity, such as Northeast Asia (40), dynamic interactions between populations with diverse ancestries and cultures are more likely to result in major population shifts and turnover, such as in the Oceania archipelago and Europe (28, 41). However, this has not been the case for Xinjiang populations, where at least two different instances of genetic continuity are observed. The first is the genetic continuity (Steppe ancestry) from BA individuals to LBA and IA individuals, which represents a case in which a core Steppe ancestry has been maintained despite the addition of an extensive influx of diverse ancestries. The second case is the stability of Xinjiang population diversity from the HE to the present day, despite the turmoil of successive external ruling powers over the past 2000 years. A major reason may be that this mixed ancestry was prevalent not just in Xinjiang but throughout Central Asia, so dynamic population movements would not result in major genetic shifts. These findings indicate that genetic and archaeological evidence can provide distinct yet complementary insights into population history. This, in turn, further emphasizes the importance of a multidisciplinary approach to uncover the complex histories of regions like Xinjiang, where persistent interactions between various populations and cultures occurred.

The paper isn’t easy to follow, or trawl for details on single elements, but it seems that the Afanasievo, the Okunevo (who were their neighbours up north in the Minusinsk Basin) and the Chemurchek (who took up nomadic herding on the Western Mongolian Plateau after learning animal husbandry from them) may have formed an economic and political community, with the Afanasievo as an initially dominant minority element. This multi-culture could well have become linguistically Afanasievan (Indo-European) with an Okunevan (e.g. Yeniseian) substrate. They spread out in a wide region around the Altai searching for pastures and presumably copper ore. We find them in the Early Bronze Age in the fertile Ili Valley in Western Sinkiang. With the advent of the Iron Age about 1000 BCE they seem to have been replaced or subdued by the Saka, and I don’t know where they went after that, but they leave written documents along the northern rim of the Tarim basin, just across the Tian-Shan range, from around 400 CE.
Trond Engen says

August 16, 2023 at 7:17 pm

I learn that recent archaeology in the Ili valley divides the metal age into an Andronovo era (c. 1900-1000 BCE) and a Saka era (after c. 1000 BCE). That might weaken my assertion about continuity from the early Bronze Age, but I’ll note that there’s some debate (as always) over whether the Andronovo phase is echt Andronovo or a local development influenced by it. It may be more scathing that there’s no evidence of metal work older than c. 1900 BCE. But absence of evidence etc.

Zhi, Festa: Archaeological Research in the Ili Region: A Review, Asian Perspective (2020)
Wang et al: Copper metallurgy in prehistoric upper Ili Valley Xinjiang China, Archaeological and Anthropological Sciences (2019)
David Marjanović says

August 17, 2023 at 7:10 am

The paper isn’t easy to follow, or trawl for details on single elements

On the upside, it’s in open access, unlike the last few Science papers in this thread!
David Marjanović says

August 17, 2023 at 9:37 am

Remarkable amounts of Anatolian Farmer ancestry – up to 43%.
Trond Engen says

August 17, 2023 at 3:08 pm

Yes, in some. I don’t have the paper in front of me, but I think those are believed to be Sintashta (or maybe Andronovo) Indo-Iranians. I gave up trying to unite all data for each individual.
Trond Engen says

August 18, 2023 at 7:25 am

Me: We find them in the Early Bronze Age in the fertile Ili Valley in Western Sinkiang. With the advent of the Iron Age about 1000 BCE they seem to have been replaced or subdued by the Saka, and I don’t know where they went after that,

No, that’s not it. Except for this single skeleton which they say is unadmixed Afanasievo, the other finds that are archaeologically Afanasievo or Chemurchek are concentrated up north. These cluster together in the PCA plot. I’m confused because the “pure” specimen is rather like Sintashta. But in the admixture bars they all look the same. I guess there are things going on that aren’t shown in the plots.

Looking at the PCA plots I’m leaning towards the old suggestion that the Tocharians are the Wusun. They show up at the eastern end of the Silk Road, together with the Yueshi, a couple of centuries BCE, when the Han start pitting the steppe peoples up against each other. Maybe this is the old alliance of Afanasievo and Chemurchek?

Anyhow, the Wusun and the Yueshi start fighting each other, and then moving westwards in turns under pressure from the Xiongnu. The Yueshi first took the Ili Valley from the Saka. Then the Wusun took it from the Yueshi. The Yueshi then went on to Bactria and founded empires, while the Wusun stayed and eventually were allowed to form a sort of buffer state between the Xiongnu and Han China, sinking gradually into obscurity. This is exactly the period when the Tocharian languages are attested.
languagehat says

August 18, 2023 at 7:53 am

This Wikipedia page has a “Chemurchek culture and contemporary cultures and polities” map, useful for those of us who have a hard time keeping in mind which culture was where. (Can’t link to the map directly — all the labels disappear.)
Trond Engen says

August 18, 2023 at 9:10 am

Or maybe the Tocharians are both the Yueshi and the Wusun, if the fallout and breakup were messy, if either A or B is a Yueshi stay-behind group, or if either represents the local delegation of the Kushan empire.
Trond Engen says

August 18, 2023 at 12:00 pm

That history of the Wusun and Yuezhi* peoples is just my summary of (mostly) Wikipedia. Ideally, this is where John Emerson would join in with all the Yuezhi details from Chinese, Mongolian, and Tibetan sources.

* Did I write ‘Yueshi’ up there? So I did. I devoice on your lousy Pinyin!
David Marjanović says

August 18, 2023 at 1:34 pm

Aren’t the Yuezhi more likely the Actual Tocharians, the ones who wrote an Iranian language in the Unknown Kharoṣṭhī script?

unlike the last few Science papers in this thread!

…in… the… Son of Yamnaya thread.
Trond Engen says

August 18, 2023 at 2:04 pm

The Kushan Script Deciphered this July. I was just pondering how to think about that.
David Marjanović says

August 18, 2023 at 2:37 pm

Oh. Remarkable how I half-remembered that and mixed it up with the knife reported on LLog.
AntC says

August 18, 2023 at 9:32 pm

(Can’t link to the map directly — all the labels disappear.)

Yeah, wp seems to have a real problem with maps. The thumbnails on the page are too small to read. Magnifying in-situ is temperamental[**]. (I got it to work eventually.) Clicking gives you a full-screen map with no labels — useless.

[**] And randomly magnifies other wp pages you might have open.
Lars Mathiesen (he/him/his) says

August 19, 2023 at 2:24 pm

Chrome, at least, and maybe Firefox, have the feature that the zoom level is shared between pages with the same host name. I’m not sure if “wp” here is Wikipedia or WordPress, but it would certainly be a thing for separate blogs hosted under wordpress.com. (Whereas blogs with their own boughten domain name would not share zoom levels, even if actually hosted on wordpress.\code/>com).
__________
(*) so called
Lars Mathiesen (he/him/his) says

August 19, 2023 at 3:06 pm

Ha, my trick worked (writing wordpress.<code/>com to avoid autolinking), but I misspelled it the second time and forgot to check inside the edit window….
David Marjanović says

August 20, 2023 at 4:16 pm

I vaguely recall some scholarship which suggests that Yeniseian was spoken considerably further to the south in prehistory.

I haven’t seen that outside of attempts to tie to, specifically, Burushaski and often also to the archaeological Karasuk culture.

Oops, sorry, that depends on how considerable “considerable” is. The evidence from hydronyms is clear that Yeniseian drifted north in historical times; but the names don’t extend as far south as where the Karasuk culture once was, IIRC.
languagehat says

December 22, 2023 at 10:51 am

See now “Tocharian Bilingualism, Language Shift, and Language Death in the Old Turkic Context,” by Hakan Aydemir (Sino-Platonic Papers 337 [Dec. 2023], open access):

[…] we do not know at all what happened to the Tocharians, when, how, and why they disappeared, or when the Tocharian languages died out. This study tries to solve these fundamental questions from the perspective of Turkic historical linguistics. In connection with this, the Turkic background of Tocharian-Turkic interethnic and linguistic contact is first examined in order to understand how the Tocharian-Turkic language contact came about in the first place and how long it lasted.
AntC says

December 22, 2023 at 7:55 pm

Thanks @Hat for linking that. I have questions … (about relying on soundalikes for random seemingly everyday words, rather than specialist vocabulary like horse equipment or thole pins). But also I have ‘form’ over there which makes me reluctant to even ask.
AntC says

December 22, 2023 at 9:06 pm

uTexas announcement (a year+ ago) of Dr. Aydemir’s research plan.
David Eddyshaw says

December 22, 2023 at 10:51 pm

Publication in the Sino-Platonic Papers does not inspire confidence: it is the home of some really quite extraordinary nonsense, and edited by a man who has a very large and very evident axe to grind, and who lacks any comprehension of the basics of historical linguistics.

Still, the paper deserves to be looked at on its own merits. (An evident proto-Oti-Volta cognate did catch my eye, though …)
AntC says

December 23, 2023 at 9:36 am

… the paper deserves to be looked at on its own merits.

Sadly, it seems the paper that deserves looking at is not this one, but another: “This period, together with other linguistic data, will be covered in another article.” [p. 9 — all too ‘Deep Thought’] So those wanting posited cognates can inspect only the fewer than a dozen single-syllable roots pp 7 ~ 9.[**] (Which anyways date to very early in posited contact; you’d expect borrowings/influences from the different episodes of contact over the several centuries the paper claims.)

Those few pages were what raised my questions above. I assumed at that stage there’d be much more in an addendum or somewhere.

What I’d expect (but tell me I’m being dumb) is comparanda showing this form doesn’t fit the sound pattern of language X, so must be a borrowing from language Y, which it does fit. And/or this form is only in Turkic languages that were in close contact with Toch; other Turkic languages show a quite unrelated form. I don’t even get a strong feel for the direction of borrowing.

There’s a bunch of possible ethnonyms/toponyms in Turkic languages of the Tarim basin possibly showing Toch influence.

[**] Oh, those who hold their breaths until cont. p42 get

I assume that the OTu. op in Argu dialect is a reflex of the TochA ops- ‘ox.’ Namely, according to Pinault, ops- goes back regularly to the form *ops(o) ( TochB okso ‘ox’, i.e., -ps- < *-ks-).

Really? The reflex of PIE *uks-en is TochB ‘okso’? That seems all _too_ neat.
Trond Engen says

December 23, 2023 at 11:29 am

I had the same immediate reactions as David E. and AntC, and this morning I just stopped reading after a few paragraphs, underwhelmed by the borrowings in Old Turkic. I’ll take it up again after Christmas, when days are long and slow and I have more patience.
David Marjanović says

December 23, 2023 at 1:10 pm

I’ve downloaded the paper, but haven’t had time to even begin reading it.

Does it mention Iranian/Iranic? There must be a decent helping of words from that family that were borrowed into both Turkic and Tocharian.
AntC says

December 24, 2023 at 9:39 am

haven’t had time to even begin reading it

You might do best to start with the Summary pp 74 ff, especially the posited timeline. (Apologies if I’m ‘teaching my grandmother …’.)

Does it mention Iranian/Iranic?

It should also be noted that while a significant part of the Tocharians became Turkicized, another part became Iranianized, and another part became Sinicized. However, their Iranianization and Sinicization processes will be discussed in a separate study. [p. 5]

The ‘meat’ being “in a separate study” seems to be a recurring pattern. The guy’s a Turkicist not an Iranianist, so reasonable, I suppose.

That said, there are plenty of mentions in the footnotes of findings/speculations[**] from Iranianists.

What worries me methodologically is the risk of finding a borrowing into Turkic and presuming it’s from Toch (and taking that as evidence Toch was still spoken), without first eliminating the possibility it’s from Iranian.

[**] Indeed many of the mentions of Iranian are suggestions of mis-attributions: what had been thought as an Iranian borrowing into Turkic (or v.v. or into Toch) is more likely Toch into Turkic — note 121 p. 47, for example. Indeed on a closer reading, I tend to think the footnotes carry more ‘meat’ than the text.
AntC says

December 24, 2023 at 10:38 am

a) Long after Toch lost prestige as a Buddhist liturgical language, it continued as a vernacular, gradually losing ‘grammaticalisation’.

b) Although TochA might be the older IE variant, it continued in parallel/in different areas vs TochB and indeed outlasted TochB: “the Argu were undoubtedly the descendants of Yuezhi, and their language was TochA.” [p. 79]

c) That’s why there was still in C10th enough knowledge of TochA [p. 23] to translate key Buddhist texts (Maitrisimit and Daśakarmapathāvadānamālā) into Uyghur — albeit with some misunderstandings of the grammar.[**]

[**] like modern English misunderstandings of Shakespeare or KJV?

But please explain: if you’re translating Buddhist texts why not start from the Sanskrit?
Trond Engen says

December 24, 2023 at 10:57 am

1. Nobody knows Sanskrit.
2. The Tocharian text is canonical or sacred in its own right.

Compare the Vulgate in the Catholic tradition.
Compare the KJV in some Protestant domains.
Trond Engen says

December 29, 2023 at 11:12 am

I have read the paper. It’s not so much a research paper as a synthesis hypothesis or an outline of his ongoing work, so the Sino-Platonic Papers may be a good place for it.

Main points:
– The Yuezhi are the Toch A (Yarki). They ruled a state where the population consisted of both Toch A and Toch B groups.
– The historical Tocharians are the Toch B (Tukhar). They ruled the western part of the basin.
– The mobile elites of both states were chased out by the Huns and fled first to the Ili Valley and then to Bactria in the second century CE.
– The societies rebounded, united by Buddhist religion and a monastic economy.
– The languages took the first blow after the Tibetan conquest of 840 CE. The Tibetans were Buddhists too, so the religion and the monastic economy survived and suffered through the Uighur-Tibetan wars that followed.
– The next blow came with the Uighur conquest of 866 CE. The Tocharian elite may have become Manichaean and Uighur-speaking, but Buddhism and Tocharian spread from below, until the Uighur king converted.
– The Uighur state was bilingual. Uighur was used for official and external communication and Toch A and B in everyday situations and religious rituals.
– The Western basin was conquered by the Karakhanids in 1040, and islamization followed. The Toch B reorganized as a tribe and kept a separate identity within the tribal system for many generations. So did a tribe of Argu, likely Toch A from the Ili Valley. Both were bilingual.
– The late texts in bad Tocharian reflect a period when the language was no longer spoken by the writing class but still used in Buddhist rituals.
– Tocharian finally died out in the Karakhanid realm around 1200 CE, in the Uighur realm before 1300 CE. Paradoxically, Toch B seems to have survived longer than Toch A in the Uighur east, while Toch A survived longer than Toch B in the Karakhanid west.

The case relies entirely on reinterpretation of written sources, and he suggests that genetics, archaeology and thorough lexical work in the rural regions of Xinjiang and eastern Kyrgyzstan and Kazakhstan might clarify the matters.
languagehat says

December 29, 2023 at 11:20 am

Thanks very much for that summary! Interesting and plausible, at any rate.
Trond Engen says

December 29, 2023 at 9:00 pm

Synchronicity strikes again. Here’s an article putting genetic sequencing for ethnic origin in a grimmer light:

Amy Hawkins, The Guardian: Academic paper based on Uyghur genetic data retracted over ethical concerns.

The article mentions lack of informed consent by the participants, but the ethical problems seem much deeper than that.

This isn’t exactly about ancient DNA, but it’s about using genetic anthropology in criminal investigations — and potentially about using genetic screening to isolate ethnic groups — and that makes me very uneasy. Maybe Dmitry can throw some light on the issue.
AntC says

December 30, 2023 at 10:42 am

@Trond The case relies entirely on reinterpretation of written sources, and he suggests that genetics, archaeology and thorough lexical work in the rural regions of Xinjiang and eastern Kyrgyzstan and Kazakhstan might clarify the matters.

Yes, thank you for your ‘due diligence’.

It was not clear whether the “thorough lexical work” is already further advanced than the too few examples in S-PP. Are we about to see another paper in a more ‘technical’ forum?

I appreciate many lines of research would like genetic data wrt Uyghur ethnology — especially before PRC wipe them out completely — but I’m appalled at Elsevier and OUP (alleged). “help the police identify suspects in cases” my foot! As far as Chinese State police are concerned, if you’re Uyghur, you’re guilty.
David Marjanović says

July 26, 2024 at 4:25 pm

I just found this open-access follow-up paper by the abovementioned Abel Warries to Peyrot’s abovementioned 2019 paper on the suspiciously similar vowel systems of Pre-Proto-Tocharian and Pre-Proto-Samoyedic.
David Marjanović says

July 27, 2024 at 4:46 pm

I just finished reading it, recommend it, and note it cites two blog posts by Juho Pystynen for support.
Trond Engen says

July 29, 2024 at 3:15 pm

Warries’ paper does a great deal to sort out the remaining questions in Peyrot, ending up with near-identical phoneme inventories in the two languages and a seemingly plausible timeline for both. That’s neat, but identical phoneme inventories don’t just happen even in close contact. I’d like to see phonological development in one language explained with the phonology (including phonotactics) of the other.

The suggested loanword Toch B yasa “gold” < PT *wʸəsa < pre-PT *wesa < pre-P-Sam *wäsa “metal” < PU *vaćka is important if it holds out. It would be informative for the timeline of changes in both languages and for the environment where they interacted. But it’s just one word.
David Marjanović says

July 29, 2024 at 7:49 pm

I’d like to see phonological development in one language explained with the phonology (including phonotactics) of the other.

Apart from the phonotactics, that happens near the end of the paper – the IE vowel system getting reinterpreted in Samoyedic terms.

PU *[w]aćka

As the paper mentions, it’s probably not actually PU, but a Wanderwort circulating between freshly separated Uralic branches – and ought to have spread during everyone’s favorite Seima-Turbino Phenomenon.

By the way, Toch B yasa < PT *wʸəsa betrays initial stress. PT *a gives B stressed ā /a/, unstressed a /ə/; PT *ə gives B stressed a /ə/, unstressed ä /ɨ/. It looks like the word had initial stress, which fails to contradict the hypothesis that it’s a Uralic loan.

(Tocharian stress, although never directly written, is understood pretty well now. I’ll try to look for the paper(s).)
Trond Engen says

July 30, 2024 at 8:55 am

David M.: Apart from the phonotactics, that happens near the end of the paper – the IE vowel system getting reinterpreted in Samoyedic terms.

Yes, to a limited degree. What I meant is that a great deal of development is assumed to have happened independently before the languages entered the period of contact with already very similar — and rich — vowel systems. This means that contact isn’t used to explain the changes, only to time them. If instead, say, phonemization of reduced vowels could not be well explained language-internally, but would be a simple reinterpretation of allophones in one language based on distinctions made in the other, the case would be stronger. If that’s the case here, I can’t read it from the paper.

PU *[w]aćka

If PU *vaćka was a borrowing in (post-)PU, it would still have been borrowed into Tocharian from Samoyedic.
J Pystynen says

July 31, 2024 at 12:01 pm

Hungarian vas << *waś[k]a would seemingly come about as close, but that’s probably already too far west.

Alas, I imagine Warries may have missed, by a small margin, my presentation about vowel reduction in Samoyedic, also from fall 2022, which has a few additional observations.
David Marjanović says

August 1, 2024 at 3:19 pm

What I meant is that a great deal of development is assumed to have happened independently before the languages entered the period of contact with already very similar — and rich — vowel systems. This means that contact isn’t used to explain the changes, only to time them. If instead, say, phonemization of reduced vowels could not be well explained language-internally, but would be a simple reinterpretation of allophones in one language based on distinctions made in the other, the case would be stronger. If that’s the case here, I can’t read it from the paper.

The paper is quite long-winded, detailed and comprehensive and explores alternatives in full. But Table 11 presents the conclusions quite clearly. Most striking is the reinterpretation of (*iH *uH >) *ī *ū as *ij *uw, sequences that are imaginable in Uralic (and must have been present in a pre-Samoyedic stage in the right timeframe) but not in IE (unless much farther-reaching changes to the sound system have already happened) where they would have been **/jː wː/ in a system that didn’t allow long consonants. (The dative singular of the PIE *i-stems, *-ej-j, did not come out as **-eji but as *-ēj, no different from the animate *s-stem nom. sg. *-os-s as *-ōs or the *m-stem acc. sg. *-om-m as *-ōm.) The outcome of *o, even though that may not have been a rounded vowel in Proto-Indo-Tocharian, as something that was able to merge with *ē as Proto-Tocharian *e, without taking *a or *ɨ or anything else along for the ride, may be even harder to make sense of without considering it an adaptation to the Proto-Uralic through Proto-Samoyedic *ɤ (Uralicist tradition: e̮; “Uralic Typewriter Alphabet” and tocharologist tradition: ë).

I also wonder if the split of PU *ɤ into PS *ɤ and *ɯ was motivated the other way around, as an adaptation to the Pre-Tocharian contrast of *o and *ɨ…
David Marjanović says

August 1, 2024 at 4:39 pm

(Tocharian stress, although never directly written, is understood pretty well now. I’ll try to look for the paper(s).)

I must have had this in mind. (Long and in not quite native German.)
Stu Clayton says

August 1, 2024 at 5:53 pm

At that link I discover that scaricabile means downloadable. I haven’t encountered a Spanish adjective for that – the locution I know is se puede descargar.

descargar also means unload. Things gotten from the internet might have fallen off the back of a truck, for all some people care.
M says

August 1, 2024 at 9:33 pm

“downloadable. I haven’t encountered a Spanish adjective for that”

descargable (recognized by the Royal Spanish Academy, widely used, and probably universal in the Spanish-speaking world).
Stu Clayton says

August 1, 2024 at 10:40 pm

I believe you. It remains a fact that I have not encountered the word.
Trond Engen says

August 2, 2024 at 2:56 pm

David M.: Table 11 presents the conclusions quite clearly. Most striking is the reinterpretation of (*iH *uH >) *ī *ū as *ij *uw, sequences that are imaginable in Uralic (and must have been present in a pre-Samoyedic stage in the right timeframe) but not in IE (unless much farther-reaching changes to the sound system have already happened) […]

Yes, I follow the logic of the sound changes, and I probably sounded more critical than I meant to be. Maybe what I miss is a clearer comparison with Peyrot’s timeline, not as sequences of sound changes but as independent versus contact-induced changes. That could even allow a discussion of (the development of) the substrate influence and what that might mean for the Samoyed-Tocharian contact situation.

This is relevant because I’m not convinced that the contact process involved language shift from Uralic, i.e. an actual Samoyedic substrate in Tocharian. Other scenarios that might be entertained are a sprachbund with Samoyedic as lingua franca and a Samoyedic superstrate/adstrate with sociolinguistic prestige. That doesn’t matter for the purely linguistic evidence, but it might for timing and for comparison with archaeology.
Trond Engen says

August 4, 2024 at 6:25 pm

To be a little more specific: Afanasievo seems to disappear from the Minusinsk Basin around 4500 BP, being replaced by the Okunevo culture. This culture is very different from Afanasievo, instead continuing the customs of the local foragers, but it still seems to have some 10-20% specifically male ancestry from Afanasievo. This does look like a situation where Afanasievo could prevail on an unrelated substrate, but the Trans-Baikal ancestry that will later spread with Seima-Turbino and become diagnostic of Uralic makes its first appearance west of Lake Baikal — just in this region — about 4200 BP.
Trond Engen says

August 5, 2024 at 2:23 pm

Calm summer days. I’ll just pretend there are readers and keep going.

I see three reasonably simple substrate-forming events. I could come up with infinitely many more, but they would be more convoluted and hence less likely until the simple scenarios are ruled out.

(1) The Afanasievo settlement in the eastern forest steppe around 5500 BP. For the hypothetical substrate to be pre-Proto-Samoyedic, Uralic must have started differentiating some time before that, say 6000 BP. Untocharianized Samoyedic must have been spoken close by, and the remaining “Core Uralic” not too far away, since I don’t think there’s evidence of other long-distance migrations in this region in this era. When the Trans-Baikalians arrived more than a millennium later, they seima-turbinized Uralic and uralicized themselves before spreading quickly westwards along the forest steppe.

(2) The replacement of Afanasievo with Okunevo around 4500 BP as sketched in my previous comment. In this case, Samoyedic might have kept developing in the closely related cultures around the Irtysh that didn’t become Okunevo but eventually became affiliated with Seima-Turbino. Otherwise, (1) and (2) are pretty similar and not mutually exclusive, except for the nature of substrate (1) and the implied age of Proto-Uralic.

(3) Tocharianization of (a part of) the Seima-Turbino-affiliated culture in the Altai foothills. This must have happened some time after the beginning of Seima-Turbino, so perhaps 3500 BP. In this case Proto-Uralic could be defined by the starting point of Seima-Turbino. Otherwise, again, this is not mutually exclusive with (1) and (2), with the same qualification.

There are weaknesses with all three of them.
jack morava says

August 5, 2024 at 2:47 pm

As a heavy metal fan I am happy to learn about Seima-Turbino, pretending to have readers is unnecessary effort; thanks, sincerely
languagehat says

August 5, 2024 at 3:13 pm

I am reading with interest but have nothing to say as of the current moment. And you know what Wittgenstein said.
Stu Clayton says

August 5, 2024 at 3:33 pm

pretending to have readers is unnecessary effort

Readers read, writers write. Only money talks in the text business.
Trond Engen says

August 5, 2024 at 4:00 pm

Yeah, I’m just being cute.

Some scenarios for adstrates/superstrates:

(A1) Situation (2) above, except that Okunevo didn’t become tocharophone. The families with Tocharian fathers kept in contact with the remaining Tocharians, now Chemurchek steppe nomads, whose speech slowly adjusted to the no-longer-primary language of their settled relatives.

(A2) Situation (3) above, except that there was no language shift to Tocharian. The peoples around the Altai used the language of the metal workers as lingua franca. Speaking with a Seima-Turbino accent after spending time in Seima-Turbino settlements or taking part in Seima-Turbino expeditions was attractive.

(A3) The emerging Chemurchek culture was a multi-lingual alliance of nomadic peoples that used pre-Proto-Samoyedic as common language until eventually the covert prestige language Tocharian prevailed.

The list is not exhaustive.
Trond Engen says

August 5, 2024 at 4:07 pm

What does this mean for the timelines?

If, as argued in 2023, (a part of) Chemurchek was the vehicle that brought Tocharian to the Dzungarian Basin, then a Samoyedic substrate must have been acquired before 4500 BP.

The initial phases of the development of Uralic would have taken place in the Irtysh-Yenisei region before the Trans-Baikalian prestige group arrived and Seima-Turbino got going.

An adstrate or superstrate in Tocharian could be introduced later, by more mechanisms, but even this window isn’t wide open.
David Marjanović says

August 5, 2024 at 4:55 pm

If the onset of Seima-Turbino is the onset of contact between Uralic and Pre-Indo-Iranian, it’s too late for Proto-Uralic, because Pre-II contact appears to have happened entirely with separate branches of Uralic.
Trond Engen says

August 5, 2024 at 5:30 pm

Yes, that too. Based on current dates, Seima-Turbino spread from east to west ~4200-4000 BP, exactly contemporary with Sintashta and Andronovo from west to east. If that holds out, it’s hardly a coincidence. Whatever way that happened, I assume that contact with neighbouring Indo-Iranians was established separately for each Seima-Turbino group, and that Uralic started to disintegrate immediately while Indo-Iranian held together for longer.
Bathrobe says

August 5, 2024 at 6:12 pm

Language Log has a post about “Yuezhi archeology without concern for Tocharian language” by none other than Victor Mair (https://languagelog.ldc.upenn.edu/nll/?p=65252). I found it long and very hard to follow. Mair’s point appears to be that the Chinese are again trying to rewrite history from a Sinocentric perspective, although I had great difficulty following the chronology and movements. Our own DM wrote a comment that Mair immediately slapped down.

A second post has since appeared: “Rethinking the Yuezhi?” (https://languagelog.ldc.upenn.edu/nll/?p=65255), a guest post by Craig Benjamin.
Y says

August 5, 2024 at 7:05 pm

Our own DM wrote a comment…
A clear, informed, and interesting one. I would be happy to get this kind of comment on a peer-reviewed manuscript.

…that Mair immediately slapped down.
Pissily and rudely, for no reason.
Trond Engen says

August 5, 2024 at 7:55 pm

I’m reminded once again why I got tired of Language Log.

When it comes to agendas, I fear it’s more what they choose not to look for and may inadvertently destroy in the process. By sheer coincidence (named Dmitry) I had occasion today to read about the Henan Longshan culture in China. This was a third millennium culture on the middle Yellow River with massive interaction with nomadic peoples — very much including the Chemurchek — during what arguably are the formative years of Chinese civilization. I found reference to one genetic study of this culture, and even if it’s as recent as 2022, it’s of mt-DNA alone, which (as expected) is overwhelmingly local. An isotope analysis from this year OTOH tells of long distance mobility especially among women.
Bathrobe says

August 5, 2024 at 7:59 pm

As I noted, Mair appears very concerned that China will use this to rewrite history. For instance, he quotes this from the original Wall Street Journal Article (China Reaches Back in Time to Challenge the West. Way, Way Back): “Asked whether Beijing could use the Yuezhi to make territorial claims, Wang said the notion was absurd because the nomads are a historical people and no one serious would put forth that argument”.

The Wall Street Journal article and Mair’s comments both frame the arrival of Chinese archaeologists in Uzbekistan in terms of a challenge to the West and the rewriting of history, including the dark intimation that his discoveries might lead to new Chinese territorial claims. While that doesn’t seem relevant in this case, Chinese attempts at rewriting history should never be taken lightly. The Chinese claim to the South China Sea is a wholly manufactured one based solely on interpretations of history by Chinese nationalist historians in the ROC period. The recent successful Nantes museum exhibition about Genghis Khan should also give pause for thought. When the Nantes museum attempted to source material from museums in Inner Mongolia, the Chinese side insisted on imposing their interpretation of history on the exhibition. This wasn’t made very clear at the time, but it later became apparent that the Chinese wanted to retitle the exhibition “China’s Steppe Empire”. The narrative that Genghis Khan was “Chinese” (as part of the Zhonghua Minzu) is alive and well in China — even though the Communists’ favourite author, Lu Xun, shot it down almost a century ago.

one of their burial forms, defleshing of the dead

I assume this is what is known with regard to the Tibetans as “sky burials”.

@ Trond

massive interaction with nomadic peoples

There has always been massive interaction with nomadic peoples. The problem is the spin that is put on this. The Chinese narrative is that these nomadic peoples have been successively and successfully absorbed into mainstream Chinese culture. This gives rise to the concept of 民族融合 mínzú rónghé, or fusion of peoples, which takes the general stance that all backward surrounding peoples are destined to be absorbed into China’s superior agrarian, urban civilisation. It is this mentality that Mair is resolutely opposed to.

Originally the absorption of surrounding peoples was thought to be a long-term, natural process that would inevitably unfold. The new view that has been adopted by historians who have influenced Xi Jinping is that this process could and should be accelerated by government policy.
AntC says

August 5, 2024 at 8:34 pm

might lead to new Chinese territorial claims …

The Belt & Road Initiative is all about pouring money into impoverished countries, to lock them into Chinese hegemony. With the ‘-stans’ the CCP’s first step is to unlock them from Russian influence — not difficult with Putin distracted elsewhere; and avoid them falling under Islamic influence. Then where else will they have to go? Being landlocked and all … The CCP plays a long game.

Wang said the notion was absurd …

Poor dupe. Has he never met Xi Jinping thought?
Dmitry Pruss says

August 5, 2024 at 10:10 pm

Yes, when I opened the paper about Shimao one industry, I imagined that it was their export ware to the pastoralist North… but was surprised to find out how peripheral and unusual this town was with respect to the rest of the Late Neolithic Loess Plateau, and how transitional it appeared to be to the Steppe. But I couldn’t figure out if there were archeological parallels.
Y says

August 5, 2024 at 11:03 pm

Back in the day, countries would cleave to their patrons through instilled ideology (communism, capitalism), on top of the material benefits. Does the Belt & Road Initiative provide any ideology with which to attach people to China?
AntC says

August 6, 2024 at 1:47 am

@Y I would be happy to get this kind of comment on a peer-reviewed manuscript.

Indeed. The site gets little enough informed commentary. (I include my own commentary as mostly ill-informed. Seeking elucidation; rarely receiving much.)
AntC says

August 6, 2024 at 2:13 am

any ideology with which to attach people to China?

Most of the ‘attachment’ derives from bribing heads of government in corrupt (pseudo-)democracies. Pacific Island nations “switching allegiance from Taiwan to China”. “China to attempt to meddle in a national election later this year”
Trond Engen says

August 6, 2024 at 2:21 am

@Dmitry: I haven’t actually read the Shimao paper yet. I had a brief look, decided I needed a better grip of the context, started reading about the cultures of the loess plateau, and realized the relevance for the Afanasievans.

The transitionality is indeed interesting. The Longshan had pigs, chicken and millet and got cattle, sheep&goats, wheat and barley from the west. In about 2000 BCE they retracted to the lowland and started naming dynasties.

@Bathrobe: Fair point. But who are these “historians” giving advice? It sounds more like thinktankery.
Bathrobe says

August 6, 2024 at 6:55 am

I can’t find anything that will introduce you to the public intellectuals who have been pushing for “second-generation ethnic policies” in China, but I have seen articles mentioning the people involved. One that sticks in memory is a gentleman called Ma Rong. Ma is of the Hui nationality but is critical of earlier ethnic policies (see this article: https://www.readingthechinadream.com/ma-rong-ethnic-regional-autonomy.html). He precedes Xi Jinping.

There is quite a bit around the Internet about the “second-generation ethnic policies” of China, which can be quickly found through a Google search.
David Marjanović says

August 6, 2024 at 5:19 pm

Pissily and rudely, for no reason.

I think he saw I was criticizing a few selected parts of his post, thought I was criticizing the post as a whole or even him as a whole because apparently normal people think in such ways, correctly saw I wasn’t offering any critique of the whole thing (let alone of him as a person), and got pissy for that reason.

I don’t begin my peer reviews with the exact words “yes, but”, I make the introduction longer; still, how common is peer review in sinology? Are the SPP peer-reviewed…?

it later became apparent that the Chinese wanted to retitle the exhibition “China’s Steppe Empire”. The narrative that Genghis Khan was “Chinese” (as part of the Zhonghua Minzu) is alive and well in China —

I’m not surprised by the second sentence; but the first? They were so isolated in their bubble that they thought this would fly in Europe?
AntC says

August 6, 2024 at 5:28 pm

Are the SPP peer-reviewed…?

Not by Mair. (Indeed in some cases I wonder if he’s even read them.) Clearly some authors get their Ps reviewed before publishing at SPP. Others not so much.

This is sad for the diligent authors: SPP is of such variable quality, it’s too easy to dismiss anything published there.
Bathrobe says

August 6, 2024 at 7:58 pm

I only recently (around June) learned that the Chinese wanted to retitle the exhibition “China’s Steppe Empire”, but now I’m unable to find the source. Google doesn’t help. I’m pretty sure I haven’t misremembered it. It’s a pretty breathtaking emendation but I wouldn’t put it past Xi Jinping, who seems to me to be the ultimate Han chauvinist. I think that’s one reason Mair loves to pick up evidence of Xi Jinping’s illiteracy in Chinese. (After all, he was banished to the loess plateau in his youth so his education was sadly lacking, although those dusty days don’t seem to have diminished his enthusiasm for the glories of Han Chinese civilisation.)

I started living in China in 1993 and, hearing the term Zhonghua Minzu, assumed that it had always been part of China’s official vocabulary. I was surprised to discover, during my searches, that the term was actually coined by Liang Qichao in 1902 and much beloved of the KMT but was expunged by the Communists, especially Mao, who rejected Han chauvinism. It was only revived by the ethnologist Fei Xiaotong in 1988 and has now become central to Xi’s thought and policies.

(See also this paper: https://www.fiia.fi/wp-content/uploads/2019/04/bp260_sinification_of_china.pdf)
David Marjanović says

August 7, 2024 at 12:16 pm

Pissily and rudely, for no reason.

And then it got worse… AntC asked a few direct, incredulous questions, and instead of answering any of them, Mair helpfully informed the world that this was “ostentatiously otiose”. He seems to believe he’s surrounded by trolls!
languagehat says

August 7, 2024 at 1:15 pm

That’s obviously how he feels. He’s an extreme example of professorial poisoning — he’s so used to students drinking in his every word and never raising objections that he expects everyone to defer to him similarly. More to be pitied than censured, perhaps, but I censure him anyway.
Y says

August 7, 2024 at 2:17 pm

He did it again (on the same earlier post), reacting to DM’s interesting and clear excursion.

I’ve had professors like that. They would ruin any sort of academic attraction to their classes by insulting students whose intonation they didn’t like when they asked questions. Who wants to see a 19 y.o. be humiliated in public?
David Eddyshaw says

August 7, 2024 at 2:53 pm

He once simply deleted a comment of mine that he didn’t like, on account, I presume, of my having had the gall to point out (rightly) that the source he was using was very unreliable. (I blame myself for trying. I should really know better by now.)

The comment was then given its own post by Mark Liberman, and later undeleted in situ.

He seems completely unable to admit to either error or ignorance. It would be easier to take if he were less prone to error and ignorance himself. His atrocious manners are amusing, given his constant baseless attribution of rudeness to others.
David Marjanović says

August 7, 2024 at 3:33 pm

Or it’s a cultural difference, due perhaps to Mair having lived in “guess cultures” for decades.
Stu Clayton says

August 7, 2024 at 4:01 pm

Are you suggesting that he never learned how to guess accurately, after all those decades ?
AntC says

August 7, 2024 at 4:53 pm

DM AntC asked a few direct, incredulous questions,

Thank you David for answering my incredulity. In this case Mair is not in error or ignorance; I am. Nevertheless a straight answer giving the evidence would be appropriate (I’d already said the evidence is paywalled, if that’s where it is); not abusing my persistence.

surrounded by trolls

He often posts on tidbits I send him — usually wrt Taiwan. So he knows I’m not a troll(?) Or perhaps he thinks I suffer from split personality(?)

——–

And BTW, if the PIE for wasp/bee is unknown, how do we know the Tocharians had a word, and where did it come from?
Bathrobe says

August 7, 2024 at 5:48 pm

I think he is just getting old. Personally I think his heart is in the right place but unfortunately his crotchetiness is increasingly detracting from his credibility. We need his scholarship and sharp insights on China, just not delivered in such a cranky (in the sense of petulant and annoyed) way. He is just alienating fans and providing grist for the mill of those (particularly the Chinese government) who would like to sideline him entirely. It’s easy to do when he acts like this. FWIW, I’ve always found him easy to deal with. What I find annoying about LL is the stodgy Americanness of it all, including the commentators. There seems (to my sensibility) to be a stolid wall of conventional and boring academic thinking and earnest Americanness in the contributions and comments. It’s almost like an industrial construction site. More perspectives from non-American commenters would be nice. And that’s precisely what commenters from here provide.

(This is purely my personal feeling. I’m sure many Hatters would disagree.)
AntC says

August 7, 2024 at 6:16 pm

the stodgy Americanness of it all, including the commentators.

emmm? the far-and-away stodgiest commentator is a Brit, as they frequently remind us. As to their academicness, I’m perplexed: an unevenly-learned amateur?
Bathrobe says

August 7, 2024 at 6:25 pm

The Brit guy is LL’s own Colonel Blimp. He is so eccentrically removed from stodgy Americanness that he is a joke.

(I didn’t realise the creator of Colonel Blimp was a Kiwi.)
David Eddyshaw says

August 7, 2024 at 6:40 pm

WRT the “bee” issue on that LL thread:

Hatters will no doubt be pleased and relieved to hear that the word for “bee” is reconstructable in proto-Oti-Volta. That can scarcely have anything like the time-depth of PIE, though it does have some cognates farther afield in “Central Gur”, which is pretty much a baggy-monster group and probably is of near PIE-level time depth. It’s not related to anything in Bantu, though.

Difficult to say whether “bee” or “honey” is prior in “Gur”: proto-Oti-Volta *ʃɪ̀m-fʊ̀ “bee” is evidently derived from *ʃɪ̀-tʊ “honey”, but several non-Oti-Volta Gur languages just call honey “bee grease”, so who knows?
languagehat says

August 7, 2024 at 10:30 pm

Hatters will no doubt be pleased and relieved to hear that the word for “bee” is reconstructable in proto-Oti-Volta.

I for one had been losing sleep over it, so I thank you.
Trond Engen says

August 8, 2024 at 2:12 am

… and just as we share frustrations over Victor Mair, he posts this review: Magisterial German translation of a neglected monument of ancient Chinese literature, Mu Tianzi Zhuan.

First, a few words about the text, after which I will introduce the Sinologist who undertook this monumental philological task, Manfred W. Frühauf.

English:

The Mu Tianzi Zhuan, or Records of [King] Mu, the Son of Heaven, is considered to be the earliest and longest extant travelogue in Chinese literature. It describes the journeys of King Mu (r. 976-922 BC or 956-918 BC) of the Zhou Dynasty (c.1046-256 BC) to the farthest corners of his realm and beyond in the 10th century BC. Harnessing his famous eight noble steeds he visits distant clans and nations such as the Quanrong, Chiwu, and Jusou, exchanging gifts with all of them; he scales the awe-inspiring Kunlun mountains and meets with legendary Xiwangmu (“Queen Mother of the West”); he watches exotic animals, and he orders his men to mine huge quantities of precious jade for transport back to his capital. The travelogue ends with a detailed account of the mourning ceremonies during the burial of a favorite lady of the king.
Trond Engen says

August 8, 2024 at 5:51 pm

I forgot to add the link to the open access pdf, found at a link provided by commenter Andreas Johansson.
David Marjanović says

August 9, 2024 at 9:49 am

And BTW, if the PIE for wasp/bee is unknown, how do we know the Tocharians had a word, and where did it come from?

Why “wasp”? That’s a separate issue, and unrelated to honey.

Anyway, here in the sidebar there is “A Dictionary of Tocharian B (with etymologies)”. “Bee” is in it:

kro(n)kśe* (n.) ‘bee’
[-, -, kro(n)kś//-, kro(n)kśäṃts, -] krokśäṃts weśeñña māka ‘the sound of many bees’ (571b4), mäkte kroṅśaṃts cäñcarñe pyāpyai warssi ‘as [it is] the pleasure of bees to smell a flower’ (S-5a2). ∎Though obviously related to TchA kronkśe ‘id.,’ probably because the A word has been borrowed from B, the etymology is otherwise uncertain.

Then follows a whole paragraph of etymological proposals (one of them from a PIE word that would have meant “yellowish ~ golden ~ brownish” and could also be ancestral to honey), followed again by “uncertain.”

There is no proposal that Tocharian and Chinese might have exchanged a word for “bee”; it’s just “honey”.
AntC says

August 9, 2024 at 5:29 pm

Thank you @DM, ok I’m being dumb.

Why “wasp”?

Because Chinese ‘Feng’ 蜂 covers both wasps and bees. Wasps are yellow-feng 黃蜂, bees are honey-feng 蜜蜂. If it took the Tocharians to introduce them to honey, what was the term for bees before that? And if the Chinese already knew about bees, surely they knew about honey? (I’m still incredulous — given that Chinese eat everything and anything. Surely they noticed bears helping themselves at hives?)

it’s just “honey”.

Just one word? (and a very short word at that) Then why is Mair being all supercilious? I’d expect a whole vocabulary to do with beekeeping, wax, comb/cell, equipment to extract and strain the honey …

Or … Evidence that Old Chinese already had a word for honey and ‘mi’ supplanted it. IOW to rule out the possibility that the Chinese discovered honey for themselves; invented a word for it; that coincidentally sounded like the Tocharian.
Trond Engen says

August 9, 2024 at 7:49 pm

Here’s my take on the Tocharians.

(Sorry for all the names of cultures. Sorry I can’t link them all. And sorry that even with all these names, I’ve omitted a lot.)

It seems clear that Afanasievo had long distance trade with both BMAC and the Neolithic Yangshao culture (c. 5000-3000 BCE) of the Yellow River. When Minusinsk Basin Afanasievo dissolved, (pre-Proto-)Tocharians kept their pastures in the Mongolian Plateau and the Dzungarian Basin as the Chemurchek culture (c. 2750-1900 BCE). Afanasievo and Chemurchek may have been separate entities to begin with, but they eventually merged. Chemurchek kept regular contacts with the Okunevo in the Minusinsk Basin, BMAC, and the successors of Yangshao, the Tibeto-Burman farmers of the Majiayao culture (c. 3300-2000 BCE) of the Upper Yellow River and the Upper Longshan (c. 3000-2000 BCE) on the Loess Plateau. The proto-cities of Shimao and Taosi belong to the later phase of the Longshan.

The Chemurchek mediated Bronze Age innovations, helping to give the Shaanxi region the military and technological edge that would turn it into a center for unification of the Chinese heartland, starting already in the early second millennium BCE.

Maybe pressed eastwards by the Andronovo, many of the Chemurchek settled in the Hexi(/Gansu) Corridor and formed the Xichengyi culture (c. 2000-1600 BCE), probably with a strong local substrate of old trading partners from the Majiayao. The Xichengyi interacted closely with Upper/Middle Yellow River cultures like the Qijia culture (c. 2200-1600 BCE) of the eastern Gansu and the Erlitou culture (identified with the semi-mythological first Chinese dynasties).

The Chemurcheks on the Mongolian Plain may have established contact with the newly formed Seima-Turbino groups in the northern Altai foothills, learned wagon technology from the Andronovo, and transformed into the Munkhkhairkhan culture (c. 1800-1600 BCE) of the Western Mongolian Plateau. As long distance traders east of the Andronovo, they might have found a niche trading directly with the emerging Chinese state, but eventually they were forced out or assimilated by the Saka, who kept ruling the Western Mongolian Plateau until the formation of the Xiongnu Empire in the 3rd C. BCE.

The settled/semi-nomadic Xichengyi became the Siba culture (c. 1600-1300 BCE) who gave room to two independent but related cultures in the Hexi Corridor, Shanma (c. 900-200 BCE) in the west, and Shajing (c. 800-100 BCE) in the east (I’m not sure what to think about the hiatus in the archaeological dates, but it ends about the time of the Mu Tianzi Zhuan). Both cultures were heavily influenced by the Saka, who now dominated the eastern steppe. The location and the end dates fit well with the Xiongnu and Han conquests of the Hexi Corridor and the expulsion of the Yuezhi and the Wusun. The Yuezhi were the first to leave, so they may have been the Shanma. I think both were Tocharian, but I have no clear idea which was A and which was B.

(An alternative story is that the “Saka” of the Western Plain were the Yuezhi, and that one or both of Shanma and Shajing were the Wusun, but I think this is less likely.)

(The formation of the Chemurchek is obscure. It seems to have involved some very different populations.)

Much of this will no doubt be solved — and complicated — by genetics. As for now, the genetic data from China are too sparse to tell.
Trond Engen says

August 9, 2024 at 8:17 pm

@AntC: Both trade and imported prestige culture may lead to replacement of good native words.

There’s nothing impressive about a two-phoneme word like mi, but it’s part of a set of suspected IE loans related to Bronze Age prestige culture. It’s reconstructed as Old Chinese *mit, very similar to attested Toch A mit. Old Chinese was probably the language of the first Chinese dynastic states, i.e. of Shanxi and Shaanxi in the early second millennium BCE, just when the Tocharians were mediating prestige goods and culture from what would become the Silk Road.
AntC says

August 9, 2024 at 10:22 pm

@Trond It’s reconstructed as Old Chinese *mit,

Yes, I realise there was a coda, so we’ve more than two phonemes. Still hardly impressive for a soundalike. What’s “prestige” about honey? I guess the Chinese had sugar cane for sweetening: “S. sinense was a primary cultigen of the Austronesian peoples. … at least 5,500 BP”

Chris Button chipped in on the LLog thread, so that’s more substantive; nevertheless, this alleged ‘gold standard’ etymon is from the same Prof Mair who thinks five-phoneme ‘kumar(a)’ soundalike is absolute evidence in the absence of any other vocab nor any anthropological artefacts whatsoever. (I’m not denying there was contact and cultural exchange between Tocharians and OC; merely I’m asking why this soundalike is of especial mention.)

… replacement of good native words.

Is there evidence of a native word that got replaced? Is there evidence or absence of evidence of Chinese native beekeeping/honey-gathering before Tocharian contact? Or are people just making stuff up, like the miraculous voyage that brought kumara to Pacifica?
Ryan says

August 10, 2024 at 12:46 am

Ah, the old strategy of tying your current argument to one you previously lost.
Stu Clayton says

August 10, 2024 at 1:44 am

I always tie my arguments to each other. It’s called consistency. Winning and losing are for people who seek instant gratification. I’m in it for the long haul.
Trond Engen says

August 10, 2024 at 4:53 am

Me: The Chemurchek mediated Bronze Age innovations, helping to give the Shaanxi region the military and technological edge that would turn it into a center for unification of the Chinese heartland, starting already in the early second millennium BCE.

Maybe pressed eastwards by the Andronovo, many of the Chemurchek settled in the Hexi(/Gansu) Corridor and formed the Xichengyi culture (c. 2000-1600 BCE)

There’s another possible — or an additional — story buried between those paragraphs. The Chemurchek people settled in the whole Gansu/Shaanxi region around 2000 BCE, maybe as specialist soldiers with bronze weapons, maybe also bringing in bronze workers from the Altai or the Minusinsk Basin. In Shaanxi they were assimilated to what would be the ruling elite of the Chinese state. In the Xichengyi culture of upper Gansu they would form stable societies for two millennia. In the region in-between, the Qijia culture would become the Rong peoples that the first Chinese dynasties were at constant war with.
Trond Engen says

August 10, 2024 at 5:04 am

@AntC: What “gold standard”? As I said, the honey word isn’t impressive in itself, but it’s part of a semantic field with several suggested Tocharian loans. But I think it’s fair to say that most are tentative, not least because the reconstruction of Old Chinese is so shaky.
Trond Engen says

August 10, 2024 at 5:12 am

Oh, here’s an interesting paper!

Rasmus G. Bjørn: Indo-European loanwords and exchange in Bronze Age Central and East Asia
Six new perspectives on prehistoric exchange in the Eastern Steppe Zone, CUP 2022.

Abstract

Loanword analysis is a unique contribution of historical linguistics to our understanding of prehistoric cultural interfaces. As language reflects the lives of its speakers, the substantiation of loanwords draws on the composite evidence from linguistic as well as auxiliary data from archaeology and genetics through triangulation. The Bronze Age of Central Asia is in principle linguistically mute, but a host of recent independent observations that tie languages, cultures and genetics together in various ways invites a comprehensive reassessment of six highly diagnostic loanwords (‘seven’, ‘name/fame’, ‘sister-in-law’, ‘honey’, ‘metal’ and ‘horse’) that are associated with the Bronze Age. Moreover, they are shared between Indo-European, Uralic, Turkic and sometimes Old Chinese. The successful identification of the interfaces for these loanwords can help settle longstanding debates on languages, migrations and the items themselves. Each item is analysed using the comparative method with reference to the archaeological record to assess the plausibility of a transfer. I argue that the six items can be dated to have entered Central and East Asian languages from immigrant Indo-European languages spoken in the Afanasievo and Andronovo cultures, including a novel source for the ‘horse’ in Old Chinese.

Introduction

Linguistic contacts between Indo-European and Central and East Asian language families constitute a recurring topic of discussion in historical linguistics, but hardly any consensus on the earliest transfers has been established. Proponents point to similarities and cultural justification (Lubotsky & Starostin, 2003; Napol’skikh, 2001; Helimski, 2001; Pulleyblank, 1966; 1996), while critics note that the proposed transfers cannot be fitted into the known language interfaces (Simon, 2020). Yet can the concrete dating from archaeology and genetics be used to calibrate the relative chronologies of comparative linguistics? And if so, do loanword studies have anything to add to the prehistory of Central Asia?

Since the scientific breakthrough of comparative linguistics roughly 200 years ago, innumerable loanwords have been suggested between Indo-European and Uralic (Carpelan et al., 2001; Simon, 2020; Joki, 1973; Collinder, 1965). Less attention has been given to the possible connections with Turkic (Róna-Tas, 1974; Dybo, 2014; Lubotsky & Starostin, 2003), while contacts with early Chinese civilisation have been a recurring fascination of scholars (Pulleyblank, 1966, 1996; Lubotsky & Starostin, 2003; Blažek & Schwartz, 2017). The lack of consensus is in part due to unsettled chronologies within all language families. It is in this regard of primary importance that Uralic and Turkic have long been considered to belong to a shared linguistic area, commonly known as Ural–Altaic (Janhunen, 2001, 2009; Georg, 2017; Róna-Tas, 1983: 14; Vajda, 2020). In this article, I will present some of the most cited and culturally significant suggested borrowings with updated circumstantial data that may strengthen or reshape previous hypotheses based on lexical evidence alone. This limited set of highly diagnostic correspondences can then be used as a proxy for further dating of contact situations. I will argue that a set of early Indo-European loanwords into Uralic, Turkic and Chinese is of interest not only to historical linguists, but also to anyone wishing to understand Bronze Age interfaces in Central Asia.

The maps look like they might shake the foundations of my story above. I’ll have to read it!
AntC says

August 10, 2024 at 7:17 am

What “gold standard”?

(I’d better start by apologising to our host for leading ‘The Tocharian trek’ even further off course.)

I’m quoting Prof Mair; in which ‘honey’/’mi’ is given as the first/most secure piece of evidence.

Thank you very much for that Bjørn 2022 paper. I see a lot of handwaving: OC *m(j)it is so close to some early IE language, one of them must be the source, though we can’t nail it down, and all of the known ones require some slightly odd sound changes.

This item is the most difficult to tie directly to demonstrable innovations in Central Asia. The obvious Indo-European provenance …
… the word must have transferred in the Afanasievo phase, but since there is no real evidence for this, …
The borrowing into Old Chinese is fairly securely dated to the last half of the first millennium BC, but the direct source language cannot be identified.
[The Meier & Peyrot 2017 paper @DM ref’d gets mentioned as _supporting_ “proper explanation is still wanting”]

As I understand it, that date is far too early to be through Tocharian contact. Since (contra Ryan) I am eager to be led by evidence, we can compromise that the word is maybe cognate with some IE-or-the-other, but probably not Tocharian and certainly not reaching ‘gold standard’ as Tocharian. The date would line up with Zhou Dynasty:

In general, bees and wasps are called Feng in Chinese, and honey bees are called Mi-feng. The earliest written record of bees is the Chinese character Feng in ancient inscriptions on animal bones dating back 3000 years (Zhou Yau 1990). Later in the Zhou Dynasty (around 300 BCE), the Chinese character Mi, meaning honey, was recorded in the Book of Manner, Li Ji, as a dietary recommendation (1993).
… beeswax was harvested and made into candles (Mi-zhu) and given in offerings to the first Han Emperor (206-195 BCE).
Ancient Chinese Apiculture, Lau 2012

OTOH there’s maybe reason to doubt that paper[**]: it’s the same author/around the same date/same material (but more clearly organised IMO) as what I already cited at LLog. It illustrates [photo 3, which is some but not all of ‘cute’ photo 5 in that paper] “a series of characters suggested to be “feng” (note the stinger)” from oracle scripts, already pooh-poohed by Chris Button on the LLog thread, on grounds those characters are for locust. (But this version of the paper clearly acknowledges those other illustrations are locusts.) Counter-Chris note the illustrated characters clearly have something protruding from their rear parts, which the locust illustration Photo 2 does not. Locusts don’t sting.

If those oracle characters represent wasps/bees at 3,000 BP, we still have a huge gap not mentioning honey until Zhou dynasty. Too bucolic to merit recording in Imperial annals?

[**] I won’t test your patience with ‘Apiculture in China’ 2015 but note the specific date, note the Prof’s alleged name!
Lameen says

August 10, 2024 at 7:57 am

There is a big difference between gathering wild honey and systematic beekeeping. Evidence of familiarity with bees and with honey (which I would expect to be practically universal) is compatible with either; the transition to beekeeping, on the other hand, would be a great moment to borrow associated vocab. We might even expect to find cases where the old word for honey lingers on in reference to special wild honey, while the new one becomes the default for honey as a trade good.
Trond Engen says

August 10, 2024 at 10:34 am

Me: I must read the paper!

Done. I’ve even thought about it for a couple of hours while chopping wood in my backyard.

The paper is a list of six supposed wanderworts and an attempt to explain how each of them got from Indo-European and into different language families. Since each word got there in its own way, with its own gaps of unknowns to bridge, everything is essentially special pleading. It does substantiate the claim of wanderwort status, but adds little to the understanding of actual cultural transmission beyond what was already known from archaeology. That’s unfortunately how loanword studies must be, until patterns start to emerge from the special pleadings.

What I really would want to see is a treatment of the lexicon of Chinese/Sinitic/Sino-Tibetan metallurgy and animal husbandry. Chinese bronze metallurgy started around 2000 BCE. It’s widely accepted that it was transmitted from Inner Eurasia to the Upper Yellow River. Sheep and cattle breeding started in the Gansu-Qinghai region around 2000 BCE.

This means that archaeology alone tells about a major cultural change in the region of the middle/upper Yellow River. Archaeology also tells about cultural continuity with the preceding cultures of the Yellow River Neolithic, which should make us assume that it wasn’t all about massive population replacement, but about arriving groups settling among the native population (by any mechanism on a scale from slave import to foreign conquest) and spurring synthesis cultures and cultural exchange. It’s highly likely that the newcomers brought the technical vocabulary with them and transmitted much of it to the native populations.

So what languages would Central Eurasian cattle breeding metallurgists of 2000 BCE speak? Very likely Indo-European, but it’s just a tad too early for Indo-Iranian. The Afanasievo are known to be Indo-European by culture and genes. They lived as settlers in the Minusinsk Basin and as sheep and cattle breeders in the mountains and steppes to the south. They are known to be founding members of the bronze-producing Okunevo culture of the Minusinsk Basin and of the nomadic Chemurchek culture of the Dzungarian Basin and the Western Mongolian Plateau. They are also known to have disappeared from the eastern steppe when the Iranians expanded. Thus an Afanasievo hypothesis for a Chinese adstrate. That doesn’t make it Tocharian, but we also know that two named peoples (one or both) likely to be the immediate ancestors of the Tocharians show up in Chinese written history in the Hexi Corridor in the last millennium BCE, and that these can be identified with archaeological cultures, with reasonable cultural continuity in between.

Thus, there’s all reason to expect an early 2nd millennium BCE adstrate in Chinese, and there’s pretty good reason to expect that it was some sort of archaic Indo-European. There’s also good reason to expect later adstrates, maybe especially horse and iron vocabulary from Iranian — but maybe also from Turkic.

@AntC: Whether the “honey” word was part of one of the probable Indo-European adstrates is not of great importance, but it would be bad historical linguistics not to take a plausible Indo-European source into consideration.

Another take-out of the Bjørn paper is that it’s apparently mainstream to suggest that Okunevo was Proto-Uralic. That was a message I hadn’t received, but it’s compatible with substrate hypotheses (1) and (2) and the corresponding adstrate hypothesis (A1) as well as (A3).
Trond Engen says

August 10, 2024 at 3:42 pm

Me: Me: I must read the paper!

Misquoting myself. Seems trustworthy.

Me (further up): I haven’t actually read the Shimao paper yet. I had a brief look, decided I needed a better grip of the context, started reading about the cultures of the loess plateau, and realized the relevance for the Afanasievans.

That’s what I’ve been doing until now. So how does “the Shimao paper” fit in?

Hua Wang et al: Craft Specialization in the Highland Longshan Society: Perspective from the Bone Needle Workshop on the Central Mound at the Shimao Site, Shaanxi, China, Journal of Anthropological Archaeology 2024

Highlights

• Prehistoric economy at Shimao.
• Archaeology of craft specialization.
• The chaîne opératoire of bone needle production.
• Transition from Late Neolithic to Bronze Age economy in East Asia.

Abstract

The emergence of Shimao, a proto-urban center at the contact zone between agropastoral communities of the Loess Plateau and herders/hunter-gatherers of Mongolian Plateau, offers critical insights into the economic activities during the transition to the Bronze Age in continental East Asia. Unprecedented in scale in prehistoric China, the bone needle workshop at the central mound was a prelude to the specialized, industrial-scale bone production workshops seen in the Bronze Age cities of Zhengzhou, Anyang, and Zhouyuan during the second and early first millennium BCE. The bone needle production at Huangchengtai probably supplied a sophisticated craft industry for the production of garments using animal hides and textiles.

Shimao was a fortified proto-urban center at the northern edge of the loess plateau, bordering on the Mongolian Plateau. It was built around 2200 BCE and reached its peak around 2000-1900 BCE. It was a ritual center with an impressive temple mound and large scale human sacrifice. The top of the temple mound was also a specialized production center where bones (mostly from sheep) were turned into needles for garment production. The economy of the production is not understood. It was obviously under central command by the religious authorities, but there’s little that says that this actually was a commercial enterprise serving a larger region with needles and bone tools. The city must also have been a center for production of leather and leather garments, but no tanneries and sewing workshops have been found yet.

That doesn’t mean it didn’t happen. With the formation of the first Chinese states (semi-mythical dynasties), needle production was centralized in large workshops under royal authority. This is a direct continuation of an industry and a mode of organization that started at Shimao, at a main interaction point between nomads and settled farmers.

What does this have to tell about the Tocharians? It tells that the technological and cultural influence was much more than just copying fashions from neighbours. In the Upper Longshan (Chinese Neolithic), very early in the 2nd millennium BCE, products stemming from the steppe were deeply integrated into the economy and central elements of ritual life — just as the Chinese state system was born.
David Marjanović says

August 10, 2024 at 7:48 pm

Because Chinese ‘Feng’ 蜂 covers both wasps and bees. Wasps are yellow-feng 黃蜂, bees are honey-feng 蜜蜂. If it took the Tocharians to introduce them to honey, what was the term for bees before that?

Most likely just feng.

I’m still incredulous — given that Chinese eat everything and anything

The periodic famines in southern China are a much, much more recent phenomenon. I’m afraid your incredulity is entirely anachronistic.

More later, I hope.
Ryan says

August 11, 2024 at 12:05 am

Trond,

What exactly do they mean by highly diagnostic loanwords associated with the Bronze Age? Diagnostic of what? And how do any of those characteristics apply to ‘seven’?

I’m also baffled by this passage:

> The phonology and extensive agglutinative nominal system of Tocharian have a deviant typology and have played a pivotal role in charting its development (Peyrot, Reference Peyrot, Kloekhorst and Pronk2019b; Warries, Reference Warries2019; Bednarczuk, Reference Bednarczuk2015 ; Krause, Reference Krause1951); a similar change is reflected in the culture that appears as Buddhist, although the conversion can only be few centuries old .

This seems like apples and Volkswagens. What am I missing?
Ryan says

August 11, 2024 at 1:10 am

Starting to think “a similar change” means nothing more than “change that happened somewhere in the same multi-millennium period”.
Trond Engen says

August 11, 2024 at 6:12 am

Ryan: What exactly do they mean by highly diagnostic loanwords associated with the Bronze Age? Diagnostic of what? And how do any of those characteristics apply to ‘seven’?

I read that as loanwords denoting objects or concepts that spread with the culture and economy of the Bronze Age. It’s suggested that one of those concepts was a standardized system for counting high numbers as we know it today. The word for “seven” might have spread with the system, also into Indo-European from “a source akin to” Semitic. I don’t know the reason, but a simple suggestion is that the former counting base was five or six, and when the Indo-Europeans started trading with people that counted longer, they adopted their simple word for one of the most used numbers above. Same story when Indo-Europeans traded with other peoples further east.

By sheer coincidence, this paper is just out:

Ialongo & Lago: Consumption patterns in prehistoric Europe are consistent with modern economic behaviour, Nature Human Behaviour 2024

Abstract

Have humans always sold and purchased things? This seemingly trivial question exposes one of the most conspicuous blind spots in our understanding of cultural evolution: the emergence of what we perceive today as ‘modern’ economic behaviour. Here we test the hypothesis that consumption patterns in prehistoric Europe (around 2300–800 bce) can be explained by standard economic theory, predicting that everyday expenses are log-normally distributed and correlated to supply, demand and income. On the basis of a large database of metal objects spanning northern and southern Europe (n = 23,711), we identify metal fragments as money, address them as proxies of consumption and observe that, starting around 1500 bce, their mass values become log-normally distributed. We simulate two alternative scenarios and show that: (1) random behaviour cannot produce the distributions observed in the archaeological data and (2) modern economic behaviour provides the best-fitting model for prehistoric consumption.

The full article is unfortunately paywalled, but here’s the article in Videnskab.dk that made me aware of it, including rave comments by Kristian Kristiansen. With that caveat, the main finds are:

– Metal fragments — essentially all metal fragments in all Europe — were cut up in pieces on a standard base of 10g.

– The size distribution of the fragments follows a log-normal curve that should be expected when the pieces are money, and the size of the deposits shows a distribution that should be expected when metal money is used by all layers of society.

I’m starting to think that the explosive trans-continental spread of metallurgy and mining is a direct result of this system, in a feed-back loop where increased access to metals made the system all-encompassing by the high point of the Bronze Age. Maybe the proto-Indo-Europeans sat in a lucky spot at the right time, just as the Spaniards did with New World gold in the 16th C, or maybe they sat close by and could (strive to) monopolize the new global trade, like the Dutch or the British.

Ryan (again): apples to Volkswagens

Yes. I read right past it, thinking it meant that Tocharian society developed independently of the Andronovo/Scythian sphere, but now I’m not sure. When I write something like that myself, it’s usually because I’ve edited it to pieces.
Trond Engen says

August 11, 2024 at 6:39 am

(Too late for the edit window)

I forgot to say that Ialongo & Lago have worked on this for some time. What is new now is the thorough documentation by statistics.

The 10g unit is an adaptation of the Mesopotamian shekel. We have touched on that before in other contexts.

A standardized money system would promote a standardized counting system.
Stu Clayton says

August 11, 2024 at 7:33 am

I think “promote” is an inappropriate word to describe the relationship, because it posits a temporal or conceptual order. It seems to me that the one arises simultaneously with the other – as “standardization” becomes a thing.

A standardized money system is merely one kind of standardized counting system. It’s a good thing that the British money counting system with pounds/shillings etc didn’t “promote” counting everything that way. Otherwise hostesses would have a hard time figuring out if they had laid enough plates for the expected guests.
Trond Engen says

August 11, 2024 at 8:44 am

Everything is a differential equation. Everything that lasts is the result of a feedback loop.
Stu Clayton says

August 11, 2024 at 10:31 am

Rocks too ?
Trond Engen says

August 11, 2024 at 11:41 am

In a metal money system, the regions that end up with all the metal are those that produce a surplus of other goods and don’t need (or manage) to use it to finance the import of other useful goods. In early modern world trade, the gold from America ended up in South and East Asia, where it made temples and palaces shine without the average South and East Asian being much better off for it. In the Bronze Age, the metal seems to have ended up in peripheral places like Scandinavia and Sichuan.
Y says

August 11, 2024 at 2:03 pm

It’s very tempting to link numeracy and trade. However, there are large areas, like Australia and Amazonia, where languages have no high numerals, and where trade surely existed as much as anywhere. Is there an in-depth study of this issue somewhere?
Y says

August 11, 2024 at 2:11 pm

A prepublication version of Ialongo and Lago’s paper is here. Ialongo has a bunch of other papers in that vein.
David Eddyshaw says

August 11, 2024 at 2:25 pm

Of course, you don’t necessarily need number words to count reliably.
Australian languages famously have few number words, but I gather that there are often complex finger-counting schemes available.

The idea that you have to have words to think coherently is one of those hardy linguistic delusions.
Trond Engen says

August 11, 2024 at 2:30 pm

I haven’t seen any studies on this, but then I haven’t really thought about the Bronze Age as the Money Age before yesterday night.

What I have seen – and accepted – is that numeral systems develop when people need them. In the Eurasian Bronze Age, the linguistic innovation could be the standardized counting system rather than just counting, and it could be driven by the proto-monetary system rather than trade in itself. But now that I think more about it, it could even be managing credits and (proto-)bookkeeping. I really wonder how they did that without quipu.
Trond Engen says

August 11, 2024 at 2:55 pm

Oh, I don’t expect Ialongo & Lago to have anything to say on the hypothetical spread of higher numerals in the Bronze Age. Their work is archaeological and on metal fragments as money. It’s me who should be blamed for putting the two together.

@Y: Thanks for the link. I’m going to follow up on it, but maybe not today.
Y says

August 11, 2024 at 3:33 pm

I got to think about how in Southern California, while everyone had numeral words, they were a lot more prone to change and borrowing than recent European ones, and I wonder if that had any connection with the spread of the Channel Island shell bead trade.

@DE: Not all trade is based on counting. You can trade an armful of this for a handful of that, rather than five of this for a dozen of that. Anyway, somewhere there must be surveys of trade systems from many different angles. Anthropologists (especially of the older schools) love that kind of thing.
Dmitry Pruss says

August 11, 2024 at 3:39 pm

Re: Australia and counting without words: If trade is just an exchange of two or three commodities then it’s just as easy to trade using barter equivalency of standardized units of measurement, like a bushel per crate, a pile for a bunch. It’s only when the variety of the trade goods becomes too big and their value is understood to be numeric (like smaller and larger metal objects with values linked to weight) that the barrel-per-crate rules grow too complex, and counting and money become practical.
Trond Engen says

August 11, 2024 at 4:11 pm

I imagine that natural common currency systems like shell beads and cowrie shells and squirrel hides are less likely to induce standardization of the counting base for higher numerals, unless they too are managed in standard bundles of different sizes.
David Eddyshaw says

August 11, 2024 at 4:46 pm

in Southern California, while everyone had numeral words, they were a lot more prone to change and borrowing than recent European ones

Proto-Oti-Volta certainly had numbers up to the thousands, but once you start wider comparison things get a lot more limited. With “Central Gur” you can probably get up to “ten”, with a bit of imagination, but that’s about it.

As far as proto-Volta-Congo goes, “two, three, four, five” seems to be it. (“One” varies a lot, but that’s not too surprising.)

There are a lot of base-five systems in West Africa, i.e. ones with “five, five-and-one, five-and-two, five-and-three, five-and-four, ten.” On the other hand “three” and “four” look like cognates all over, even in languages which show precious little other evidence of being related. But then again, *naa “four” is just the kind of shape you’d expect to prove pretty durable over the millennia.

Yélî Dnye has borrowed all its number words from Austronesian (the more remarkable, as it seems to have borrowed very little else.) The Rossel Islanders are apparently famous in the anthropological literature for their highly complex traditional money system. Something to do in the evenings when you’re not working on how to make your language even more complicated, I suppose. (You had to make your own entertainment before television came along.)
jack morava says

August 11, 2024 at 8:08 pm

In the middle of the night I sometimes wonder who invented multiplication, and FWIW suppose it might have been the Indus Valley folks and their friends in Mehrgarh who provided them with bricks.
Stu Clayton says

August 11, 2024 at 8:38 pm

I believe multiplication was invented in the bible, long before bricks: “Be fruitful, and multiply, and replenish the earth, and subdue it”.
Trond Engen says

August 12, 2024 at 5:28 pm

I’ve read the Ialongo & Lago paper. I didn’t have the willpower to try to understand the statistics, so for whatever it’s worth: Things are less clear than they seemed, but there are still things to learn.

The main thing to learn is actually that there’s a clear and apparently sudden change in the composition of metal deposits after the introduction of the scale weight (Italy c. 2000 BCE, Central Europe c. 1500 BCE, “Atlantic sphere” c. 1200 BCE). Before the scale weight, there’s mostly complete objects and only few fragments, and there’s no system in the sizes of the fragments. After its introduction, there’s mostly fragments, with most weights distributed narrowly around 10g, and blips in the curve for multiples of 10g. Complete objects show no systematic distribution, before or after the scale weight, so if they’re standardized it’s on something other than weight. Scale weights similarly are narrowly distributed around 10g (except for those that are made for significantly larger weights).

What do I think about this?

My bright idea that metallurgy could have spread because of the demand for metals for transaction doesn’t look so bright anymore. The scale weight was invented in Mesopotamia in the late 3rd millennium and spread slowly in Europe from 2000 BCE, so it’s well after the onset of the Bronze Age. I don’t expect it to have reached Inner Eurasia much earlier.

I find it convincing that bronze objects were cut down to standard pieces that were used for buying and selling goods. I don’t think the fragments primarily were counted as the main check in the transaction. That came with the officially stamped standard tokens called mint or coins*. The fragments were weighed, but the quasi-standard fragments of 10g made it easier to get to roughly the right weight before adding smaller chips. There were probably strategies to do transactions without a scale weight, and they would also be easier starting with standard fragments.

The fact that the composition of the hoards changed with the introduction of the scale weight doesn’t mean that the reason for owning the metal objects suddenly changed. I assume they were kept for their usefulness in trade also before the scale weight, but the operations in trade were different, since the concept of a standard weight didn’t exist.

So what about the idea that a standard base for counting high numbers spread with the concept of money? I don’t think it matters if the transaction units were still understood as weight units and the pieces of metal as a handy way to achieve that. Counting would still be part of the process, and maybe much more so in those transactions that were done without a weight and where both parties should agree on a value. But the introduction of the standard weight unit matters for the timing.

* I actually remember this kind of tokens being used in my youth.
Lars Mathiesen (he/him/his) says

August 13, 2024 at 8:29 am

To a mathematician, a number word is secondary to the element of the natural numbers that it names.

I once saw the idea that tally sticks came before larger number words; on the other hand, many humans are able to “see” numbers up to six or nine without explicit counting, like when pouring out seven tablets for my weekly medicine sorting exercise, and I wouldn’t be surprised if that kind of cardinality had names before people had tens of goats to keep track of. (Maybe not tablets, nuts or whatever you wanted to ration).
jack morava says

August 13, 2024 at 9:30 am

The Egyptians apparently had an elaborate system of units for grain distribution bureaucracy, cf eg

https://scripturecentral.org/archive/media/chart/egyptian-hieroglyphs-grain-measurement ,

and I read somewhere that some Central American cultures (Aztecs?) used the cacao bean as a unit of currency; their astronomers – or maybe the Mayans – certainly knew how to count…
Trond Engen says

August 13, 2024 at 12:30 pm

As I said above, I don’t think the claim is that the number seven spread with the concept of counting but that the word spread with a standard system for counting in the Bronze Age.

America might actually be a nice comparison. How similar are the number systems of the major civilizations or cultural areas?
Hans says

August 13, 2024 at 8:34 pm

Maybe I misunderstood you, but Bronze Age is too late for the assumed exchange of the “7” word between PIE, Semitic, or wherever the word originated. And even for higher numbers – 100 can be safely reconstructed for PIE, and perhaps even 1000 – at least, *g’heslo- is a Greek – Indo-Iranian – Italic isogloss.
Ryan says

August 13, 2024 at 10:59 pm

Trond will answer for himself, but I took him to mean the possible spread across Eurasia that is the subject of the paper cited a few days ago.

It’s a pretty interesting question when and where Semitic might have given PIE the 7 word. It’s certainly enticing to think there’s a relationship. Hittite for seven is sipta. Assuming that’s the expected form for a word from the root that gave us Sanskrit and Greek hepta, rather than a simultaneous loan into Hittite and its sister clade, then the borrowing is awfully early. Could an ancestor or relative of Akkadian have been up in the Caucasus in the early 4th millennium BCE?

Kartvelian almost matches (the promising shvidi in Georgian but shk’viti in Megrelian and shkvit in Laz). Could ‘k’ be epenthetic? Or a regular development from a Semitic sibilant? Or is there a k in a reconstructed Semitic seven word? (No.)

Northeast and Northwest Caucasian seem to have a different etymon as does Sumerian (imin).

You can imagine some sort of trade route, but “imagine” seems all too accurate for the thought experiment required.

For that matter, Akkadian sabe doesn’t seem to account for the d/t sound shared by PIE and Kartvelian.
Hans says

August 13, 2024 at 11:40 pm

Another question is whether Semitic is the source or just another recipient of a wanderwort.
Y says

August 14, 2024 at 12:13 am

Are Hurrian šinta and šitta ‘7’ related? IDK.

See discussion in Björn, Foreign elements in the Proto-Indo-European vocabulary (here), p. 124. He thinks it’s Semitic, following Blažek.
David Eddyshaw says

August 14, 2024 at 4:46 am

Akkadian sabe doesn’t seem to account for the d/t sound shared by PIE and Kartvelian

Personally, I think all this is just the awesome power of sheer coincidence*, but if you were determined to see an origin in Semitic it wouldn’t be surprising if it went back to the form used with masculine nouns, which has the feminine ending -t. Because Semitic.

Sumerian “seven” is formed from “five” plus “two”: the numbers work like the common West African setup that I mentioned. No doubt it all goes back to the Fulɓe conquest of Mesopotamia.

* Björn’s argument is all “probably” and “doubtless.” I note that he cites a source which attributes the Chinese form to the same origin (via Tocharian … someone tell Victor Mair. Though Chinese “ten” is obviously a loan from Fulfulde.)
AntC says

August 14, 2024 at 6:29 am

the awesome power of sheer coincidence*,

* Björn’s argument is all “probably” and “doubtless.”

Thank you for saying that out loud. Some of Bjōrn’s ‘substantiations’ appeal to even weaker evidence. (Eight is a dual for fist/closed hand because 4 fingers, for example.)

I get it that early IE is a lot of speculation. Is this why Mair can call *m(r)it ‘gold standard’?
languagehat says

August 14, 2024 at 8:37 am

Personally, I think all this is just the awesome power of sheer coincidence

Same here.
Ryan says

August 14, 2024 at 10:19 am

It’s a little hard to understand why one or two number words would be picked up and not others, particularly other numbers higher than the sequence that is borrowed. The idea that higher numbers were rarely used in a pre-trade society might be plausible, but why would IE borrow 7 and hold onto a pre-existing 8 etymon? That seems to prove that higher numbers were already useful and well known. You could maybe understand it if 7 was important in some way to the trade system or cognate with a measurement. Were there cultural meanings to 7 in Semitic cultures other than/before the Judaic creation story?

I’m aware that there are examples in other language families of what is explained as this kind of selective numeric borrowing. Is there ever a more detailed explanation of how and why that happens?

I am sympathetic to the awesome power of sheer coincidence. But that in turn would be more enticing if the etymons for the word 7 didn’t show such similarity across four language families — Semitic, IE, Kartvelian and Hurro-Urartian.

One is anecdote, two coincidence, three a pattern… But four? We’re nearing “God put the fossils there 6,000 years ago because He wanted to make it look like evolution” levels of explaining the evidence away. Maybe, but at four, you can understand why the pagans think there might be science rather than coincidence behind it.
David Eddyshaw says

August 14, 2024 at 10:45 am

Eight is a dual for fist/closed hand because 4 fingers, for example

Conceivably, proto-Oti-Volta *nii- “eight” actually is a sort of plural of *naa- “four.” The number words often turn up with what look like fossilised noun-class plural suffixes, not always present in every language, and three noun-class pairs had a plural suffix *-i which in fact does cause umlaut of preceding root /a/ to /i/ in some branches of the family (e.g. Kusaal naaf “cow”, plural niigi.)

I can’t think of any way of confirming or disproving this suggestion, though. I reckon it’s fated to remain forever on the list of “what an interesting thought!” (Or not, as you may think.)

Kusaal pisi “twenty”* looks at first sight like a plural-standing-for-two-exactly of piiga “ten”, but, disappointingly, it seems actually just to be from pis “tens” and yi “two” with a bit of smushing** (“thirty” is pis tan’ etc.)

Some Oti-Volta languages do have locally hand-crafted words for some of the numbers six to nine, with still-discernable origins in things like “one less than ten” and so forth, but nevertheless the whole series looks reconstructable to proto-Oti-Volta. But then, it’s not always possible to be sure one is dealing with actual cognates rather than old loans. Some of them might be proto-Algonquian-whisky words after all. (And the word for “money” seems to be really a loan from Western Oti-Volta in all the other branches, on morphological grounds, even though you can reconstruct a POV form for it without breaking any Neogrammarian rules. If counting does go with money historically, that could be a hint.)

* The identical Mooré word pisi “twenty” also means “hundred-franc piece”, because French West Africa.

** Technical phonological term. Goes back to Ladefoged, I believe.
Xerîb says

August 14, 2024 at 3:41 pm

Eight is a dual for fist/closed hand because 4 fingers, for example.

On ‘eight’ in Indo-European… Just for the curious among LH readers, an Avestan passage (Videvdat 13.30) in which the measure ašti- ‘breadth of four fingers extended’ (‘a palm’?) is used. Ahura Mazda is asked how Zoroastrians should deal with a dog that bites without barking or growling:

āat mraot̰ ahurō mazdā̊: auua hē baraiiən tāštəm dāuru upa tąm manaoϑrīm stamanəm hē aδāt̰ niiāzaiiən ašti.masō xraožduuahe biš aētauuatō varəduuahe

Thus spoke Ahura Mazdā: they shall place a cut piece of wood about its neck, they shall close shut its mouth with this, an ašti in length in case of a hard piece, twice that in length in case of a soft piece.

I am not sure how this muzzle would look exactly.
AntC says

August 14, 2024 at 4:23 pm

On ‘eight’ in Indo-European…

The cite there is to this same Bjørn. (Gets himself about a bit. Changing the diacritics on his ø don’t fool me.)

seems to appear
appears to have been borrowed

And _is_ it related to four?

*kʷetwor- has too many consonants to be a true primitive morpheme. Tentatively … of the shape …
[what does double-** signify in an unattested form?]

What they have in common is kʷ. Any more alike than tʷo cp thʷee?
Trond Engen says

August 14, 2024 at 5:11 pm

@Hans: Maybe I misunderstood you, but Bronze Age is too late for the assumed exchange of the “7” word between PIE, Semitic, or wherever the word originated.

Yes, sorry. As Ryan says, I meant the supposed transmission from IE to languages in Eastern Eurasia (of the “seven” word and/or the standard counting system).

David E.: Personally, I think all this is just the awesome power of sheer coincidence*

* Björn’s argument is all “probably” and “doubtless.”

For the record: I agree. Probably.

AntC: Some of Bjōrn’s ‘substantiations’ appeal to even weaker evidence. (Eight is a dual for fist/closed hand because 4 fingers, for example.)

“Substantiate” was my word. By that I meant that he took words that until now have been floating around looking enough alike that actual card-and-no-tin-foil-hat-carrying linguists have suggested that they are borrowings or wanderworts, and he tried to explain how they could actually be related. That adds some substance to the claims — something that others might work with, or at least will be able to evaluate. My own layman’s evaluation is that it’s all special pleading and adds little about the contact situation beyond what’s already known from archaeology. Still, I do think it’s worthwhile to start pulling the threads.

“Eight” as a dual of four is not his invention. The etymology of numbers is mostly internal reconstruction. That’s a risky game under the best of conditions, and much more so with number words, since they’re inadvertently prone to be adjusted to the rhythmic and phonetic environment in the “counting chant”. But again, it’s a worthwhile exercise. Sometimes things turn up that gain wide acceptance (like “hundred” <- "(ten) of tens").

Ryan: It’s a little hard to understand why one or two number words would be picked up and not others, particularly other numbers higher than the sequence that is borrowed. The idea that higher numbers were rarely used in a pre-trade society might be plausible, but why would IE borrow 7 and hold onto a pre-existing 8 etymon?

One is anecdote, two coincidence, three a pattern… But four? We’re nearing “God put the fossils there 6,000 years ago because He wanted to make it look like evolution” levels of explaining the evidence away. Maybe, but at four, you can understand why the pagans think there might be science rather than coincidence behind it.

“How could just that number possibly have spread” vs. “How could that areal similarity arise by chance”. It’s essentially two improbabilities stacked against each other. It may be coincidence, but not the easy kind of coincidence. It needs being addressed, and readdressed, under new evidence and new methods.

(That said, some of the sevens could just as well be removed from the list of suspects for now. Bjørn’s Chinese was far from convincing.)
Ryan says

August 14, 2024 at 6:17 pm

Me: One is anecdote

Trond: “How could just that number possibly have spread” vs. “How could that areal similarity arise by chance”. It’s essentially two improbabilities stacked against each other.

Agreed. That’s why I put those in the same comment, but I think the sharper rhetoric of “God wanted to make it look that way” made my position seem less equivocal.

But things might be more interesting. Some reconstructions of proto-Semitic and proto-IE six are also pretty similar. And Wiktionary’s Hurro-Urartian word list gives šeše for six. It’s harder to see how Kartvelian six would fit.* But three languages with pretty similar 6 and 7 lexemes is interesting, and the Hurrians might be in the right place to connect early Semites** with the south Caucasus part of the genetic cline that was recently reported to be behind the early IE gene pool. If that’s a trade route… I can see how 1 through 5 and “two fours” might be more resilient lexemes than six and seven among hunter gatherers moving into a higher level economy.

— — — — —
* — The mere presence of s/š in Kartvelian six words isn’t meaningful. But someone with knowledge of the languages involved might have more to say – there is a pattern that suggests prefixing and an epenthetic k/g on a root that is something like si / tsi /ši. So maybe? I’m also using a Reddit version of numbers in four Kartvelian languages. I know less than nothing about the family myself and have no idea whether Reddit is accurate here.
** — Well, not that early unless we believe the original Semitic expansion came from Mesopotamia, but anyway.
Ryan says

August 14, 2024 at 6:25 pm

One might argue that the words for six are so short that coincidence is uninteresting.

But consider instead that the six/seven sequence across these 3 language families (maybe 4) has a string of five phonemes that one might reasonably say were linked. Could still be coincidence, but it’s much more remarkable, especially when considered in light of the South Caucasian cline and IE.
Ryan says

August 15, 2024 at 1:43 am

Here is a wildly interesting article from the excavator of the Hurrian capital Urkesh. He strings together what he concedes is a “fragile chain of inference”* to make the argument that the foundation of Urkesh was in the mid-fourth millennium, that it was already recognizably Hurrian, and that it controlled a hinterland in the highlands that was also the Hurrian homeland. He connects them to the Early Transcaucasian culture whose burnished ware spread from north of the Caucasus through Anatolia to Syria and Palestine.

This at minimum puts Hurrians in the right place at the right time with the right scale of development to transmit a few trade-related words from the Amorites in the west or the Akkadians to the PIE people in the northern foothills of the Caucasus.

—-
* Fragile, he confesses, but not that fragile. He is convincing on points that Urkesh is Hurrian by the early mid 3rd mill. That the city was established a thousand years earlier. And that there are significant ritual continuities across that period that seem connected to Hurrian mythology and identity.
Jerry Friedman says

August 15, 2024 at 9:41 am

Does anyone ever try to estimate the probability of such resemblances arising by chance? I’m not saying it would be easy.

There was once a question at Language Log about whether there was any connection between Yiddish goy, Romany gajo, and Japanese gaijin. Victor Mair added Albanian huaj (Arbëresh goj) ‘foreigner’ and Swahili mgeni ‘foreigner’ (ugeni ‘strangeness, foreignness’). Chris Button added Cantonese gweilo. I’d add Latin gentes ‘foreign nations’, later ‘foreigners’ (Lewis’s Latin Dictionary for Schools), and related words. And I’d add the “gn” in “foreign”, obviously the second morpheme of a compound. Some of those are rather a stretch, but I think “gajo”, “gaijin”, “goj”, “-geni”, “gentes”, and “foreign” make a nice set.

Of course I’m distorting things—in particular, gentes apparently has many other meanings—but if you imagine that our knowledge of those languages consisted of just the right fragments, and that the languages were spoken closer to each other, I think you’d wonder whether it was just coincidence.

Edit: The “g” in “foreign” comes out of nowhere, maybe through the influence of “reign” and “sovereign”, according to etymonline.
David Eddyshaw says

August 15, 2024 at 10:20 am

Also Nateni cãnn “stranger.”

Actually, it would be nice if proto-Oti-Volta *cà̰n-wà “stranger” were actually cognate with the proto-Bantu *mʊ̀-gènì behind Swahili mgeni, but I can see no way of making it so. Apparently there is a *mʊ̀-jènyì variant in NW Bantu, but I don’t think it helps much.

Does anyone ever try to estimate the probability of such resemblances arising by chance? I’m not saying it would be easy.

Yes:

http://www.zompist.com/chance.htm
https://languagelog.ldc.upenn.edu/myl/Ringe1992.pdf
languagehat says

August 15, 2024 at 10:50 am

That zompist page is (of course) excellent:

• We’re skeptical only of numbers we don’t like. You don’t have to be trying to defraud anyone to fool yourself. But it’s hard not to avoid our own bias in favor of numbers that go the way we want them to. If they don’t, we’re displeased, and skeptically examine our calculations, and seize on new assumptions that create better numbers. If the numbers do behave, we look over our results and our assumptions more carelessly. (Feynman gives a nice example: an incorrect estimate of a physical constant, which took decades to be corrected, each correction edging toward the correct value. Since the original estimate was assumed to be correct, scientists were more than usually cautious about new estimates that deviated from it, and easily convinced themselves that they’d made a mistake somewhere. Estimates near the accepted value were given more slack.)
• We don’t double-check our results against reality. Few who make these calculations have looked for random matches in languages they’re sure are unrelated. In other words, they have no control case to check their results against. It’s no wonder they never notice how common random matches really are.

(I have looked for random matches, and found plenty of them; see my lists of Chinese/Quechua and Chinese/English pseudo-cognates.)
• We haven’t worked the numbers. Even trained linguists, though they know that random matches will occur, generally can’t say how many.
• Statistical or linguistic ignorance. The calculations are often ruined by elementary statistical errors, or by wildly unrealistic linguistic assumptions, or by disregard of the enormous phonetic and semantic leeway comparers allow themselves.
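To make the "work the numbers" point concrete, here is a minimal sketch of one way to put a rough figure on coincidence. It is not the method used on the zompist page or in Ringe's paper; it is a plain binomial tail. Everything in it is an assumption for illustration: the function name `prob_at_least`, the count of 10 families, the threshold of 6 matches, and above all the per-family chance `p` of an accidental "close enough" form, which depends entirely on how much phonetic and semantic leeway the comparer allows.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """Binomial tail: chance that at least k of n independent trials succeed."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical inputs, not measured values: 10 families compared, and an
# assumed per-family chance p of a "close enough" form turning up by accident.
for p in (0.05, 0.1, 0.2, 0.3):
    print(f"p={p}: P(at least 6 of 10 match by chance) = {prob_at_least(6, 10, p):.6f}")
```

The result swings by orders of magnitude as `p` moves, which is the point of the list above: the answer is only as good as the estimate of `p`. The sketch also treats families as independent, which neighbouring languages in contact are not.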
Jerry Friedman says

August 15, 2024 at 11:33 am

Thanks, David Eddyshaw and languagehat. I think I probably saw a reference to Ringe’s paper at Language Log but forgot about it.

I’m not entitled to an opinion on “seven” or “honey”, but to the limited extent that I’m entitled to one on evidence of borrowing or relations between languages, I strongly agree that you need to consider, in some statistical way, the probability of coincidence. Yes, easy for me to say.

Ryan: Were there cultural meanings to 7 in Semitic cultures other than/before the Judaic creation story?

The Hebrew word, sheva` in modern Israeli Hebrew, also means ‘oath’, and the two meanings are said to be connected, as at https://www.balashon.com/2006/05/sheva.html . I don’t know whether there’s anything like that in other Semitic languages, or whether some people believe the meanings aren’t really related.
Rodger C says

August 15, 2024 at 1:04 pm

I just searched “Basque” in this post and found only one mention, so: sei ‘six’, zazpi ‘seven’.
Rodger Cunningham says

August 15, 2024 at 1:06 pm

I just searched “Basque” in this post and found only one irrelevant mention, so: sei ‘six’, zazpi ‘seven’.
languagehat says

August 15, 2024 at 1:33 pm

I just searched “Basque” in this post and found three mentions, so soon we’ll have six or seven.
Trond Engen says

August 15, 2024 at 2:35 pm

Berber six and seven seem to adhere to the pattern, but I don’t know enough about Berber to know which of the several forms I find to present. I’m waiting for Lameen.
Y says

August 15, 2024 at 2:41 pm

The Wiktionary entry for Proto-Kartvelian *šwid- ‘seven’ is exemplary in that it is fully referenced, with links to scans of all of the references. The summary of the etymology is thin, though.
Trond Engen says

August 15, 2024 at 2:47 pm

And of course Egyptian (per Wikipedia):

6: sjsw or jsw (?) (masc.) / sjst or jst (?) (fem.)

7: sfḫw (masc.) / sfḫt (fem.)

It’s interesting if the numbers can (and especially if they can’t) be reconstructed for PAA.
Ryan says

August 15, 2024 at 2:52 pm

>Berber six and seven seem to adhere

Of course, so do Coptic soou and sasfe. Berber, Coptic and Semitic may share descent from an Afro Asiatic root rather than showing a trade route wanderwort.

[Edited – my Coptic source was Omniglot and I didn’t pause to think about era, so it must be modern, and Trond’s Egyptian etymons are more relevant.]

I’d have guessed if there was any Basque relationship, it was borrowed during Roman times. (And I’m still in the IF camp on all of this.) But it’s still startling to see the Coptic and Basque together.

And truly startling just how many sibilants there are in six and seven roots around the Mediterranean/Middle East/Black Sea, in different language families. We’re at IE, Afro-Asiatic, Hurro-Urartian, Kartvelian, Basque.

I mean whatever. It still may be coincidence. But it is startling.
Trond Engen says

August 15, 2024 at 3:11 pm

And then Etruscan. I raise my hand for the duodecimal interpretation.
Ryan says

August 15, 2024 at 3:14 pm

I’m not gonna mention that the proto-Semitic 3 is a t and a liquid. That way madness lies.

>I raise my hand…

I raise both my 6-fingered hands.
Trond Engen says

August 15, 2024 at 3:17 pm

But wherever we go, Václav Blažek was there before us and left his name on Wikipedia’s list of references.
Jerry Friedman says

August 15, 2024 at 4:03 pm

I mean whatever. It still may be coincidence. But it is startling.

OK, it’s starting to startle me.

On the other hand, with however many digits, I like the way the Etruscan word for 1 is cognate to our word for 2.
Ryan says

August 15, 2024 at 5:24 pm

>On the other hand, with however many digits, I like the way the Etruscan word for 1 is cognate to our word for 2.

Not sure of your level of sarcasm here, or whether you just mean you like how it happened that the Etruscan 1 word looks like an IE 2 word. Cognate is a technical term implying a real relationship, which seems more than unlikely. Despite the phrasing in that section of the wiki Trond linked to, I’m not even sure those researchers believe the words are cognate. I can’t imagine someone postulating that seriously without a really solid chain of evidence. Not that much is even known at all about Etruscan, let alone Etruscan etymology.

The whole IE-derived Etruscan numerals section of the wiki looked less like crackpottery than hoax. For the Hattery’s amusement:

Conversely, other scholars, including F. Adrados, A. Carnoy, M. Durante, V. Georgiev, A. Morandi and M. Pittau, have posited a “perfect fit” between the ten Etruscan numerals and words in various Indo-European languages (not always numerical or with any apparent connection), such as θu ‘one’ and Sanskrit tvad ‘thou’, zal ‘two’ and German zwei ‘two’, ci ‘three’ and Iranian sih ‘three’ (from proto-Indo-European *tréyes, which is not a match to Etruscan [ki]), huθ ‘four’ and Latin quattuor ‘four’, etc.[10][11][12]

I love the dry wiki editor who added “not always numerical or with any apparent connection.”
Xerîb says

August 15, 2024 at 6:59 pm

We’re at IE, Afro-Asiatic, Hurro-Urartian, Kartvelian, Basque.

Lakota šákpe and šakówiŋ. 😄
David Eddyshaw says

August 15, 2024 at 7:05 pm

Let’s face it, “six” and “seven” are obviously phonaesthetic words.

Kusaal ayɔpɔi “seven” has a /p/ in it. QED.
[And the -y- actually comes from *d (long story.) And as we recently established in another thread, the change *s > *d is Totally a Thing.]
Jerry Friedman says

August 15, 2024 at 11:56 pm

Not sure of your level of sarcasm here, or whether you just mean you like how it happened that the Etruscan 1 word looks like an IE 2 word.

I was seriously startled, but I just like the way the 1-2 thing happened, and I suppose the temptation for someone on the lunatic fringe to think the words were really related. I hadn’t read the theory mentioned in the Wikiparticle that the Etruscan for 1 is related to “thou”, but I’m sure I could have come up with some sarcasm.
Ryan says

August 16, 2024 at 12:16 am

> Lakota šákpe and šakówiŋ

In contrast to their Hocaąk cousins. Clearly only the Lakota established trading contacts with the Phoenicians.
Lameen says

August 16, 2024 at 4:20 am

Berber six and seven seem to adhere to the pattern, but I don’t know enough about Berber to know which of the several forms I find to present. I’m waiting for Lameen.

Most Berber varieties use Arabic loanwords, but sḍis and sa are representative for those that still use reflexes of the proto-Berber forms. They are a little too similar to proto-Semitic, and have been suggested as possible very early loanwords.
David Eddyshaw says

August 16, 2024 at 5:53 pm

Arabic surra سرة “navel” must surely be a loan from Waama surufa (plural surusu) “navel.”
Ryan says

August 16, 2024 at 6:52 pm

David E.,

I’m curious whether you’d offer any criticisms of this paper on “Numeral Complexity in Hunter Gatherer Languages”.

The paper looked at HG languages in Australia, Africa, South America and the California Great Basin region. They found that 92%, 41%, 61% and 0% respectively have number terms ending at 5.

Um, I thought to ask you because of your interest and expertise in African languages, but maybe all speakers of Oti-Volta languages are agriculturalists or herders. Well, today they’re more likely doctors, truck drivers and short order cooks, but anyway. Maybe you’re still interested.

Is there someone here who has enough familiarity with HG languages to comment? Or who is interested enough to look at the paper and knowledgeable enough to find its flaws?

It seems to offer support for the idea that a wanderwort might have filled a gap, supporting the idea that the “continental scale consonance” in the words for six and seven in much of Eurasia and North Africa might be more than coincidence.

Many of the families with the putative wanderworts provably, and others likely, were in contact with each other at the moment that as hunter gatherer populations, they became connected to the economic zone expanding out of Mesopotamia and Anatolia.

By my count, from the pillars of Hercules to the Caspian, the language families that don’t fit the paradigm are
– Sumerian, NW and NE Caucasian;
Those that do fit are
– Afro-Asiatic, IE, Kartvelian, Etruscan, Hurro-Urartian and Basque
And one offers a match on seven but not six:
– Uralic

Have I missed any ancient language families? Turkic moved in. Dravidian does not fit, but if the theorized mechanism is a new need for higher numbers as the scale of economic activity increases, Dravidian may have been part of an independent early economic development.

The paper of course isn’t a perfect fit at all for the theory, since many of these HG groups have been in contact with agriculturists for millennia. To pursue that idea, the Australian situation may be most relevant, since the only subsistence pattern was HG. The prevalence of numbers above 5 is very low.
David Eddyshaw says

August 16, 2024 at 8:12 pm

maybe all speakers of Oti-Volta languages are agriculturalists or herders

Indeed they are; and in fact a good many agricultural terms can be reconstructed to proto-Oti-Volta, including “millet” (at least two distinct varieties), “rice”, “granary”, “plant”, “sow” (i.e. by casting, as opposed to planting individual seeds) … and, not least “beer.”

“Cow” is readily reconstructable, too, as is “goat” and a specific verb “look after a herd/flock.” Currently, though, most Oti-Volta speakers are arable farmers rather than herders (the Fulɓe tend to fill that niche, as in many other parts of West Africa.)

“Iron” and “blacksmith” are reconstructable too.

I don’t know about trade: the word for “money” looks like a loanword from Western Oti-Volta when it turns up in the other branches, and “sell” shows some irregular correspondences which also suggest borrowing. “Buy” must go back a long way, but the commonest etymon for it is absent in all of Eastern Oti-Volta.

The numbers up to at least the hundreds must go back to POV: “thousand” looks reconstructable, but again there are some odd correspondences suggesting borrowing.

But looking farther afield, it’s certainly the case, as I said above, that lots of West African languages have clearly made up the words for “six, seven, eight, nine” out of “five plus one” etc. (Fulfulde is one, for example.) And I don’t think there is any prospect of reconstructing those numbers for proto-Volta-Congo.

Agriculture goes back a long way in those parts. I don’t really know anything about hunter-gatherer languages, apart from some forest Bantu groups; they’ve all just kept the Bantu etyma for the numbers anyway (and they also tend to have vigorous trade relationships with neighbouring farmers.)
Jerry Friedman says

August 16, 2024 at 8:13 pm

Have I missed any ancient language families?

Elamite? But this word list at Wiktionary gives numbers only up to five.
Y says

August 16, 2024 at 8:46 pm

Have I missed any ancient language families?

It would only be courteous to mention Hattic.

(Schrijver and Kroonen, I think, believe that some Germanic words are of Hattic origin, from back when the Proto-former was neighbor to the latter. Hattic numerals are not recorded, AFAIK.)
AntC says

August 16, 2024 at 10:09 pm

Have I missed any ancient language families?

Do you count Polynesian/reconstructs to Austronesian as ‘ancient’? They’re Hunter-Gatherers of the sea, at least. five, seven, eight. I see no additive structure.
David Eddyshaw says

August 16, 2024 at 10:16 pm

Interesting paper, Ryan: thanks!

I note the bit on the fairly common borrowing of numerals under “five”: I despair of the way that “Niger-Congo” maximalists assume that the evident similarity of (say) Mande or Dogon words for “three”, “four”, “five” to Volta-Congo constitutes convincing evidence of a genetic link. It really doesn’t.

(Even on their own wildly optimistic assumptions, you’re talking about a “Niger-Congo” protolanguage at a time-depth of something like 10,000 years ago or more. Everybody was a hunter-gatherer in West Africa back then. Same as Europe. Ah me! Those were the days …)
Ryan says

August 16, 2024 at 11:24 pm

How could I skip Hattic? And wasn’t there a language isolate newly discovered in the last year, based on the Hittite ecumenistic urge to write down everyone’s rituals in case theirs was the true god? (After looking it up, nope, that’s thought to be a new Anatolian language.)

Languages where we don’t know their number words aren’t useful for my exercise except as reminders of gaps in the data.

AntC, I think it’s David E. who mentioned additive structure. I’m focused on language families whose Hunter-Gatherer proto-speakers would have first come into contact with an economy of a different scale via the Middle East/Anatolian Neolithic expansions. Those where it could be plausible that six-seven similarities represent extremely ancient wanderworts rather than mere chance. I think it’s 6 families with the paradigm, 3 without, 1 score-draw and 2 games in progress — Elamite and Hattic.

Were the Polynesians really hunter-gatherers? I thought you believed that they domesticated kumara (or is your position just that it grew there and they learned to use the wild plant?) But more important, the organization of Polynesian societies I’m aware of doesn’t seem very HG-like. From the US Pacific NW to the Dnieper rapids, I think there’s an understanding in archaeology that fishing cultures even in the absence of (significant) agriculture can still be very different from hunter-gatherer cultures, partly through the effects of aggregating in larger numbers in stable sites.
Ryan says

August 17, 2024 at 1:57 am

I just finished the paper. It is interesting. Most of the methodological questions I had are pretty well answered in the paper, so I think it says what it says with appropriate awareness and humility.

It raises a thousand questions that I’ll never have answers for.

But I do feel that the plausibility of my sixes and sevens theory is supported by this paper. A great many hunter-gatherer languages have no numbers above 5, and the likelihood they do is related to their proximity to other subsistence strategies (the focus of the paper and its statistical approach) but also other economies (something they only mention a couple times in passing, rather than statistically). So it seems reasonable to propose that at some great time depth, the proto-protos of many of our Eurasian language families had no words for six and seven, until they began trading with the network expanding out of Mesopotamia/Anatolia and borrowed the words used in the network.

Plausibility. Certainly not confirmation.

Conversely, the paper makes clear that although many smaller scale agricultural groups do have high-limit numbering systems, many others have low-limit and restricted systems. This makes it seem less likely that mere contact with the Neolithic demographic wave would have had such a dramatic effect on Eurasian speech. So “sixes and sevens” may be a marker of the development of, or contact with, proto-urban trading systems.

While this idea doesn’t contradict anything about the known time-frames of Hurrian, IE, Etruscan, Basque and Kartvelian, it could have fascinating implications for Afro-Asiatic, since the continuity of terms across Semitic, Egyptian and Berber might imply that they were already participating in such an economy prior to their branching off. Otherwise, one might expect higher numeral terms to have dropped out at some point.

Though I guess it’s difficult to differentiate between family-scale cognacy and ancient intra-family borrowing at this time depth. Could we tell Berber and Semitic sharing the aboriginal Afro-Asiatic roots for 6 & 7 with Egyptian apart from B & S picking up the Egyptian terms around 4000 BCE through trade with early Nile urban populations, and then adapting the terms until the evolution of the phonemes is no longer transparent?

Cushitic, Chadic and Omotic do not seem to partake of the putative wanderworts. That’s a twist in the plot for me. But maybe that just adds to the likelihood that these were wanderworts even within Afro-Asiatic, long post-dating its divergence. Perhaps the original roots were in Egyptian.
Ryan says

August 17, 2024 at 2:25 am

Another possible implication, if we were to accept this hypothesis, is that Hurrians, proto-Kartvelian speakers and proto-IE people shared in that early economy in a way that the speakers of NE and NW Caucasian languages may not have.

It’s also interesting that, in the terms of the number systems paper above, Sumerian doesn’t have lexified terms for 6-9, but instead has transparently derived terms: 5+1*, 5+2, 5+3 and 5+4. At the first link, the terms are derived simply from lower numbers. The writer at the second link seems to be suggesting that the number words had started to lexify, but they are still completely transparent. The authors of the systems paper suggest that transparency is a sign of recency. But it’s also obviously a language-internal development.

And it’s also interesting that in the known languages and language groups, no one seemed to borrow or calque Sumerian higher digits. Could this tell us something about Sumerian participation in the early up-river trade routes?

* – Just guessing, but the difficulty of distinguishing i-as from as may have led to the change to dis for 1, and the dropping of the initial i=5 in the word for 6.
AntC says

August 17, 2024 at 5:36 am

long post-dating its divergence. Perhaps the original roots were in Egyptian.

Weee…ll Something like 2,500 BC in proto-proto-who-knows-what, somebody ordered a stone 4.9 x 1.0 x 0.5m weighing 6t (note the suspiciously round metric measurements) to be brought ~750km from NE Scotland to Wiltshire. I can only boggle at what sort of counting and measurement this entailed.

As for the megaliths at Göbekli Tepe …
Trond Engen says

August 17, 2024 at 7:30 am

No reason to snark. I don’t think anyone claims that wanderwort status is demonstrated beyond doubt, or that it ever will be, but it’s a coincidence on a scale that it would be irresponsible not to have in mind while working on the individual language families or the contacts between them. Identifying loanwords is pretty important.

And if linguists working on single language families need to keep it in mind, it would be helpful to have comparative or comprehensive work on the whole consonance cluster. That would come with a lot of maybes, if-this-then-thats, but maybe that’s what a scholar of Basque needs to be able to say that a preform of Berber-if-borrowed-from-Egyptian would actually help make sense of “seven” but not “six”.

Actually, megalithic engineering, together with the apparently tightly connected early astronomy, is a field where the need for standardized high numbers might arise, but I’d expect those to be more specialized terminology than colloquial language. But it would, of course, be a complicating factor in the hypothesis that the words instead (or also) might have spread as scientific terms long before seeping into the colloquials with e.g. broad access to trade.
Ryan says

August 17, 2024 at 9:22 am

Nearly round is a synonym for I eyeballed it, not I counted out units. And megaliths weren’t being built by hunter gatherers anyway. The idea isn’t that agriculturalists couldn’t develop higher numbers. It’s that hunter gatherer interactions wouldn’t often sustain the need for them, but that HG participation in urban trade networks could generate that need and provide the words to fill it.
drasvi says

September 14, 2024 at 8:44 pm

Well, the familiar (from the clock dial) system of 12 and 60 is natural, because these can be conveniently divided into 2, 3, 4 and (for 60) 5 parts.
6 could be convenient to someone who deals with halves and thirds of six or halves and tenths of 12 or 60.
7 is extremely inconvenient in the very same sense:)
__
If both were significant one would expect 42 or 420 to occur unusually often in the region.
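
A quick way to check the divisibility point above is to list the divisors directly. This is only an illustrative sketch: the function name `divisors` and the choice of numbers to print are mine, not anything from the thread's sources.

```python
def divisors(n: int) -> list[int]:
    """Return every positive integer that divides n without remainder."""
    return [d for d in range(1, n + 1) if n % d == 0]


# 12 and 60 split evenly many ways; 7 only into 1 and 7.
# 42 and 420 are the smallest multiples that combine 6 (or 60) with 7.
for n in (6, 7, 12, 60, 42, 420):
    print(n, divisors(n))
```

Whether convenience of division had anything to do with which numerals got borrowed is, of course, not something the arithmetic can settle.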
drasvi says

September 14, 2024 at 11:54 pm

“The ultimate goal is to create an atlas that maps where the words were used and when.”

I approve. Wish there were more such things.
drasvi says

September 15, 2024 at 4:36 pm

Trond’s is more accurate about numbers. He says that diverse systems could have been replaced by one.

He is not saying that
– any given system is more convenient for trade than others
– anyone “does not have numbers above N” (e.g. 5).

Only that when a system “one two two and one two and two hand hand and one hand and two…” clashes with a system “one two three… ten ten and one ten and two…” one of them may replace the other.
____
Though this per se does not mean that users of “hand and five” will be more willing to borrow a word for “seven” than users of “seven”.
drasvi says

September 15, 2024 at 7:15 pm

“Were there cultural meanings to 7 in Semitic cultures other than/before the Judaic creation story?”

Ryan, I’m definitely not a specialist but the number is just as definitely important in Sumerian (and not only) culture. As someone on Reddit suggests, count its mentions in The Literature of Ancient Sumer. Also https://en.wikipedia.org/wiki/Week#History
David Eddyshaw says

September 15, 2024 at 7:58 pm

The seven-day week seems to go back at least to Sargon in the third millennium:

https://en.wikipedia.org/wiki/Week

Whether you think this antedates the Biblical creation story presumably depends on your religious views …

“Three” (male) and “four” (female) are the only culturally supercharged numbers in the Western Oti-Volta speakers’ cultures. Other numbers are just numbers. (Even now, many Kusaasi don’t grok “week”; we used to get many patients asking us how many days we meant if we asked them to come back in N weeks. Weeks are a Muslim thing around there.)

But then, proto-Volta-Congo doesn’t seem even to have had any unanalysable word for “seven.” (Proto-Oti-Volta did, though.)
drasvi says

September 17, 2024 at 1:59 pm

Well, I suppose gestural counting means more analysable words (up to one, two, two and one, two and two, hand…). Because you don’t utter each too often.
drasvi says

September 19, 2024 at 8:05 am

“Elamite? But this word list at Wiktionary gives numbers only up to five.”

Blažek (Dravidian numerals) says that Elamisches Wörterbuch has 1 ki 2 mar 3 ziti 5? tuku 80? barba. Also: “It can also be mentioned that F. König ([1965: 42, fn. 15]) offered to interpret the Middle Elamite word nulkippi as “4 pairs”, i. e. “8”. If his solution were correct, the hypothetical root *nul– could be a cognate of Dravidian *nāl “4”. However, Hinz & Koch ([EW: 1016]) interpret this word quite differently, namely, as a plural form of the noun ‘fertility-maker’.”

“8” and “fertility-maker” is not an unexpected pair of “translations” for a word from an ancient text:( Especially “fertility-maker”, 8 is profane.

Jan Tavernier in his sketch of Elamite (here, open access) for The Elamite World mentions only ki “1”. He then discusses another meaning it has in Achaemenid Elamite, “each”, and the vowel change, and then switches to suffixes for numerals and fractions. Perhaps he finds the others problematic.
Ryan says

September 19, 2024 at 11:30 am

This is why I left Elamite as a draw. I didn’t assume lack of attestation meant that there were no words for six or seven. We just don’t know what they are or whether they match the potential wanderworts.
Dmitry Pruss says

October 21, 2024 at 12:46 pm

The 2023 thesis of Chams Benoît Bernard, a student of Peyrot, is now available online:
https://www.universiteitleiden.nl/en/research/research-output/humanities/like-dust-on-the-silk-road-an-investigation-of-the-earliest-iranian-loanwords-and-of-possible-bmac-borrowings-in-tocharian

Bernard adds to the list of known Iranian loanwords into proto-Tocharian, and, from their consistent phonology, concludes that all of them have come from the same old Iranian language, which wasn’t closely related to Khotanese (the closest Iranian-family neighbor of the Tocharians). Instead, the mysterious “Old Steppe Iranian” language, probably spoken north of the Tian Shan, shared a lot of phonological innovations with Ossetian (but it had at least one additional innovation not found in Ossetian, and therefore is unlikely to have been the direct ancestor of Ossetian).

Bernard also uncovers a dozen non-Iranian borrowings into proto-Tocharian. These words largely refer to natural phenomena. The hypothesis is that they came from an unknown language of the BMAC.
languagehat says

October 21, 2024 at 12:54 pm

Exciting stuff!
David Marjanović says

October 23, 2024 at 7:04 pm

– Metal fragments — essentially all metal fragments in all Europe — were cut up in pieces on a standard base of 10g.

Proof that the Deka is natural and the Germans are sadly missing out.

The 10g unit is an adaptation of the Mesopotamian shekel.

Oh, that’s even more interesting.

100 can be safely reconstructed for PIE, and perhaps even 1000 – at least, *g’heslo- is a Greek – Indo-Iranian – Italic isogloss.

What’s the Italic reflex?

Hittite for seven is sipta. Assuming that’s the expected form for a word from the root that gave us Sanskrit and Greek hepta

It is. The evidence is a bit thin – namely, šiptammiya- is thought to mean “7th”, IIRC it’s only attested once, and the cardinal number has not been found spelled out –, but, AFAIK, it would fit perfectly. Skt saptá, Greek heptá and the application of Verner’s law in Germanic *sebun make clear that their last common ancestor was, bizarrely, *septḿ, with unstressed *e and a stressed syllabic consonant. (The stress is probably copied from “8”, which was some sort of *HokʲtóH.) In Hittite, *e preceding the stress comes out as i, while *é is preserved as such and *e following the stress comes out as a, so even the unusual stress fits.

(š is used to transcribe cuneiform ultimately by reading Akkadian with a Hebrew accent.)

Personally, I think all this is just the awesome power of sheer coincidence*, but if you were determined to see an origin in Semitic it wouldn’t be surprising if it went back to the form used with masculine nouns, which has the feminine ending -t. Because Semitic.

It has indeed been noted in the literature that its indeterminate-or-something form *sabʕatum (…right…?) is even more similar… assuming the very voiced *-bʕ- ended up as a PIE *p by contact with the *t after IE invented the zero grade (but before the stress shifted from the first to the last syllable by the abovementioned analogy to “8”).

And the word for “money” seems to be really a loan from Western Oti-Volta in all the other branches, on morphological grounds, even though you can reconstruct a POV form for it without breaking any Neogrammarian rules. If counting does go with money historically, that could be a hint.

Oh, that reminds me of silver being a Wanderwort that stretches from the Atlas to the Don and may have a Semitic etymology. Check out section 3.39 in this open-access book chapter. It very much does not have the same geographic distribution as seven, though.

And _is_ it related to four?

No. The idea was that it’s the dual of an otherwise lost word for “4” that was completely replaced, but more likely the “breadth of 4 fingers” word is derived from “pointed” instead of from any word for “4”. The Anatolian word for “4” is completely different.

(Schrijver and Kroonen, I think, believe that some Germanic words are of Hattic origin, from back when the Proto-former was neighbor to the latter. Hattic numerals are not recorded, AFAIK.)

Not of Hattic origin; but they think that the ante-IE languages of generic west-central Europe filled the large gap between Basque on one side and Hattic + Minoan on the other.

the Hittite ecumenistic urge to write down everyone’s rituals in case theirs was the true god

Nonono – like the Romans, the Hittites assumed all gods were true. Unlike the Romans, however, they didn’t think all gods were fluent in all languages, so they wrote down all rituals they could get a hold of in the original languages – followed by a translation when we’re lucky.

Also unlike the Romans, they didn’t equate similar gods worshipped in different cities. Not even the sun goddesses.

Were the Polynesians really hunter-gatherers?

No; and many of the Polynesian agriculture-related words go straight back to Proto-Austronesian. Agriculture was largely abandoned in New Zealand, though, because it’s too cold for tropical agriculture there.
Ryan says

October 23, 2024 at 9:58 pm

Didn’t we have a thread on speculation (by Schrijver?) of a link between shekel and schilling? Does this theory of the shekel as being at the root of a widespread measuring system make that more plausible?
Y says

October 24, 2024 at 1:32 am

Here, and that was a Vennemann idea. Schrijver reaches sometimes, but doesn’t fall off the end like Vennemann does.

Lehmann’s Gothic Etymological Dictionary says (edited for style),

S90. *skilliggs ma = ‘solidus’ (the golden solidus of the Eastern Empire)

[…]

Etymology disputed. Possibly based on PIE (s)kel- ‘split, cut’, like Gk κέρμα ‘little coin’ κείρω ‘to cut’ Persson 1893–94 ZVS 33:286–87; cf also OE hielf-ling based on healf ‘half’. skilliggs is then a coin that was “cut”.

Assumed by E Schröder 1918 ZVS 48:254ff to be from *skild-liggs ← *skildu-liggs, ie a ‘small shield’; cf Port escudo, Fr écu ‘dollar’ ← Lat scutum ‘shield’, MLat scudatus aureus. Marstrander 1924:25 on the other hand proposed *skildulingaz as a designation for Roman coins because they resembled the medallions (clipeoli) on gravestones. Older assumption of jingling objects, as to OI skjalla, OE sciellan, OHG scellen stv ‘resound, jingle’ Diefenbach 1846–51 II:249, no longer favored. Marstrander also rejected Brøndal 1917:147ff, who assumed a borrowing from Lat *silicula, OHG silihha f ‘little coin’, VLat *skella. Borrowed into Sl as OSl ščьljagʒ, Stender-Petersen 1927:380ff, Vasmer 1953–58 III:453.
David Marjanović says

October 24, 2024 at 8:24 am

Previous discussion of shilling here, addendum here.
Hans says

October 26, 2024 at 9:15 am

What’s the Italic reflex?

Latin mille, which is usually reconstructed as *smiH2-ghsli- or so, i.e., a compound with “one” as the first element (I’m not near my library or computer now, but if you have access to de Vaan, you can check the exact reconstruction there).
David Marjanović says

October 26, 2024 at 11:07 am

Ah yes, I’ve come across that before.
Dmitry Pruss says

May 28, 2025 at 10:18 am

A new book by Chams Benoît Bernard, “Like Dust on the Silk Road: On the Earliest Iranian and BMAC Loanwords in Tocharian”, is open access
https://brill.com/display/title/72419
languagehat says

May 28, 2025 at 10:23 am

Interesting stuff, thanks!

While there is no clear spelling of the stress in Tocharian A and in Archaic Tocharian B (Peyrot 2008), stress is indicated in Classical and Late Tocharian B in the following way. When stressed, the Tocharian B phoneme /a/ is written as ‹ā› (there is no phonological length in Tocharian B), but as ‹a› when unstressed. The Tocharian B phoneme /ə/ is written as ‹ä› when unstressed, and as ‹a› when stressed. There is thus sometimes a spelling ambiguity between unstressed /a/ and stressed schwa /ə/, which are both spelled ‹a›. This ambiguity is usually solved by either etymology or by variants of the same word. Indeed, either an archaic spelling or a suffixed or inflected form, such as the plural ending, can confirm that the vowel was originally a schwa /ə/ or /a/. For example Tocharian B yasar ‘blood’ is phonologically /yə́sar/ rather than /yasə́r/, as can be deduced from its plural ysāra /ysára/. Besides, Tocharian B words could never be accented on the final syllable (Krause 1971: 11), so that there is no doubt about the stressed syllable in disyllabic Tocharian B words.

I’m glad there’s so much in open access these days.
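
The spelling rule in the passage quoted above is mechanical enough to sketch in a few lines. This is only an illustration of the rule as quoted, not anything from Bernard's book: the function name `spell`, the lookup table, and the convention of marking stress with a combining acute accent on the input are all my own assumptions.

```python
import unicodedata

ACUTE = "\u0301"  # combining acute accent; assumed here as the stress mark

# (phoneme, stressed?) -> Classical/Late Tocharian B spelling, per the quoted rule
SPELLING = {
    ("a", True): "ā",
    ("a", False): "a",
    ("ə", True): "a",
    ("ə", False): "ä",
}


def spell(phonemic: str) -> str:
    """Spell a phonemic form, mapping only /a/ and /ə/ and copying the rest."""
    chars = unicodedata.normalize("NFD", phonemic)  # split vowel + accent
    out = []
    i = 0
    while i < len(chars):
        ch = chars[i]
        if ch in ("a", "ə"):
            stressed = i + 1 < len(chars) and chars[i + 1] == ACUTE
            out.append(SPELLING[(ch, stressed)])
            i += 2 if stressed else 1  # skip the accent we just consumed
        else:
            out.append(ch)
            i += 1
    return unicodedata.normalize("NFC", "".join(out))


print(spell("yə́sar"))  # the analysis the quote prefers for 'blood'
print(spell("yasə́r"))  # the rejected analysis, which is spelled the same way
print(spell("ysára"))  # the plural that resolves the ambiguity
```

The first two calls give the same written form, which is exactly the ambiguity the quote describes; the plural comes out with ‹ā›, so only the first analysis is compatible with it.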
languagehat says

May 28, 2025 at 10:27 am

There’s a nice map (figure 2) on p. 243.
David Marjanović says

May 28, 2025 at 12:37 pm

I learned about Tocharian stress a few years ago, probably from reading Ronald Kim’s works on his Academia page. Open access FTW indeed.
Dmitry Pruss says

March 11, 2026 at 8:34 pm

A new genomics paper shows that Afanasievo ancestry persisted in the Eastern Tian Shan and Northern Tarim Basin in the Iron Age (300-700 BCE), in a form mixed approximately 50:50 with Upper Yellow River Late Neolithic millet farmers. Is it the right time frame for Tocharian, bridging the gap from the much earlier Afanasievo?

The authors date Afanasievo – Upper Yellow River farmer admixture to ~1500 BCE.

In the same Iron Age time frame, Western Tian Shan was dominated by Andronovo descendants with a noticeable BMAC-like admixture (Gonur)
https://academic.oup.com/mbe/advance-article/doi/10.1093/molbev/msag057/8513100?login=false

Of course, a similar claim of being half-Afanasievo was already made back in 2019 about the Iron Age Shirenzigou, although they lived a little too far east of the Tarim Basin…
https://www.cell.com/current-biology/fulltext/S0960-9822(19)30771-7
Trond Engen says

March 12, 2026 at 6:18 pm

Thanks!

To get some continuity in the Tocharian discussions, I think we can go back to my upthread (August 2024) attempt on a summary. Quoting from there:

Maybe pressed eastwards by the Andronovo, many of the Chemurchek settled in the Hexi(/Gansu) Corridor and formed the Xichengyi culture (c. 2000-1600 BCE), probably with a strong local substrate of old trading partners from the Majiayao. The Xichengyi interacted closely with Upper/Middle Yellow River cultures like the Qijia culture (c. 2200-1600) of the eastern Gansu and the Erlitou culture (identified with the semi-mythological first Chinese dynasties),

[…]

The settled/semi-nomadic Xichengyi became the Siba culture (c. 1600-1300 BCE) who gave room to two independent but related cultures in the Hexi corridor, Shanma (c. 900-200 BCE) in the west, and Shajing (c. 800-100 BCE) in the east (I’m not sure what to think about the hiatus in the archaeological dates, but it ends about the time of the Mu Tianzi Zhuan). Both cultures were heavily influenced by the Saka, who now dominated the eastern steppe. The location and the end dates fit well with the Xiongnu and Han conquests of the Hexi Corridor and the expulsion of the Yuezhi and the Wusun. The Yuezhi were the first to leave, so they may have been the Shanma. I think both were Tocharian, but I have no clear idea which was A and which was B.

The new paper’s date of ~1500 BCE is in the middle of the Siba phase, exactly when we’d expect admixture between newly settled Afanasievans and the Yellow River locals.

AD 300-700 in the Eastern Tian Shan/Northern Tarim would be consistent with being descended from the expelled Yuezhi and Wusun.

I’ll come back when I’ve had time to read the papers.
Trond Engen says

March 12, 2026 at 6:19 pm

Editing my double post to grudgingly admit that all those names of Upper Yellow River cultures and how they relate to each other are completely gone from my head now. But I did remember the Hexi Corridor melting pot.
Dmitry Pruss says

March 12, 2026 at 11:36 pm

Yuezhi, or at least one of their constituent tribes, became the Kushans.

https://onlinelibrary.wiley.com/doi/10.1111/1467-968X.12269

An Eastern Iranian language has been retained there. If we conclude that Iron Age Eastern Tian Shan peoples in this structure are also Iranic then it wouldn’t be very useful for the hypothetical Afanasievo – Tocharian link.

Linguistically, would it make sense to attribute irrigation agriculture vocabulary in Tocharian to the Yuezhi-like mixed Iranian stock populations?
David Marjanović says

March 13, 2026 at 12:09 pm

I don’t know, but there are quite old Iranian loanwords in Tocharian – older than [dz] > [z] even.
Trond Engen says

March 13, 2026 at 1:31 pm

Somebody has surely catalogued those loanwords according to phonological age and semantic field. Tocharian must (if it came out of Afanasievo) have been neighbours with various forms of Iranian for just about as long as both branches existed.
David Marjanović says

March 13, 2026 at 2:02 pm

There’s this paper from 1999… which reminds me (on p. 133) that it can’t be excluded that the words with ts for Iranian /z/ are actually from Nuristani…
Trond Engen says

March 14, 2026 at 7:06 am

Thanks, David. That paper (and the existence of the Ringe paper it builds upon) was buried deep down in the back of my head. Trying as it does to date linguistic changes by archaeological evidence for contacts with Iranian and Chinese, it should be reassessed with what we now think we know about the Tocharian trek. The 2nd millennium BCE cultures of the Hexi Corridor funneled Bronze Age innovations to China and Chinese innovations to the steppe.
Trond Engen says

March 15, 2026 at 1:01 pm

@Dmitry: I’ve tried to understand the implications of the new papers for the hypothesis above. Until further evidence, from the Hexi Corridor, from the Ili Valley, and from what would become the Tocharian cities, I think the best fit is still that the Wusun and the Yuezhi were the linguistic ancestors of the two branches of Tocharian.

For a millennium or more, they were neighbours with the Iranians of the steppe, exchanging technology and borrowing terminology. Since the Middle Iranian period conventionally starts in the mid-1st millennium BCE, borrowing from Old Iranian would most likely have happened in the period when Tocharian was spoken in the Hexi Corridor. I’ll assign Proto-Tocharian to the Xichengyi and Siba phases, i.e. 2nd millennium BCE. If the chain borrowings are true chain borrowings, they must have happened at a later stage. (The usual direction of chain borrowing seems to be Old Iranian -> Toch B -> Toch A, which might suggest that Toch B was the westernmost, i.e. the Yuezhi, but the evidence is slim.) What looks like chain borrowings could also be replacements by cognate, in which case it says little about the time of the first borrowing. BMAC borrowings could have come at that time or later, but can according to Bernard all be assigned to Proto-Tocharian. One might wonder if they aren’t from the Upper Yellow River substrate instead.

So what do the new genetic studies mean? I’ve still not had time to read them thoroughly, but the Yellow River admixture is strong evidence for the Hexi Corridor phase. We knew that after their respective expulsions, the Wusun and the Yuezhi settled and eventually found their places as clients and associates of their former trading partners and occasional enemies, the Iranian tribes of the Tian Shan and the Ili Valley. Now we know more about where, and about their interactions with surrounding peoples. If and when there’s enough genetic evidence from the Hexi Corridor itself, we may even start to untangle the Wusun from the Yuezhi.

(And for the sake of crackpottery: It struck me that Dzungar “left hand (land)” might be a Mongolian folk-etymological rendering of the ethnonym underlying Tochar.)
David Marjanović says

March 15, 2026 at 3:08 pm

Right, not left: it’s in the west, and (as is only sensible) you turn your back on Siberia and look toward China, so the west is to your right.
Trond Engen says

March 15, 2026 at 6:13 pm

I thought I’d misremembered, but at least WP on Dzungar people agrees with the “left hand” etymology. Those with better access to Mongolian sources may confirm.
David Marjanović says

March 15, 2026 at 11:24 pm

…Oh. The point of reference is Oirat, not Khalkh. Left hand it is, then. Carry on…
David Eddyshaw says

March 15, 2026 at 11:59 pm

the west is to your right

Nah, west is in front of you. (Kusaal tuon “in front; West.”)

But I seem to recall that “face South” is a Classical Chinese idiom for “assume imperial power.” Sadly, Google is now altogether unusable, so I am unable to substantiate (or refute) this.
Xerîb says

March 16, 2026 at 4:56 am

“face South” is a Classical Chinese idiom for “assume imperial power.”

This expression (南面 : 南 ‘south’, 面 ‘face’) is already in the Analects:

子曰：雍也，可使南面。

Confucius said, ‘Yong can be given a position facing south’. (6.1)

子曰：無為而治者，其舜也與？夫何為哉，恭己正南面而已矣。

Confucius said, ‘Is Shun not an example of someone who ruled by means of effortless action? What did he do? He did nothing but make himself reverent and take his proper position facing south.’ (15.5)

Handy comment on 6.1 from Edward Slingerland (2003) Analects: With Selections from Traditional Commentaries, p. 52:

There is some commentarial debate concerning how this praise of the disciple Zhonggong [i.e. Yong] is to be understood. “Facing south” is the proper ritual orientation of a ruler (15.5), and Han Dynasty commentators understand the point to be that Zhonggong is worthy of becoming a feudal lord or even the Son of Heaven. Qing Dynasty scholars see this as excessive praise for a disciple who—although praised along with others in 11.3 for “Good conduct”—does not feature prominently in the text. They note that even ministers who had some responsibility over people took the south-facing position when serving in an official capacity, and therefore believe this passage to be an expression of Confucius’ approval of Zhonggong taking an official position under the Ji Family (13.2).
languagehat says

March 16, 2026 at 8:02 am

“Facing south” is the proper ritual orientation of a ruler

Are there plausible ideas about why that was? I mean, I can make up ideas myself, but I can do that for any given orientation — we humans are good at rationalization.
Jerry Friedman says

March 16, 2026 at 8:37 am

Google finds the Wikipedia article under a title I wouldn’t have been able to figure out: Nanmian.

‘In the I Ching it is written “the nanmian hears the world, toward the light rules” (南面而听天下，向明而治). The meaning is interpreted that since south is the direction of the Sun (in China and the northern hemisphere), the ruler faces towards the light as he rules, with the light representing wisdom or virtue.[4]’

The reference, in Chinese, is https://www.guoxuemeng.com/guoxue/14955.html

Nothing on which direction that nice Mr. Xi faces in official meetings.

@DE: I’m not having any problems with Google as long as I skip the APE answer. Maybe it’s having a fit of pique with you for speaking disrespectfully of it. All praise to Alphabet!
Nelson Goering says

March 16, 2026 at 11:03 am

I found this thread through the “recent comments” side bar, and spent a bit too much time skimming through it. Apologies if I missed it, but I didn’t notice anyone mention the very interesting paper from Peyrot on the Iranic word for “iron”, which was also loaned into Tocharian. Interestingly, he argues that this was not a loan from the entirely asterisked Steppe Iranic language, but actually from Khotanese directly: https://onlinelibrary.wiley.com/doi/10.1111/1467-968X.12269 (I think you can also find this on Academia.)

It’s a good paper, except that his argument that Khotanese split off before the creation of Iranic *ts from PIE *ḱ is not well founded. See in particular Jakob Halfmann’s book on Nuristani, pp. 33-34: https://medialibrary.reichert-verlag.de/en/file/9783752003543_ebook.pdf (I’ve only read the first couple chapters of this book, but it’s really excellent so far.)

More generally, re Khotanese, folks here might be interested in the near-barrage (by the standards of these things) of publications that Nicholas Sims-Williams was involved in during the past couple of years:

A Handbook of Khotanese https://reichert-verlag.de/en/keywords/handbuch_keyword/9783895004445_a_handbook_of_khotanese-detail
An Old Khotanese Reader https://reichert-verlag.de/en/author/s/sims_williams_nicholas/9783752009170_an_old_khotanese_reader-detail
The Book of Zambasta https://reichert-verlag.de/en/author/s/sims_williams_nicholas/9783752006889_the_book_of_zambasta-detail

I hope this helps get more people involved with the language, which is linguistically very interesting and not unimportant to Central Asian history.
Nelson Goering says

March 16, 2026 at 11:19 am

I think the post I just made had too many links, and got sent off for moderation…
Y says

March 16, 2026 at 11:46 am

I think I got a comment stuck, too.
Y says

March 16, 2026 at 1:49 pm

Posting again: Here is an argument that the emperor is akin to the North Star, about which the universe turns, and hence his throne faces south.
Seong of Baekje says

March 16, 2026 at 1:56 pm

‘In the I Ching it is written “the nanmian hears the world, toward the light rules” (南面而听天下，向明而治).

Better translated as “Facing south to listen to [the affairs of] the world, ruling toward the light.”
languagehat says

March 16, 2026 at 2:26 pm

I think the post I just made had too many links, and got sent off for moderation…

I rescued it, and I thank you for those very interesting links!
J.W. Brewer says

March 16, 2026 at 2:35 pm

If the Emperor is located (metaphorically) at the North Pole, or perhaps by his very presence brings with himself the north end of the Axis Mundi, then I guess by definition he is always (metaphorically) facing south regardless of how he may twist or turn? He would have no other options!
Trond Engen says

March 16, 2026 at 3:05 pm

@Nelson: Yes, thanks!

On Peyrot, your link is to Bonmann et al on the partial decipherment of the unknown script. Interesting as it is, it’s not what you intended!

On Halfmann, I downloaded this paper (monograph?) a few weeks ago, probably somewhere in the backlog of recent LanguageHat posts, but I don’t think I read past Ch. 1 Preliminaries on Language Diversification, which, as you say, is excellent. Will procede now.

On Sims-Williams, that’s a lot of good stuff, but unfortunately only the Tale of Bhadra is available to the common cheapskate.

Edit: I forgot this quote from Halfmann 2025, which should be added for its onomastic determinism:

sharply distinct languages and the eventual family-tree effect can arise out of an earlier dialect network when expanding dialects replace their neighbors. If enough intermediate dialects are pruned, the remaining dialects will be sharply distinct
(Babel et al. 2013: 447)
David Marjanović says

March 16, 2026 at 3:11 pm

I’m not having any problems with Google as long as I skip the APE answer.

…which you don’t even get if you add -ai to your search. That also saves time and some scary amount of energy.

If the Emperor is located (metaphorically) at the North Pole, or perhaps by his very presence brings with himself the north end of the Axis Mundi, then I guess by definition he is always (metaphorically) facing south regardless of how he may twist or turn? He would have no other options!

Whenever the capital is the Northern Capital, there just isn’t anything interesting north of there, while the rest of the country is south of it. But Xī’ān is less far north, and all other capitals are rather central…
Brett says

March 16, 2026 at 3:43 pm

Previous mention of the Chinese mythology of the north star and polar axis
David Marjanović says

March 16, 2026 at 3:55 pm

the very interesting paper from Peyrot on the Iranic word for “iron”, which was also loaned into Tocharian. Interestingly, he argues that this was not a loan from the entirely asterisked Steppe Iranic language, but actually from Khotanese directly: https://onlinelibrary.wiley.com/doi/10.1111/1467-968X.12269 (I think you can also find this on Academia.)

The link leads to “A Partial Decipherment of the Unknown Kushan Script”, which is not about iron and not on Academia, but in open access.

Abstract, while I’m at it:

Several dozen inscriptions in an unknown writing system have been discovered in an area stretching geographically from Kazakhstan, Uzbekistan and Tajikistan to southern Afghanistan. Most inscriptions can be dated to the period from the 2nd century BCE to the 3rd century CE, yet all attempts at decipherment have so far been unsuccessful. The recent discovery of previously unknown inscriptions near the Almosi gorge, Tajikistan, however, allows for a renewed attempt at decipherment. Drawing upon a catalogue of characters and a distributional analysis, we report two identical sequences in the newly found Almosi inscriptions and in the Dašt-i Nāwur trilingual. Based on parallel texts in Bactrian, we suggest to read the name of the Kushan emperor Vema Takhtu in these sequences, accompanied by the title ‘king of kings’ and several epithets. This allows for the deduction of probable phonetic values of 15 different consonantal signs and four vocalic diacritics and the inference that the inscriptions record a previously unknown Middle Iranian language.

Previously discussed on LLog and I think here too.

On Peyrot’s Academia page there’s a paper about another loan from Khotanese to Tocharian B: yolo “bad”.
Nelson Goering says

March 17, 2026 at 1:04 am

Oops, I must have grabbed from the wrong tab. Here’s the right link: https://www.cambridge.org/core/journals/bulletin-of-the-school-of-oriental-and-african-studies/article/spread-of-iron-in-central-asia-on-the-etymology-of-the-word-for-iron-in-iranian-and-tocharian/6C6FEFB6DA1EA411B53E5531735A08C1
Nelson Goering says

March 17, 2026 at 9:31 am

“Previously discussed on LLog and I think here too.”

It’s linked to earlier on in this very thread! (It’s why I had that article open.)

Re yolo, I wonder if the Tocharians just preferred a more cautious approach over a carpe diem philosophy.

On a different note yet, I’ve switched to using Kagi as my search engine. It’s subscription based, but seems to work better than Google does these days (with or without -ai). It’s more like the Google of maybe 10-15 years ago.
David Marjanović says

March 17, 2026 at 3:38 pm

Here’s the right link:

And compelling it is – including the evidence for Khotanese-Tumshuqese-Wakhi as the sister group to the rest of Iranian!
Nelson Goering says

March 17, 2026 at 3:49 pm

That part I don’t find really compelling, since it depends on taking the palatal nature of śś as old rather than due to the labial. The reference I gave to Halfmann’s book directly addresses this point, and is better informed typologically and phonetically. I think the question is still just open.

Incidentally, I remember Melchert explaining the palato-velar in PIE *kjwon- “dog” by a similar appeal to the influence of *w (on phone, so apologies for the bad notation).
David Eddyshaw says

March 17, 2026 at 6:15 pm

Kagi is indeed worth supporting.
David Marjanović says

March 17, 2026 at 6:39 pm

Oh. (I’ve downloaded the book but not read it yet, so I… forgot that whole passage of your comment.)

[sw] > [ʃ] has precedents, of course*, and I figured such a [ʃ], which would be rare, could just have merged into ś [ɕ]; but, as the article points out, the preservation of the w argues against that…

…or maybe it doesn’t. I once heard a member of the Spanish parliament say es un sueño on the radio – with a really impressive [ʂʷ]. I guess you’re right.

* Fascinatingly, Luwian is one.
