How Basque Has Survived.

I don’t listen to podcasts, and I don’t link to a lot of audio stuff here, because I’m basically a [written-]word guy. But I do listen to the radio, and PRI’s show The World in Words is so exactly up my alley I’ve posted about it more than once (e.g., here). A recent episode (apparently first aired in May, though I heard it yesterday) is described thus:

This week on the podcast we talk about Basque. With more than six dialects, how did Basque develop a language standard? How did this language survive the military dictatorship of Francisco Franco when speaking, writing and reading it were illegal? How has this minority language thrived and even grown in the years since Franco’s dictatorship ended? And what does the future hold?

The main focus, thankfully, is not on the “what does the future hold” stuff but on the fascinating story of how it developed a language standard and the dramatic tale of the guy who translated Shakespeare into Basque and saved his work when a ship was attacked by Germans during WWII. It’s a little over a half-hour long, and at the bottom of the linked page is the “Podcast Contents,” which will tell you what bits occur when. (I haven’t listened to “How soccer became multilingual” yet, but I’ll bet it’s a lot of fun as well.)

Comments

  1. This is something of a tangent, but I wonder what discussion there has been on the role that AI will have on the future of minority languages (and on the role of English as a lingua franca). When machine translation becomes good enough, many of the practical problems with speaking a minority language will disappear. It will become possible to translate any smartphone app or any biochemistry textbook into Basque very quickly. It would even become possible to travel the world without knowing any English or foreign languages since one will probably be able to communicate through a smartphone translation app. Also, much of the need for anyone to learn a foreign language, even English, will diminish unless he or she wants to.

    I certainly don’t think this is an exaggeration of the potential of artificial intelligence in the near future.

  2. I’m confused, do podcasts not consist of words?

  3. I’m basically a word guy

    Podcasts are of course made of words, so you are basically a text guy. (So am I.) I think podcasts basically have two audiences: pre-Clarkeans who can’t read while they travel (Arthur C. Clarke said that it would be impossible for him to own a car in its current state of evolution: the car would own him), and subvocalizers who can’t read any faster than they could read aloud.

    I certainly don’t think this is an exaggeration of the potential of artificial intelligence in the near future.

    I do. General artificial intelligence, like sustained nuclear fusion, has been ten years away since 1950.

  4. I think it’s clear he means the written word. Which then begs the question (yes, I know, I know…) why does radio work for him? I think we can not read into it so much and move on.

  5. I’m confused, do podcasts not consist of words?

    Sorry for the inexactitude; I’ve emended the sentence to clarify.

    I certainly don’t think this is an exaggeration of the potential of artificial intelligence in the near future.

    Good lord, of course it is. There’s not going to be that kind of AI in my lifetime, and maybe never.

  6. It will become possible to translate any smartphone app or any biochemistry textbook into Basque very quickly.

    I don’t know Basque, so I can’t judge quality of machine translation from English into Basque, but it is easy to check how well Google Translate copes with translation from Basque into English.

    Here is the result of Google-translating Basque Wikipedia article on biochemistry:

    “Biochemistry is a science that investigates the chemical suppositories of living organisms. These chemical supplements are called biomolecules: glucose, lipids, proteins, nucleic acids, vitamins, etc. In all these biomolecules there is carbon, which is the basic element of organic chemistry. But besides carbon, the molecules of living beings are mainly composed of hydrogen, oxygen, nitrogen, phosphorus and sulfur.
    Biokimics also investigates all the chemical reactions that are taking place in the biomolecular structure. The set of these chemical reactions is called metabolism.
    Biochemistry explores the chemical bases of life. All living organisms exchange materials and energy with the environment, and they focus on this exchange through some chemical reactions of the metabolism. Biochemistry investigates the internal reactions that occur within the cell and the internal cellulose organisms.”

    Surprisingly well, it turns out. Back in college, we had translated textbooks of much worse quality.

  7. “There’s not going to be that kind of AI in my lifetime, and maybe never.”

    Do you really think so? A lot of people would disagree with you. Of course it is difficult to separate the hype from what is real. The technology companies have an interest in making claims even when they aren’t fulfilled (they can always say they’re on the verge of success).

    I’m a radiologist, every radiology congress nowadays has numerous lectures about how soon and in what way computers are going to surpass radiologists. It might be exaggerated, but they take it very seriously. Although it’s very different, I doubt that what radiologists do is easier from an AI perspective than translation. After all, Google translate is already pretty good (even though of course it’s easy to find examples of ridiculous mistakes it makes).

  8. Lars (not the original one) says:

    Hat, in the olden days there were courses like “Russian for scientists”. The fact is, there is a big difference between literary translation and working with restricted domains such as scientific texts or weather forecasts (to pick a very small one).
    Where I live, the names of bus stops are now spoken by speech synthesizers. With few exceptions, it works well.

  9. A lot of people would disagree with you.

    A lot of people believe all sorts of nonsense. If you really doubt that what radiologists do is easier from an AI perspective than translation, you should try your hand at translation. And not of biochemistry articles or radiology textbooks, which are after all written in languages only because it can’t all be expressed in symbols, but of literature or conversations. Language is immensely complicated, and for AI to handle it properly would require actual human intelligence, which (as I say) is not going to happen in the foreseeable future.

  10. marie-lucie says:

    Biochemistry is a science that investigates the chemical suppositories of living organisms.

    Most of the paragraph quoted looks fine to this non-chemist, but is suppositories correct in this context?

  11. I wonder how much of the semantics, and maybe even some of the syntax, of that Basque text is Spanish under the covers. I suspect quite a bit.

    See also Lameen’s post on why having “no word for X” can matter.

  12. Lars (the original one) says:

    On the other hand, neural networks (which are only intelligent in the sense that a slime mold is) have been very successful in domains where most people fifty — or ten — years ago assumed that general artificial intelligence would be needed.

    So if all you need to travel the world as a monoglot Dane is a glorified slime mold in your smartphone, the game has indeed changed.

    (I still don’t like the new Google Translate, but that’s mainly because the user interface has been designed by a human to suppress all alternatives).

  13. Not much.

    Here is first sentence in the original. Try to find “Spanish under the covers”:

    Biokimika izaki bizidunen osagarri kimikoak ikertzen dituen zientzia da. Osagarri kimiko hauek biomolekula izenekoak dira: gluzidoak, lipidoak, proteinak , azido nukleikoak, bitaminak, etab. Biomolekula horietan guztietan karbonoa dago, hau baita kimika organikoaren oinarrizko elementua. Baina karbonoz gain, izaki bizidunen molekulak hidrogenoz, oxigenoz, nitrogenoz, fosforoz eta sufrez daude batez ere osatuta.

  14. Considering that I know neither Spanish nor Basque, here’s what I see there (not looking at the English):

    Biochemistry izaki bizidunen osagarri chemistry ikertzen dituen science da. Osagarri chemistry hauek biomolecular izenekoak dira: sugars, lipids, proteins, nucleic acids, vitamins, etab. Biomolecular horietan guztietan carbon dago, hau baita organic chemistry oinarrizko elements. Baina carbon gain, izaki bizidunen hydrogen, oxygen, nitrogen, phosphorus molecules eta sufrez daude batez ere osatuta.

  15. marie-lucie says:

    JC: If you translated the paragraph into most European languages, you would find very similar technical words, so it is not just Spanish under Basque, it is Greco-Latinish under Spanish, English, French, Italian, etc.

  16. I agree that AI is often wildly overhyped. Try asking any of the neural net fans exactly what their black boxes are doing– they have no idea.

    Back when I was an undergraduate, I attended a lecture by a famous AI specialist. He strode to the blackboard, drew a big ‘N’ on the board, and then drew a big box around the ‘N’. He said: ‘We start with– Nature’.

  17. I wasn’t talking about literary translation here — I don’t mean that the next English translation of The Brothers Karamazov will be done by machine.

    Already today it’s possible to successfully chat on the internet with someone via translation software rather than use a common language that neither knows well, as long as both parties are aware that idiomatic expressions might be misunderstood. It depends to some degree on what languages are in question.

    I have done some informal translation work in my life, not literature, but maybe I can say I “tried my hand at it”. Or at any rate I can say that the concept of translation is familiar to me. I’m bilingual, the task of translating conversations does not exactly faze me.

    Do you know anything about radiology? Have you tried your hand at it? Do you expect it’s easy? Maybe one might think that radiology is likely to be an easier task for AI because it isn’t as natural to the human brain as a linguistic task like translation. I doubt either of us has the expertise required to evaluate the validity of that statement. But a free online tool like Google Translate is better at translation than any existing AI that I am aware of is at radiology.

    Translation does not require “actual human intelligence,” anyway.

    Look at
    https://blogs.microsoft.com/ai/machine-translation-news-test-set-human-parity/

    and

    https://www.microsoft.com/en-us/research/blog/microsoft-researchers-achieve-new-conversational-speech-recognition-milestone/

    Okay, it’s Microsoft’s own website, so maybe it’s somewhat exaggerated. I don’t know, but I don’t think they’re actually lying.

  18. I wasn’t talking about literary translation here

    No, of course not; once we start talking about literary translation, the whole idea falls apart. You can only imagine such a thing if you’re focusing on things that barely need translation, like scientific articles; in that case, sure, Google can do it, but to call that “AI” is to set the bar so low as to be meaningless.

    Do you know anything about radiology? Have you tried your hand at it? Do you expect it’s easy?

    Of course not; I simply expect that machines will be able to do it long before they can do literary translation. I could be wrong now, but I don’t think so (to quote Randy Newman).

    But a free online tool like Google Translate is better at translation than any existing AI that I am aware of is at radiology.

    Doubtless, but that’s completely irrelevant.

    Translation does not require “actual human intelligence,” anyway.

    Well, that’s what you have to believe in order to believe that AI can do it. We’ll have to agree to disagree.

  19. Why does Lameen’s post link here?

  20. Well, that’s what you have to believe in order to believe that AI can do it. We’ll have to agree to disagree.

    Oh, I don’t believe that machines will never be able to do literary translation. Humans are machines, and some of us can do it, after all. I just don’t believe that we can make such a machine (except by biological reproduction) in the foreseeable future.

  21. The idea of my original comment was to speculate that machine translation will perhaps improve the chances that languages like Basque have of surviving. I was wondering whether anyone else has any ideas about this. In the podcast, one of the problems with Basque was said to be that by the time smartphone apps and Playstation games are translated into Basque, the versions have become outdated. Computers obviously have the potential to speed up that kind of translation work considerably and one day even to do it more or less autonomously.

    “No, of course not; once we start talking about literary translation, the whole idea falls apart. You can only imagine such a thing if you’re focusing on things that barely need translation, like scientific articles; in that case, sure, Google can do it, but to call that “AI” is to set the bar so low as to be meaningless.”

    Another thing that people worry about is that even national languages like Swedish suffer domain loss because all technical and scientific writing is done in English. Here also machine translation could potentially counter this trend. If scientific articles “barely need translation,” then it makes sense, doesn’t it?

  22. Even compared with other technical texts, Wikipedia articles are not a good idea for testing translation quality, because there are often many parallel texts in multiple languages. You are basically running the algorithm on training data.

    Yet, even so, the Basque to English translation contains several errors that are trivial for a human to correct. Neither “suppositories” nor “supplements” are right, and the translation produces these two words (cognates, but not that close in meaning) where the structure of the text most naturally calls for the same word to be repeated twice. Even more glaring, the algorithm fails to translate one instance of “biokimics,” even though the very subject of the article is biochemistry.

    Of course, natural language processing is one of those areas where the human brain is exceedingly well tuned. So, by the way, is radiology. While we did not evolve directly to read x-ray films, image processing and three dimensional visualization are areas where human skill is so far beyond that of machines that it is difficult to give a useful comparison of the two.

    Doing integrals, on the other hand, is something computers have mastered. Moreover, it is not just a brute force challenge, but requires some guess and check. By twenty years ago, Mathematica could evaluate definite integrals that no human ever could. And the integration algorithm is itself written by another computer program, just like the androids in Westworld. The current integrators are still improving at a rapid rate. Expressions that were intractable a few years ago can now be simplified, and within the foreseeable future, boundary value problems for partial differential equations (which, like integrals, benefit from massive computational capacity, yet also require some art in selecting the right technique to use) may become a similarly solved domain.

  23. David Marjanović says:

    On the difficulties of machine translation, some of these items apply.

    I think podcasts basically have two audiences: pre-Clarkeans who can’t read while they travel (Arthur C. Clarke said that it would be impossible for him to own a car in its current state of evolution: the car would own him), and subvocalizers who can’t read any faster than they could read aloud.

    There’s a third: multitaskers who can listen to something and focus on it well enough while doing household chores. So, not me. I practically never listen to podcasts because I can’t do anything else at the same time – and yet they don’t occupy me enough that my mind wouldn’t wander.

    Why does Lameen’s post link here?

    Probably the <a> tag was empty.

  24. Supposing we stay away from buzzwords—what’s “AI” as opposed to any other software?—still, can computers help in language maintenance, of Basque or any other beleaguered languages?

    I imagine two scenarios, both unconvincing. One is that knowledge of majority languages (say Spanish) would be less essential, because adequate translation from the minority language (say Basque) would be instantly available, and Basque speakers would be less pressured to shift. The other is that second-language (Basque) learners could get instant translation of difficult phrases from Basque to Spanish (something like a more efficient pocket dictionary).

    Either case seems like something potentially useful, but not a game changer.

  25. Speaking of radiology, a real life familiar example of computer-assisted radiology is that of luggage radiology in airports, where software assists in coloring areas of interest in the radiogram.
    Medical radiology is different, in that the stakes are higher, and a radiologist is directly liable for anything they might miss, which is a matter of life and death far more often than in luggage radiography. The assistance is welcome, but it can’t be the last word.

  26. Yes, as long as we stick to “helpful when supervised by humans” we can go a long way with AI (if we must call it that). It’s when we start talking about equivalence with humans (or superiority) that I jib.

  27. Kristian, as a doctor (training to be a GP with a lot of emergency medicine work and exams on the side; five years post graduation) with a background in software development (first degree was very heavy on computational linguistics, which was a big AI area of interest before the AI winter of the late 80s and 90s) and multilingualism (a certain amount of actual professional translation work for money), I have a few comments on what you said:

    — The Basques are not going to stop learning Spanish, Spanish is a major world language that is a practical necessity in life in the Kingdom of Spain. Smartphone apps and biochemistry textbooks in Basque would be a nice-to-have, but they cope fine with Spanish, and better than their monolingual compatriots with English. The people who would benefit most from immediate machine translation are monoglots who speak a language that no-one wealthy speaks; and of course, monoglot Bengali peasants have a) poor software support for their languages and b) not many smartphones

    — It’s already possible to travel the world without knowing any English or foreign languages, you just need a lot of money and to be prepared to pay someone to interpret for you and/or be ripped off. The Japanese saw a lot of the world in the 1980s and 1990s!

    — It would be much more realistic to outsource radiology work to countries with a low cost of living, than for AI to take it over. It would be perfectly practical for most diagnostic reads of CTs and MRIs in my part of the world (and probably yours) to be done in Pakistan. But it hasn’t happened, and isn’t happening. Which leads to the next point:

    — Your value to the hospital (or to your health service) is in large part that you are a credentialed professional who can take legal responsibility for his interpretation of the images. No AI company is going to do that in the next three decades. Look at the computer ECG interpretations; if they say ‘Normal Sinus Rhythm’ these days it is always a reassuring finding, but they’re not standing behind it from a legal perspective. And the machines have been interpreting ECGs for decades.

    — So, yes, to be honest, saying AI is going to replace radiologists is hubris. It may increase throughput; the limited question of ‘is there a new intracranial bleed on this CT?’ would likely lend itself well to analysis by AI, for sign-off by a radiologist. But hospitals and health systems will still want a radiologist’s sign-off, certainly for the next three decades or so. And there are any number of more complicated clinical questions that won’t lend themselves to automation.

    — The paid translation I have done has been of patents, from German to English. I found corpus tools an excellent support for this, but machine translation was of limited benefit. Someone I am closely acquainted with did more general paid translation between other more minority languages, usually with involvement of English, and she tended to start from the Google Translate version and edit until she was happy with it. I get the impression a lot of paid online translation is done in this way, given the agreements she had to sign that she would not use online machine translation in her work.

    — I do the usual superficial radiology work expected of a generalist, I read my own chest X-rays and extremity X-rays, I look through the CTs to make sure there is nothing gross that I can pick up in case (God forbid) the radiologist is in the middle of a difficult divorce or something and reports a definite bleed as normal. I also do a reasonable amount of low-risk obstetric ultrasound and will pick up gall bladder stones or free abdominal fluid in the right clinical context.

    — From my perspective the work involved in the two (radiology vs. translation) is comparable, but the legal and social context is not. A little bit like when I’m home on the farm and read the vet column in the Farmers Journal, where the pathology is in large part identical and the decisions made are completely different!

    The AI people were promising something like today’s Google Translate thirty years ago, and it wasn’t their techniques that were used for it in the end. If someone is using ‘artificial intelligence’ to sell their product in all seriousness, invest in the other guy.

  28. Athel Cornish-Bowden says:

    Biochemistry is a science that investigates the chemical suppositories of living organisms.

    Most of the paragraph quoted looks fine to this non-chemist, but is suppositories correct in this context?

    As a biochemist I agree that the translation is surprisingly good. Suppositories is indeed wrong, and, more obviously, Biokimics is also wrong. I thought that the Basque word Biokimikak might be a calque of Spanish Bioquímicos (Biochemists). Apparently it’s not, but the meaning is clear. The Basque seems to be translated from the English rather than the Spanish.

    There doesn’t seem to be much correlation between the closeness of the language to English and the quality of the translation that Google Translate produces. A few years ago someone commented (here, I think) that it does a remarkably good job with Hausa, and on checking this myself I found it to be true — I don’t know any more Hausa than I do Basque (none at all), but translations from the BBC’s Hausa service are always intelligible. On the other hand Google Translate does a remarkably poor job with French and Spanish (at least, it did a few years ago when I was checking the Hausa) making elementary errors.

  29. Aidan: Very knowledgeable and convincing, thanks!

  30. The word Google translates as “suppository” and “supplement” is osagarri, “component”. The word is derived from oso “whole” and can mean “medication” or “medical treatment” as well as “element” or “component”. I’m guessing what happened here is that the algorithm has seen the word “suppository” used in human-translated sentences that have osagarri in the corresponding Basque version (because medications are sometimes delivered this way) and so it thinks “suppository” is a possible translation of osagarri.

    With “biokimics”, the problem seems to be that the Basque suffix -ak marks both absolutive plural and ergative singular. (It’s ergative in this case.) The algorithm correctly puts the translation of biokimikak at the beginning of the English sentence because it’s the subject, but it also wants the English translation to have an s on the end, so what it does here is invent a Basque-English hybrid word, “biokimics”, even though it correctly translates the absolutive form biokimika as “biochemistry” elsewhere.

    Google’s translations are getting better, and this translation is surprisingly coherent, but a fundamental problem is that, unlike a human translator, the algorithm doesn’t know what the words mean, so problems like this are inevitable even when it gets better and better at parsing syntax.

    Side note that has nothing to do with AI: The phrase izaki bizidunak “living things” used several times in the article (“beings that have life”) has the same structure as the word euskaldunak discussed in the World in Words episode, “(people who) have the Basque language”.

  31. On the other hand Google Translate does a remarkably poor job with French and Spanish (at least, it did a few years ago when I was checking the Hausa) making elementary errors.

    It got much better recently with these languages. With Spanish, it’s now so good you can safely enjoy google-translated novels.

    Example:

    – Six shields. That should be enough.
    The friar shrugged, as if to say that no amount was too much when he was handed over to a servant of God. He returned to the orphanage, where he called two other younger friars, who came to take care of the boy.
    The commissioner reassembled, but when he was going to start the old man grabbed the animal’s mouthful.
    – Wait, your honor. Who should I tell him that he is his savior, so that he may take him into account in his prayers?
    The man was silent for a moment, his eyes lost in the tenebrous streets of Seville. He was about to refuse to answer, but had gone through too many bad drinks in life, too many trials and disappointments to waste a prayer in exchange for his six shields. He turned his sad eyes to the friar.
    – Tell him to pray for Miguel de Cervantes Saavedra, the king’s supply commissar.

  32. Craig: Thanks very much for that informative comment — I love that someone who knows Basque can show up and help out!

  33. Lars (the original one) says:

    Another thing that GT cannot do yet is adjusting for variant / old fashioned spellings — as we have seen several times. I occasionally try to talk to a French person who insists on using GT and message in English, but whose spelling is such (lacking distinctions between homonyms) that the only way to make sense of the resulting hash is to guess at the pretranslated text and mentally adjust it to ‘proper’ French.

  34. marie-lucie says:

    A few years ago someone commented (here, I think) that [Google Translate] does a remarkably good job with Hausa, … On the other hand Google Translate does a remarkably poor job with French and Spanish (at least, it did a few years ago …making elementary errors).

    This could have been because of the human translators working for GT. Hausa is not commonly taught to non-Africans and as a result the translators must have been native Hausa speakers also highly educated in English. But for Spanish and (probably especially) French, the translators must have been native English speakers with a good but not native-like knowledge of those languages. If they have greatly improved the Spanish translations, they must be using better qualified translators. I can’t say anything for the French counterparts as I try not to read English-to-French translations.

  35. Craig, I wonder if you can explain the end of this paragraph from a piece about translations of Shakespeare into Basque dialects:
    It was in such a situation that Toribio Altzaga came to the fore. Like Soroa, he was born in Donostia-San Sebastian and was the founder of modern Basque theatre and managed to raise the level of the theatre in Gipuzkoa without taking away its folk character. He came up with numerous comedies in that respect and translated several more such as Pierre Loti’s Ramuntcho or [sic] Shakespeare’s Macbeth. That was undoubtedly the first version of the renowned English playwright’s work ever translated into Basque and was published under the title Irritza.

    Why Irritza? I get irritate for that, from Google Translate.

  36. According to my Basque dictionary, irrits is ‘ardent desire, passion, lust; ambition.’

  37. Passion would be a pretty good title for Macbeth.

    It’s important not to use machine translation on the outgoing side if you can help it. It’s no courtesy to someone who uses another language to try to send them a text machine-translated from your language, because you may not be saying what you mean to say. Better to write in your own language and let the recipient use human or machine translation or their own knowledge of your language.

  38. irrits is ‘ardent desire, passion, lust; ambition.’
    Aha. I knew there must be more to it. I suppose Mac- is problematic for anyone who’s having a go at translating the title.

    Passion would be a pretty good title.
    McLust

  39. There’s not going to be that kind of AI in my lifetime, and maybe never

    Given an invention that is “always 10 years away”, most experts are going to continue to say so even at the point when someone in fact already has a working prototype in their garage and is three days away from unveiling it (or, perhaps, has already unveiled it to a great reception in local papers); which is to say: “experts say is X years away” is not much more than a standardized phrase meaning “we have no idea”.

    (“I don’t think most of the discourse about AGI being far away (or that it’s near) is being generated by models of future progress in machine learning. I don’t think we’re looking at wrong models; I think we’re looking at no models. (…) In reality, the two-year problem is hard and the ten-year problem is laughably hard. The future is hard to predict in general, our predictive grasp on a rapidly changing and advancing field of science and engineering is very weak indeed, and it doesn’t permit narrow credible intervals on what can’t be done.”[1])

  40. Why did Shakespeare call it Macbeth?

    Edited on realizing that it’s very loosely based on a historical figure.

  41. Well, the title in the First Folio was “The Tragedie of Macbeth”; who knows what Shakespeare called it?

  42. Stu Clayton says:

    The name of the song is called “Haddocks’ Eyes.”‘

    `Oh, that’s the name of the song, is it?’ Alice said, trying to feel interested.

    `No, you don’t understand,’ the Knight said, looking a little vexed. `That’s what the name is called. The name really is “The Aged Aged Man.”‘

    `Then I ought to have said “That’s what the song is called”?’ Alice corrected herself.

    `No, you oughtn’t: that’s quite another thing! The song is called “Ways and Means”: but that’s only what it’s called, you know!’

    `Well, what is the song, then?’ said Alice, who was by this time completely bewildered.

    `I was coming to that,’ the Knight said. `The song really is “A-sitting On A Gate”: and the tune’s my own invention.’

  43. But of course the penultimate statement is wrong, as Martin Gardner points out: “the song really is” should be followed immediately by the Knight singing it, not by specifying yet another name.

  44. Stu Clayton says:

    “Should” shoots off on a captious tangent. Carroll is dicking around with idioms. The mutually interfering semantics can be mulled over by a prying mind, if there is such a mind on the premises.

  45. January First-of-May says:

    IMHO, there is a song reference that would fit in after “the song really is…” – the default title in the absence of others, which for a song (or poem) is well-established as the first line; in this case, “I’ll tell thee everything I can…”
    (Sure enough, the song that Alice recognizes it being the tune for is referred to by its first line.)

    Apparently some collections do refer to it as such. Wikipedia, of course, has it under “Haddocks’ Eyes”, which, I believe, is the more common option.

  46. The fascination of the non-linguists with the survival of the Basque hit a fever pitch with the recent publication of the genetic transect of ancient and modern Spain, which revealed that while most Spaniards are a product of admixture of North African and Mediterranean elements (which started a few centuries before the Romans and accelerated in Roman times), the Basque retain a formerly pan-peninsular Iron Age genetic makeup, which is itself a legacy of massive Bronze Age migration of the descendants of Steppe pastoralists. The Bronze Age migrations replaced nearly half of the region’s DNA (and almost all of its Y-chromosomal DNA, essentially obliterating the male lineages of the earlier era).
    But there is a catch. Aren’t the male-dominated migrations of the descendants of the Steppe warriors supposed to spread Indo-European languages? Why the link with the Basque?

    watch the perplexed DNA aficionados here:
    http://eurogenes.blogspot.com/2019/03/open-thread-what-are-linguistic.html

    (an added bonus: Carthaginian colonies near today’s Barcelona seem to be genetically similar to the Mycenaeans)

  47. SFReader says:

    I have toyed with the idea of a very late Indo-Europeanization of Western Europe before.

    Main idea – it started during Late Bronze Age collapse (circa 13th century BC), not much earlier. And the process was completed with the conquest of Ireland by Celts from Britain during Roman times.

    This gives us plenty of time to assign early Kurgan invaders to, say, Basques (let’s make their homeland North Caucasus as usual).

  48. David Marjanović says:

    But there is a catch. Aren’t the male-dominated migrations of the descendants of the Steppe warriors supposed to spread Indo-European languages? Why the link with the Basque?

    I’d say IE-speaking men married into Pre-Aquitanian communities one by one, without bringing their whole culture with them.

  49. I’d say IE-speaking men married into Pre-Aquitanian communities one by one
    Given that 40% of the overall gene pool has been replaced, and as much as 100% of the Y-chromosomes, it’s hard to imagine a trickle of migrant men being gradually assimilated as they move in. A better hypothesis might be an exceptional societal situation where mothers’, rather than fathers’, language is retained despite the power of the males … or, of course, a hypothesis that the Bronze-age, male-dominated migrations were what brought the Basque languages into the Pyrenees area.

  50. David Eddyshaw says:

    an exceptional societal situation where mothers’, rather than fathers’, language is retained despite the power of the males

    It’s happened pretty often in the bits of West Africa I know; ethnic group membership is patrilineal, but people grow up speaking their mother’s language. I was first given a Kusaal Bible by a Mamprussi colleague who spoke only Kusaal and no Mampruli; he was quite typical of local Mamprussi. The whole area is claimed by the Mamprussi historically, whom the British (Lord love ’em) decided were the traditional rulers after they (the Brits) annexed the region. Actually consulting the locals would only have been confusing, and would presumably have set a very bad precedent …

  51. John Cowan says:

    The whole area is claimed by the Mamprussi historically, whom the British (Lord love ’em) decided were the traditional rulers after they (the Brits) annexed the region. Actually consulting the locals would only have been confusing, and would presumably have set a very bad precedent …

    Perhaps on the contrary the Brits did consult them and got either of: “Oh yes, of course we are the traditional rulers!” or “Well, those people claim to be the traditional rulers, yes.”

  52. Probably the worst British colonial mistake was in India where they mistook the zamindar tax collectors for landowners.

  53. Trond Engen says:

    Dmitry: A better hypothesis might be an exceptional societal situation where mothers’, rather than fathers’, language is retained despite the power of the males … or, of course, a hypothesis that the Bronze-age, male-dominated migrations were what brought the Basque languages into the Pyrenees area.

    Thought-provoking.

    I don’t remember the DNA stuff in enough detail to give dates and percentages, but recent papers have shown that there was a genetic switcheroo on the Steppes. The original Pitgravian Kurganists were R1b. R1a were at the time in the Old European cultures of Poland and thereabouts. When Steppe populations moved into Globular Amphora and formed Corded Ware, they incorporated some R1a, and when Corded Ware expanded, the branch spreading north and east through Russia, and eventually back to the Steppe, was predominantly R1a, while the branch staying in Northern Central Europe and going westwards and contributing to Bell Beaker was R1b. It’s almost tempting to propose that Yamnaya was Pre-Proto-Basque and Indo-European came from Globular Amphora. Except that this will hardly work for Balkan IE and not at all for Anatolian.

    But the standard story doesn’t work for Anatolian either. So maybe Yamnaya was a multilingual confederacy. Basque may have come from the original Steppe population and Indo-European from the Majkop element. With time and drift the two got sorted apart, with the purely R1b Basque-speaking clans becoming dominant in westernmost Europe and the somewhat more diverse Indo-Europeans gaining advantage in central and eastern parts.

    I’m generally more eager to see Pre-Proto-Basque as the lingua franca of the Atlantic coast. Could the Indo-European intruders have taken over the trade networks and switched to the trade language in the process? But how could that lead to a complete genetic replacement in the male line?

  54. Stu Clayton says:

    By a simple process: the local women prefer to marry the powerful intruders, leaving the local male gene pool behind. The train switched lines, but still tooted in Pre-Proto-Basque.

  55. Trond Engen says:

    Yes, but that’s a preference. A preference is probabilistic and shouldn’t lead to 100% replacement in a large population.

    Maybe Proto-Basque/Acquitanian(/Iberian) survived the Indo-European intrusion as the lingua franca of the trade on the Atlantic coast and across the Aude-Garonne isthmus, and later replaced the language of the invaders along both coasts of Iberia. That might be similar to how Finnic established itself on the Baltic coast after Indo-European Corded Ware.

  56. David Marjanović says:

    Except that Finnic came in from the east after the Corded Ware.

  57. Trond Engen says:

    Me: A preference is probabilistic and shouldn’t lead to 100% replacement in a large population.

    100% replacement across the population, not only on elite level, is incredibly brutal. After killing, castrating or enslaving all males*, the conquerors install themselves as masters of every household. I was going to say that I can’t see that happening without a dominance that must lead to language replacement, but actually I can. One way would be if the conquering males came, killed, castrated or enslaved all the men and boys, raped the women, and left. Repeat for a couple of decades, and there’s not an indigenous Y-chromosome left. Another way would be if the indigenous population were kept as chattel slaves until they managed to revolt or the conquerors gradually loosened their grip. Or the incomers may have brought a contagious disease that disproportionally affected the males, but I can’t see that either yielding 100% replacement without help from a brutal policy.

    *) Viricide? Maricide?

  58. SFReader says:

    Island Carib language is such an example. Caribs from mainland invaded the islands, killed (and possibly ate) all the native male Arawaks taking their women.

    Result – Island Carib language which is actually Arawakan.

    A few centuries later history repeated itself: escaped black slaves replaced all male Caribs and took their women, resulting in the Garifuna people – African in appearance, but native speakers of an Arawakan language called Garifuna (Carib).

    Two nearly total replacements of the entire male population, but the language still manages to survive.

  59. Trond Engen says:

    David M.: Except that Finnic came in from the east after the Corded Ware.

    Yes, I made my point badly. The Eastern Baltic probably became IE with Corded Ware. Finnic was the language of the Volga-Ural bronze trade and replaced it with little genetic change in the regions most intimately connected to the trade. Iberia probably became largely IE with Bell Beaker. Proto-Baltic-Acquitanian was the language of the Atlantic tin (etc.) trade and replaced it with even less genetic change in the regions most intimately connected to the trade. In this scenario the languages spoken in Iberia before the arrival of IE may or may not be related to Basque.

  60. Stu Clayton says:

    Trond:
    Me: By a simple process: the local women prefer to marry the powerful intruders, leaving the local male gene pool behind.
    You: Yes, but that’s a preference. A preference is probabilistic and shouldn’t lead to 100% replacement in a large population.

    By “they prefer to marry” I meant “they all in fact married” – I wrote “prefer to”, but I could as well have written “were forced to”.

    My short post was obviously snarky. But, leaving that aside, I don’t at all see your point about “probabilistic”. The whole discussion above is as full of speculation as a Christmas turkey of stuffing. Speculation means “could be, could not be”, depending on the evidence adduced and how it is interpreted – and thus is inherently “probabilistic”. Why are blog posters allowed to be “probabilistic”, but not those about whom they speculate – “males”, “mothers” etc ? My “local women” is no different from these categories to which others refer above.

  61. Trond Engen says:

    Right. I took your snark to imply “Women have free will. Nothing much to explain.” By probabilistic I meant to add that a preference in its normal sense is a tendency. It could be a strong tendency, but it’s still just moving the point of equilibrium in the gene pool. Add random selection processes and variation will decrease in small populations over time. But it’s not enough to explain a full and sudden replacement of the male line in a large region like Iberia. In other places, where the replacement was less thorough, the mechanism may well have been (some definition of) preference.

    Also, there are lots of ugly scenarios that can explain a full and sudden replacement of the male line. The difficult thing to explain is full replacement without language shift. I came up with a couple of ugly scenarios for that too, but I agree that they are speculative.

  62. John Cowan says:

    Well, our native language is called our mother tongue for a reason. 100% replacement of males seems like it should almost never lead to language shift.

  63. Trond Engen says:

    If this was about a local community, I think there are examples both ways, but this is about foreign males installing themselves as a dominant elite and the only reproducing males in a large region. I’d expect clear advantages in knowing the men’s language as well as a huge difference in power between sexes and communities.

    Contrast this with the situation not many generations earlier, when IE speakers came to Globular Amphora and formed Corded Ware. I don’t know how different the power structure was, but there was far from 100% genetic replacement of the male line. They even got genetically outnumbered on the elite level by local males in what was to become a ring-migration back to the Steppe.

  64. A culture in which the males are traveling raiders, traders, or herders (away from home a significant fraction of the year) while the women stay in place and farm would naturally lead to the mothers’ speech becoming the surviving standard. That kind of male lifestyle is also one that could easily support a large but entirely male group moving in and conquering an area.

  65. David Marjanović says:

    100% replacement across the population

    It says “nearly 100%” in the abstract, so it’s entirely possible that the original influx of Y chromosomes was much less than that, and the minority haplotypes then died out stochastically as expected in a population that doesn’t grow too fast. In keeping with this, the total replacement “of Iberia’s ancestry” was just 40%, not 50.
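
    As a rough illustration of the stochastic-loss argument (not taken from the paper), here is a minimal neutral Wright-Fisher sketch in Python. The population size, the 80% starting share and the number of runs are hypothetical values picked only to make the point; the function names are likewise made up for this example.

```python
import random

def incoming_fixes(n_males, p_incoming, rng):
    """Run one neutral Wright-Fisher lineage until one haplotype class is lost.

    Returns True if the incoming haplotype reaches 100%, False if it dies out.
    """
    count = round(n_males * p_incoming)
    while 0 < count < n_males:
        p = count / n_males
        # Each son draws his father at random from the previous generation.
        count = sum(rng.random() < p for _ in range(n_males))
    return count == n_males

def fixation_share(n_males=50, p_incoming=0.8, runs=200, seed=1):
    """Fraction of simulated populations in which the incoming haplotype takes over."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    wins = sum(incoming_fixes(n_males, p_incoming, rng) for _ in range(runs))
    return wins / runs
```

    Under purely neutral drift the chance that a haplotype class eventually takes over equals its starting share, so a large majority usually wins without any selection at all. The catch is speed: in a big population this takes far longer than in the toy population of 50 used here.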

  66. David Eddyshaw says:
  67. Trond Engen says:

    I should of course have read the full paper before speculating, but I haven’t been able to google up a preprint. I did read the abstract, though. True that the 40% leaves room for a few local males, so not necessarily full replacement of males except in the straight male lines, but still very close*. Maybe more importantly, I think my description of the replacement as sudden may be wrong. I first read the abstract as stating a sudden replacement at ca. 2000 BCE, but it just says it started some time after 2500 BCE and was completed by 2000 BCE.

    *) It’s dangerous making off-hand mathematical statements in this company, but the percentage is actually very high. If the expansion south- and westwards into Iberia happened gradually, each generation of males would carry less Steppe ancestry, even if their Y chromosomes were pure R1b. I see two ways the percentage could reach 40. (1) A significant number of the intruders were women, and the sons and daughters of marriages to IE women had a reproductive advantage for some time. This would speak for an IE language community with power and prestige. (2) A stream of new arrivals from up north until the whole peninsula was conquered. This would suggest continuous cultural contacts with the cousins back home. (My first thought was (†) The percentage is so high it could be within what would be expected if boys were murdered and girls allowed to grow up at the time of a single takeover event, but now I think I got that wrong.)

  68. @Trond
    Maybe Proto-Basque/Acquitanian(/Iberian) survived the Indo-European intrusion as the lingua franca of the trade on the Atlantic coast and across the Aude-Garonne isthmus, and later replaced the language of the invaders along both coasts of Iberia.

    The situation with the “/Iberian” makes me think that the unusual resilience / matrilocality of the Basque had nothing to do with the fact that Basque “wasn’t replaced by the Indo-European languages” back in the Iron Age.

    Iberian is also non-IE, and somewhat related to the Basque, and it was spoken all across the Med coast for about 1300 years after the same wholesale DNA change as transpired in the Basque lands. Yet Iberian readily gave way when Carthaginians and Romans moved in, and was extinct by II c. CE.

    If Iberian predated the wholesale R1b / Steppe-origin DNA replacement but survived because the society was uniquely poised to withstand the IE onslaught which replaced local languages everywhere else in Europe (and indeed in much of Asia) … then why did it crumble so fast during cross-Mediterranean colonization?

    I would rather suggest that BOTH Iberian and Basque were invasive, and indeed related. A relative safety in the mountainous refugia may have helped Basque survive…

    just 40%, not 50
    ditto R1b, reaching “just 80%” (fully consistent with 40% overall DNA). But from the publication it appears that the shift to R1b took a mere century. It is a bit too abrupt a change for a gradual process of selection through bridal preference of more powerful non-local males. Such preference would have needed to have a nearly 2-fold bias in each of the subsequent generations against local males to have reached 80% replacement in a century.
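
    A minimal sketch of the arithmetic behind that "nearly 2-fold bias" remark, assuming a constant per-generation reproductive advantage for the incoming male lineage and taking a century as roughly four generations. The starting shares used below are hypothetical (the comment does not state one), and the function names are invented for this example.

```python
def required_bias(p_start, p_end, generations):
    """Constant per-generation advantage needed to move the incoming
    lineage from share p_start to share p_end."""
    odds_start = p_start / (1 - p_start)
    odds_end = p_end / (1 - p_end)
    # A constant advantage multiplies the odds by the same factor each generation.
    return (odds_end / odds_start) ** (1 / generations)

def share_after(p_start, bias, generations):
    """Share of the incoming lineage after applying the bias repeatedly."""
    p = p_start
    for _ in range(generations):
        p = bias * p / (bias * p + (1 - p))
    return p
```

    With these assumptions, going from a 20% share to 80% in four generations needs a bias of 2 per generation, while starting from an even 50% split needs only about 1.4. The figure is therefore quite sensitive to how many incoming males one assumes at the outset.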

  69. Reading the Olalde et al 2019 paper more thoroughly, I see an additional piece of evidence against “unusual linguistic resilience of the Basque” hypothesis.

    It turns out that some Basque territories experienced an additional pulse of invasion of peoples originating far to the North-East, of a comparable albeit slightly smaller DNA magnitude – and ended up changing their language. It happened in the middle of 1st millennium BC (even before the Iberian non-IE languages of the Mediterranean coast started giving way to invasions and colonization there).

    The locality in question is in Rioja, in the Southern part of today’s Basque Country, and the new language and culture in question is Celtic / Celtiberian. In the IV c. BC it was a frontier town of the Celtiberians, who took this area from the Basques and built hilltop fortifications all across it. The IV c. remains tell the story of another partial population turnover, with about 30% of the earlier-age DNA being replaced (compared to about 40% replacement during the Bronze Age migrations). The incoming DNA is derived from the Urnfield / Hallstatt Early Celtic peoples of North-Central Europe (consistent with the linguistic and archaeological evidence). (Fig. S6 in the paper’s supplements).

    So it looks like a similar but smaller population influx resulted in the predecessors of the Basques in the Rioja region changing their language to the language of the invaders (a Celtic one). It would be reasonable then to assume that an even larger population turnover in the Bronze Age also resulted in a linguistic change (to the invasive Aquitanian / Basque / Iberian languages).

  70. Trond Engen says:

    If the genetic change in Iberia took a century, that’s four generations. Let’s divide Iberia into five zones of equal population. In the first zone there’s a 100% efficient monopolization of fatherhood. This means that the next generation is genetically 50% NEW, 50% OLD, and 100% R1b. Across Iberia it’s 10% NEW and 20% R1b.

    The children of those who stay and reproduce in the first zone remain 50/50 NEW/OLD. Those men who proceed into zone 2 and monopolize fatherhood beget a generation who are 25% NEW and 100% R1b. At this point the last generation across Iberia is 15% NEW and 40% R1b.

    From here there are several routes, but let’s consider a situation where degree of NEW DNA means nothing and R1b everything for the chance of procreation. Let’s also say that the share of young men moving on is the same as for generation 1. Those now monopolizing fatherhood in zone 3 and 4 will be 37.5% NEW. Their offspring will be 18.75% NEW and 100% R1b. Iberia at large is now 22.5% NEW and 80% R1b.

    What we do see is 80% replacement of the male ancestry and 40% of the total ancestry. This would be the case if the intruders took zone 1-4 in one single sweep or, equivalently, the whole of Iberia with 80% efficiency. No women brought along, just near-monopolizing fathering of children.

    That’s one big sweep. For a two to four step scenario to work, there must also be NEW women, and the smaller the share of NEW women, the bigger must the advantage of the male offspring of NEW women have been. The balance share without any advantage of being born from a NEW mother is somewhere around 15%.

    This says nothing about language except that a high share of women among the newcomers would mean that both parents in elite families spoke the intrusive language.
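
    The five-zone arithmetic above can be checked with a short Python sketch. It is only one reading of the scenario: it assumes all mothers are 100% OLD, that every conquered zone becomes 100% R1b, and that each new wave of fathers is an even mix of the zones already held (which is what gives the 37.5% figure). The function name and the `schedule` argument are made up for this illustration.

```python
from fractions import Fraction

def zone_model(schedule, n_zones=5):
    """Return (NEW autosomal share, R1b share) across all zones after each wave.

    schedule lists, per generation, the zero-based zones taken over in that wave.
    """
    new = [Fraction(0)] * n_zones   # NEW ancestry of the latest generation per zone
    taken = [False] * n_zones       # True once fatherhood in a zone is monopolized
    fathers_new = Fraction(1)       # the first wave of fathers is fully NEW
    totals = []
    for wave in schedule:
        for z in wave:
            new[z] = fathers_new / 2    # children: half father, half OLD mother
            taken[z] = True
        totals.append((sum(new) / n_zones, Fraction(sum(taken), n_zones)))
        # Assumption: the next wave is an even mix of men from all zones held so far.
        held = [new[z] for z in range(n_zones) if taken[z]]
        fathers_new = sum(held) / len(held)
    return totals
```

    Calling `zone_model([[0], [1], [2, 3]])` reproduces the stepwise figures (10%/20%, then 15%/40%, then 22.5%/80%), while `zone_model([[0, 1, 2, 3]])` gives the single-sweep case of 40% NEW and 80% R1b that matches the reported numbers.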

  71. Dmitry Pruss says:

    @Trond – figure S7 in the supplements attempts to add data from the X chromosomes to answer the question, what fraction of one’s ancestors were “local” vs. “migrant” on paternal and maternal lines. The most probable answer is, 3/4th of the male ancestors were migrant and almost 100% of the female ancestors were local.

  72. Trond Engen says:

    Thanks. Not too far off then. The difference is differential advantage and other complications I chose to overlook. And it probably happened in one big sweep.

    Is the paper available somewhere?

  73. http://science.sciencemag.org/content/sci/suppl/2019/03/13/363.6432.1230.DC1/aav4040_Olalde_SM.pdf
    ( supplement for paper DOI: 10.1126/science.aav4040 ) not sure about paywalls but there are always solutions like sci-hub in desperate situations?

  74. John Cowan says:

    A relative safety in the mountainous refugia may have helped Basque survive

    Like the languages and families of the Caucasus, and for the same reason.

  75. The La Hoya Celtiberian site (where the IV c. BC remains described above came from) offers a unique archaeological opportunity because, unlike most of the fortified Celtiberian settlements, it isn’t underneath a modern town. Almost all Celtiberian sites remain occupied to this day, but the residents of La Hoya moved to a more defensible higher hill nearby some 2300 years ago.
    https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0155342
    But one more thing is weird about this place. Apparently not all of the human remains were cremated, as is supposed to be customary in the Celtic culture. Instead, dead children were buried under the eaves of the houses. Hence the DNA. I wish I could tell to what extent this burial practice is compatible with the hypothesis that the skeletons belonged to the Celtiberians rather than to some subjugated population of a different culture…

  76. David Marjanović says:

    Caucasus, and for the same reason

    Aquitania is obscenely flat, and place names make clear that Basque was expanding southwards in the early Middle Ages.

    Anyway, I’ve downloaded the paper, supp. inf. and “Perspective” article of Olalde et omnes (2019), and will read them… maybe on the weekend.

  77. Trond Engen says:

    The figure on page 86 in the supplementary (thanks, X!) seems to say that 46 out of 47 Y chromosomes from Bronze Age and Iron Age Iberia were R1b. That’s a lot.

    The intrusive population used in the models was German Beaker rather than something closer. This might mean that the estimate of 40% overall is too low. This would also move the 80% estimate for male ancestry closer to 46:47.

    I agree that the linguistically most plausible scenario is that the invaders were Ibero-Acquitanian. This forces me to consider that Bell Beaker was Ibero-Acquitanian, which smoothly leads back to the idea that the Steppe people forming Corded Ware were Ibero-Acquitanian, and that their Globular Amphora neighbours — who after being assimilated started spreading R1a east and south to India — were the original IE speakers. That still doesn’t work, but the swap of Y-chromosomes from Globular Amphora to Corded Ware could still be a clue. What if we imagine a peace agreement between CW and GA with a grand exchange of hostages/foster sons? You raise all our sons and we raise all yours. The R1b sons moving into Globular Amphora homes and eventually marrying Globular Amphora daughters would become fluent speakers of Ibero-Acquitanian, while the R1a sons living in Corded Ware homes and eventually marrying Corded Ware daughters would become fluent speakers of Indo-European. The husband eventually took over his family’s operations, and the two similar but different hybrid cultures spread in different directions.

    On a different note: The scale of the operation in Iberia is unfathomable. Both faster and deeper than anything the Romans did. I wonder what society would be like to produce such a surplus of men.

  78. David Marjanović says:

    Every man either acquires a harem or joins a war band?

  79. Apparently not all of the human remains were cremated
    I couldn’t locate the cited paper on the indoor infant burials of La Hoya, but there is a more recent research paper describing child burials in a mid-1st millennium BC fortified town in nearby Berbinzana, Navarre, also just North of the Ebro river, in historical context.
    https://dadun.unav.edu/bitstream/10171/17736/1/08.%20de%20Miguel.pdf
    The final pages of the paper extensively discussed the prehistoric dual burial system (cremation for older people, in-the-house burial of infants) documented at numerous sites both across the Celtiberian frontier and in the more central Spain location, and described by Pliny the Elder who explained that cremation was reserved for those who died after their teeth came out.

  80. dual burial system (cremation for older people, in-the-house burial of infants)
    And it turns out to be a wider Celtic rite, not just Celtiberian. La Tene period Celtic dwellings in Austria also have indoor infant burials, and across much of Europe, cremated remains of younger children are conspicuously lacking, too.
    https://balkancelts.wordpress.com/tag/celtic-burial-ritual/

  81. On the “improper” infant burial customs, I also came across this 2011 study which reviews previous archaeological finds and theories about them:
    https://www.academia.edu/520416/Thrown_out_with_the_bathwater_or_properly_buried._In_M._Lally_A.M._Moore_eds._Re_Thinking_the_Little_Ancestor_New_Perspectives_on_the_Archaeology_of_Infancy_and_Childhood._Oxford_Archaeopress_2011_37-46
    Verbatim: “In the Central European Iron Age, infant burials are largely absent from regular cemeteries, while infant skeletons frequently appear in settlement contexts”. The authors attempt to reconstruct the Iron Age belief systems regarding infant burial by anchoring it in the ethno-linguistic context, and begin by noting the non-random location of the infant burial sites within the settlements (near walls and boundaries), and the geographic distribution of the known high density infant burials (which to their knowledge didn’t include Celtiberian Spain, but definitely included Hallstatt / La Tene-related sites in Germany, Switzerland, France and the British Isles). They cite sources from Livy to Caesar to the Celtic texts on the special / not quite human status of the uninitiated boys and noídenacht (infants who didn’t yet speak), and speculate about “liminality” status of the infants whose status was “at the boundary of the humanity” and whose burial places might have been therefore selected at the boundaries of the family properties. A rather high flying hypothesis, but a wide collection of cited sources makes it a worthwhile link IMO.

  82. Trond Engen says:

    Good job on the Celtic archaeology. My thought about children being buried under the eavesdrop was that they would need to be close to their living relatives even after death. Which is just another way to say that the family would want them close.

    Anyway, it’s the kind of significant custom that would be defining a culture. It’s very interesting when there’s overlap also with linguistic/onomastic evidence and now genetics. It adds predictive power and falsifiability. (It’s of course even more interesting when things don’t fit and there’s not perfect overlap. But the patterns in overlap help to understand also the exceptions.)

  83. The problem with learning about the modern state of Celtic archaeology is that the stuff used to be purely the domain of romantic nationalist fantasy, and to a large degree remains there. It’s like getting some real modern science about antioxidants or oxytocin … an unsophisticated search leads to heaps of legends, and it’s hard to see the actual data behind it.

  84. Athel Cornish-Bowden says:

    Island Carib language is such an example. Caribs from mainland invaded the islands, killed (and possibly ate) all the native male Arawaks taking their women.

    Result – Island Carib language which is actually Arawakan.

    I don’t think David Kleinecke comes here, but I know him from alt.usage.english and sci.lang. He is long since retired, but in his younger days he studied Arawak languages. I had supposed that the Arhuacos of the Sierra Nevada de Santa Marta in Colombia were Arawaks, but he says no, they are quite different and the language of the Arhuacos is not Arawak, and, indeed, not related. However, the Caribs you refer to may indeed have come from the mainland, but not from Colombia.

  85. David Marjanović says:

    Anyway, it’s the kind of significant custom that would be defining a culture.

    It’s also the kind of cultural trait that could spread easily independently of any others, I would guess.

  86. Trond Engen says:

    Some would think this, some would think that. I’m pretty sure non-elite burial customs are resilient to diffusion. But my main point is that when cultural traits are held together with linguistic and genetic evidence, we should be able to tell with much more certainty which traits are spreading independently and which are not.

  87. I was after something more modest… not a culture-defining trait but merely culturally compatible. Since I was so certain that cremation was the way for the Celts… yet only the remains which escaped cremation yielded DNA. But in the end I came to believe that material culture, language, DNA and indeed infant burial customs all came to the Iberian peninsula together, and from a more Central European location.

  88. Savalonôs says:

    Basque may have come from the original Steppe population and Indo-European from the Majkop element.

    Excellent. I speculated up a similar model in a conversation elsewhere a few weeks ago. This was actually before I had seen the Olalde paper. There was a paper recently by John T. Koch and I’ve been playing around with ideas related to his model. Koch argues for an early PIE (common ancestor of Anatolian and other IE branches) urheimat south of the Caucasus. The primary evidence is the (still extremely tentative) apparent lack of steppe ancestry in putative Hittite speakers.

    I speculated that, in this scenario, the earliest identifiable IEs might be identified with the ETC a.k.a. Kura-Araxes culture. Maĭkóp seems like the most likely candidate for the intrusion of this language family into the vicinity of the steppes, which would necessitate that earlier steppe populations had a different language. This would then lead to Yamnaya as multilingual confederacy.

    Of course, a Vasconic invasion hypothesis for Iberia doesn’t require that Vasconic goes all the way back to Yamnaya. Any earlier event in which steppe-descended peoples entered a new area in Europe without extreme population turnover could have resulted in them adopting Pre-Vasconic language at that point.

  89. Savalonôs says:
  90. David Marjanović says:

    That’s not a paper, it’s the slides of a conference presentation. However, clicking through to John Koch’s academia.edu page brings up a lot of interesting papers.

    One argument against a late arrival of Basque is the weird substrate layer in Sardinian that contains words practically identical to Basque.

  91. Trond Engen says:

    Dmitry: I was after something more modest… not a culture-defining trait but merely culturally compatible.

    Yes, I understood. It was fun to follow.

    Savalonôs: I speculated up a similar model in a conversation elsewhere a few weeks ago. […] Koch argues for an early PIE (common ancestor of Anatolian and other IE branches) urheimat south of the Caucasus. The primary evidence is the (still extremely tentative) apparent lack of steppe ancestry in putative Hittite speakers.

    I’ve been speculating about Maikop since, well, probably Mallory & Adams. My other alternative scenarios have been dismissed one by one by evidence, but Majkop’s only become more interesting. Though it’s probably a little early to dismiss Steppe ancestry for the Hittites after only five not-necessarily-Hittite-by-ancestry genomes.

    I speculated that, in this scenario, the earliest identifiable IEs might be identified with the ETC a.k.a. Kura-Araxes culture. Maĭkóp seems like the most likely candidate for the intrusion of this language family into the vicinity of the steppes, which would necessitate that earlier steppe populations had a different language. This would then lead to Yamnaya as multilingual confederacy.

    Maybe or maybe not. It could well have been thoroughly Maikopified.

    Of course, a Vasconic invasion hypothesis for Iberia doesn’t require that Vasconic goes all the way back to Yamnaya. Any earlier event in which steppe-descended peoples entered a new area in Europe without extreme population turnover could have resulted in them adopting Pre-Vasconic language at that point.

    The best candidate for that so far is Corded Ware, but I’ll be eager to see a population turnover analysis for France and neighbouring regions in the 3rd millennium BCE. We discussed Olalde 2018 last year. It’s the paper that said that the Neolithic population in Britain and Ireland was almost completely wiped out by Bell Beaker people from Holland or thereabouts. With both extremes covered, I guess they’re building up to a sweeping analysis of Central Europe.

    Here’s the Koch paper

    Thanks. His name is credentials enough for me.

    I’m open to much of it, and would have accepted even more before Olalde 2019. He’s fuzzy on membership in the Satem group, and that’s sensible enough. He doesn’t mention the supposed Graeco-Armenian branch, and that’s also sensible, since otherwise the Ringe-Mallory tree can’t be forced into this geographical straitjacket. I note that the “Celtic” Atlantic Bronze Age is long after the population turnover event in Spain, so it’s not what brought R1b to Iberia. It looks more like a later development or a survival/rebound that failed to wipe Vasconic out from its remaining core area. 1300 BCE is probably also too early for common Celtic, so this must have been Para-Celtic or Para-Italo-Celtic of some kind, if IE at all. Maybe it’s Tartessian. (Also, I should have remembered Koch for his paper on Tartessian a few years ago.)

  92. Trond Engen says:

    David M.: One argument against a late arrival of Basque is the weird substrate layer in Sardinian that contains words practically identical to Basque.

    Just as I was thinking about how to think about that, The Source of All Wisdom spake thus:

    The Beaker culture in Sardinia appeared circa 2100 BCE during the last phase of the Chalcolithic period. It initially coexisted with and then replaced the previous Monte Claro culture in Sardinia, developing until the ancient Bronze Age circa 1900–1800 BCE. Then, the Beaker culture mixed with the related Bonnanaro culture, considered the first stage of the Nuragic civilization.

  93. David Marjanović says:

    Though it’s probably a little early to dismiss Steppe ancestry for the Hittites after only five not-necessarily-Hittite-by-ancestry genomes.

    There’s no trace of Steppe ancestry anywhere else in or around Anatolia either up to the Bronze Age.

    Maybe it’s Tartessian.

    Or Lusitanian, whatever that is. 🙂 It would also be great to have longer Pictish inscriptions.

    The Beaker culture in Sardinia

    Nice! Still, it’s noteworthy that the Sardinians (unlike the unremarkable Basques) are the purest descendants of Early European Farmers that can be found today.

  94. @Savalonôs, Koch also promotes an idea of proto-Iberian/Aquitanian/Vasconic influences which distinguish Celtic from other IE branches. Of course Koch didn’t have the advantage of the more recent genetic data. Having to operate on the assumption that Basques were 100% autochthonous / unmixed / developed in situ, he in turn had to hypothesize that Celtic languages spread FROM Spain TO Austria.
    Now we know that early Basques and Iberians were a roughly 50:50 mix of Steppe-origin males and local females, and it is therefore not unreasonable to assume that their languages were intrusive, perhaps from more Central European locations, or possibly even from the Steppes; so Koch’s hypothesized interactions between proto-Celtic and proto-Vasconic may have occurred outside of the Iberian peninsula.
    We also understand that the spread of Celtic languages between Iron Age Spain and Austria went from East to West, so this proposed early interaction between Celtic and Vasconic could not have occurred “in the Atlantic West”.

  95. in this scenario, the earliest identifiable IEs might be identified with the ETC a.k.a. Kura-Araxes culture. Maĭkóp seems like the most likely candidate for the intrusion of this language family into the vicinity of the steppes, which would necessitate that earlier steppe populations had a different language. This would then lead to Yamnaya as multilingual confederacy
    There is also an insurmountable DNA problem around it, a far bigger one than the problem of the absence of Steppe DNA in the few Hittite-era remains, who were probably Hittites but may have been someone else.

    In comparison, a great many DNA samples from Yamnaya and their descendants have been studied. And they don’t have an Anatolian Neolithic DNA component, which pretty much rules out either Maikop or Kura-Araxes (on the latter two, there are extensive data in Wang 2018, doi 10.1101/322347, I will add a link a bit later to avoid moderation queues 🙂 ).

    I already mentioned some of its findings here at LH but obviously in some discussion which dwelled on IE more than on Vasconic 🙂 although we now have a great reason to revisit the Pontic Steppe as a possible home of proto-Ibero-Vasconic, if such a thing existed. In Wang 2018, the genetic antecedents of Yamnaya were found in the mound burials of North Caucasus foothills, in such locations as Progress and Vonyuchka. It caused me many smiles to think that Vonyuchka may yet become a household name as a cradle of the Indo-Europeans, because the meaning of the name, “Little Stinker”, is just too funny. The place did stink too much, it’s just outside a fence of a sewage treatment station.

  96. Savalonôs says:
  97. Trond Engen says:

    David M.: There’s no trace of Steppe ancestry anywhere else in or around Anatolia either up to the Bronze Age.

    That’s a good point. Though we might think the Hittite elite were wiped out.

    Dmitry: In comparison, a great many DNA samples from Yamnaya and their descendants have been studied. And they don’t have an Anatolian Neolithic DNA component, which pretty much rules out either Maikop or Kura-Araxes

    As it stands now, we have to accept language spread by cultural diffusion one way or the other. There was more cultural diffusion northwards than southwards through the Caucasus.

    (on the latter two, there are extensive data in Wang 2018, doi 10.1101/322347, I will add a link a bit later to avoid moderation queues 🙂 ).

    Wang et al 2018, Abstract:

    Archaeogenetic studies have described the formation of Eurasian ‘steppe ancestry’ as a mixture of Eastern and Caucasus hunter-gatherers. However, it remains unclear when and where this ancestry arose and whether it was related to a horizon of cultural innovations in the 4th millennium BCE that subsequently facilitated the advance of pastoral societies likely linked to the dispersal of Indo-European languages. To address this, we generated genome-wide SNP data from 45 prehistoric individuals along a 3000-year temporal transect in the North Caucasus. We observe a genetic separation between the groups of the Caucasus and those of the adjacent steppe. The Caucasus groups are genetically similar to contemporaneous populations south of it, suggesting that – unlike today – the Caucasus acted as a bridge rather than an insurmountable barrier to human movement. The steppe groups from Yamnaya and subsequent pastoralist cultures show evidence for previously undetected Anatolian farmer-related ancestry from different contact zones, while Steppe Maykop individuals harbour additional Upper Palaeolithic Siberian and Native American related ancestry.

    the Pontic Steppe as a possible home of proto-Ibero-Vasconic

    Wouldn’t it be fun if the Iberians were all related after all? I went looking for a Steppe element in the Georgians, but not much luck.

    In Wang 2018, the genetic antecedents of Yamnaya were found in the mound burials of North Caucasus foothills, in such locations as Progress and Vonyuchka.

    We did discuss that. You also found some Russian papers that discuss the archaeology of the sites in much more detail.

  98. Where’s a reference for the “Basque-looking” Sardinian substrate?

  99. Trond Engen says:

    here’s the real Koch article

    Thanks. It’s updated with the results of Olalde 2019. On Celtic and Basque:

    The aDNA evidence from the Iberian Peninsula—specifically a widespread low level of the Steppe component with a strong male bias—is consistent with a scenario of substratum influence from the language of mothers. The pattern reflects a situation in which successive generations of men with Steppe DNA were exceptionally successful in producing offspring with indigenous Iberian women, who probably spoke an indigenous non-IE language or languages with a consonant system similar to that of Palaeo-Basque.

  100. men with Steppe DNA were exceptionally successful in producing offspring with indigenous Iberian women, who probably spoke an indigenous non-IE language or languages with a consonant system similar to that of Palaeo-Basque.

    Huh? How do they know from DNA what language someone’s mother spoke, never mind what was its consonant system? Is this another ‘vegetables cause labiodentals’ hypothesis?

    Do we have archaeological evidence of the speakers of Paleo-Basque? Did they scrawl on potsherds or weave into fabrics ‘Paleo-Basque is what I speak’? Were they in Iberia at the time?

  101. David Eddyshaw says:

    @AntC:

    The “probably” (which is admittedly doing a lot of work) is, to be fair, likely not intended to be dependent on the DNA but on extraneous considerations.

    The lesson that language has no necessary connexion at all with racial origin is surprisingly difficult to accept, despite the fact that the language spoken by millions of black Americans and, before that, by most of the population of Britain (who were never massacred by the Saxon, regrettable though he be in other respects) demonstrates the truth of it.

    (Some) languages are contagious. English as smallpox.

  102. Thanks but hmm. Would it be too cynical to suggest “probably” here means “we’re just making stuff up”?

    Is Paleo-Basque/Aquitanian/Iberian/Tartessian just a cover term for ‘whatever was spoken in Iberia before IE got there’? The cave painters in France/Spain are pre the time of Paleo-Basque? What language did they speak? Can we trace DNA influence amongst their consonants?

    Did the Paleo-Basque speakers migrate into the peninsula? Where from? Did they peaceably blend in with the cave painters or massacre them or what?

    Were there humans (Neanderthals?) in the peninsula before the cave-painters? Any traces of their DNA or consonants?

  103. Trond Engen says:

    My quote from Koch’s concluding paragraphs was of course unfair. It’s argued more thoroughly in the paper.

    On the genetic side, this part is the result of Olalde et al:

    a situation in which successive generations of men with Steppe DNA were exceptionally successful in producing offspring with indigenous Iberian women

    On the linguistic side, this part is Koch’s well-argued hypothesis on the origin of Celtic:

    substratum influence from […] a[…] non-IE language or languages with a consonant system similar to that of Palaeo-Basque.

    This part is meant to bridge the gap between the two:

    mothers […] who probably spoke an indigenous non-IE language

    But this is not the only way. Dmitry suggests instead that it’s the incoming fathers who were Paleo-Basque speakers. Celtic would have come in later, with much less population turnover.

    David E.: The lesson that language has no necessary connexion at all with racial origin is surprisingly difficult to accept

    True, but on the other hand, the chance that a child grows up to speak the language of its parents and continue their culture is more than 1 : 6000. And on the third hand, actual abrupt and massive migrations and population turnovers do have linguistic consequences. The millions of black Americans speak neither Niger-Congo nor Muskogean today (exceptions excepted). The wave of massive migration and population turnover in Europe in the 3rd millennium BCE also had linguistic consequences. Some cultural waves with little genetic change probably also had linguistic consequences. We’ll never be able to tell which consequence in every case, and a lot of interesting details and complications will never be recovered, but with archaeo-genetics added to historical linguistics and archeology we’re getting much closer.

  104. a situation in which successive generations of men with Steppe DNA were exceptionally successful in producing offspring with indigenous Iberian women
    Using Occam’s razor one can make something simpler out of this convoluted phrase which is meant to leave in place a hypothesis that these migrants weren’t culturally and societally dominant. Not to mention the fact that if the Steppe DNA itself was a factor of feminine attraction, then the replacement of autosomal DNA would have continued far beyond the roughly 50:50 mix documented in the Iron Age people of Iberia (and in the later-era Basques). Indeed, due to the random nature of DNA inheritance, grandchildren inherit anywhere between 20 and 30% of DNA from each grandparent.

    So even in a hypothetical situation where in generation 1 all fathers were migrant and all mothers were local, the “beneficial” DNA fraction would continue growing in each subsequent generation, because the generation 3 people with a higher fraction due to chance would have a stronger chance of reproducing, and then in each subsequent generation the variability of the “beneficial DNA fraction” would be even higher and the process would accelerate.

    In a more realistic scenario where a few generation 2 children are fully migrant in origin, and some generation 2 children are fully local in origin, the runaway process of continuous replacement of remaining local DNA would have rolled even faster, due to considerably higher variability of the local DNA content from the outset.

    No, of course it wasn’t the DNA content itself which gave the migrant father a reproductive advantage. And this advantage didn’t seem to have lasted. Having replaced 80% of Y-chromosomes and the corresponding 40% of the autosomal DNA, the process appears to have stopped, without any hypothesized continuous, runaway replacement of the local DNA. I’m sorry but the Occam-compliant explanation would only be a one-time removal of the majority of the local males from the reproducing population, followed by substantially equal reproductive potential of the progeny.
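    The arithmetic behind the 80% and 40% figures can be sketched in a few lines. This is only an illustration under stated assumptions (all incoming parents are male, all mothers are local, and there is no reproductive advantage afterwards); the function names are hypothetical and nothing here is taken from Olalde et al.:

```python
def one_time_male_replacement(migrant_father_share):
    """Expected migrant ancestry after a single generation in which a given
    share of fathers are migrants and every mother is local (assumption)."""
    y_chromosome = migrant_father_share      # the Y is inherited from fathers only
    autosomal = 0.5 * migrant_father_share   # half of the autosomal DNA comes from the father
    return y_chromosome, autosomal


def neutral_generations(autosomal, generations):
    """With equal reproductive potential and random mating (assumption), the
    expected population-wide fraction is the mean of the two parental means,
    so it does not creep upward on its own."""
    for _ in range(generations):
        autosomal = 0.5 * autosomal + 0.5 * autosomal
    return autosomal


y, auto = one_time_male_replacement(0.8)
print(f"Y-chromosome: {y:.0%}, autosomal: {auto:.0%}")
print(f"autosomal after 20 neutral generations: {neutral_generations(auto, 20):.0%}")
```

    With 80% of fathers being migrants, the sketch gives 80% of Y-chromosomes and 40% of autosomal DNA, and the autosomal share stays at 40% as long as nobody has an advantage. It models expectations only, not the chance variation between individuals.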

    The lesson that language has no necessary connexion at all with racial origin is surprisingly difficult to accept
    It is hard to accept because it is not true. When two ancestrally, culturally and linguistically different groups meet, then the language outcomes may be different, but mechanisms of change of languages have clear signs of continuity. It is not something haphazard, as if languages stayed or changed regardless of the outline of the situation. The descendants of slaves almost universally lost their languages, regardless of which European or Middle Eastern language their masters imposed. The descendants of Steppe pastoralists, on the contrary, almost always imposed their languages (IE, Turkic) on the subjugated and mixed populations. The numerous examples of the process going the same way give statistical confidence to a hypothesis that in similar settings, the process went similarly. Generally, most socially dominant groups and some culturally isolated groups tended to keep their languages.

    The concept of social dominance changed a lot over the millennia, and in the more recent times, languages of government and education impose themselves on the minority languages, especially in urban settings, much more effectively. In the past, when uneducated rural dwelling was more of the norm, resilience of minority languages was greater, but mixed-ancestry couples must have been the most pliable.

    Anyway, way too many words. Just one final example. If a population group moves to a new location, and becomes admixed there, but the descendants in both the place of origin and the destination speak similar languages … can’t we rule out as outlandish the idea that the language spread from the “DNA destination” to the “DNA source”?

  105. It is hard to accept because it is not true.

    Of course it’s true; you’re ignoring the word “necessary.”

  106. Let’s run some hypothetical numbers.

    Region with population of 100 thousand gets invaded by 10 thousand proto-Basques.

    Protos are all male warriors. They win, because they have some technological advantage (horses, for example) and for some reason they decide to stay rather than go back to the steppes where their families are (perhaps due to something similar to what happened to the Magyars in 894).

    Now, how can the conquerors who are outnumbered by the conquered by 1:10 ratio impose their language?

    Very simple. First thing they do is to take wives from the conquered population.

    Now, women of childbearing age are typically about 20% of the population. So out of a total population of 100,000, there are about 20,000 women of childbearing age.

    And 10,000 of them, that is exactly half, are taken by proto-Basques. And their children will end up speaking the proto-Basque language.

    The situation gets even better, because conquerors are likely to take younger (and prettier) women, which means their reproductive success will be even greater than 50% (because the other 50%, who are still married to the conquered aboriginals, are older and hence will have fewer children).

    So the only thing they need to do is ensure language transmission to their children (typical method is to take boys from their mothers after age of seven and send them to boys camps where some elderly warrior teaches ways of war).

    The conquerors’ language will be spoken by the majority after a generation.

    1:10 ratio can be improved even further, up to even 1:100, perhaps.

    But you need selective mass depopulation for that.

    In Mexico, population drop of 98% after the conquest allowed very few Spaniards to impose Spanish on Indians. Ratio of conquerors to pre-contact native population was close to 1:1000 there.
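    The back-of-the-envelope numbers above are easy to check. A minimal sketch, assuming one local wife per conqueror and the 20% figure for women of childbearing age used above; the function name is made up for illustration:

```python
def share_of_mothers_with_conquerors(native_population, conquerors,
                                     childbearing_share=0.20):
    """Fraction of local women of childbearing age paired with conquerors,
    assuming each conqueror takes exactly one local wife (assumption)."""
    women = native_population * childbearing_share
    wives = min(conquerors, women)  # cannot take more wives than there are women
    return wives / women


# Scenario from the comment: 100,000 locals, 10,000 all-male invaders (1:10).
print(f"{share_of_mothers_with_conquerors(100_000, 10_000):.0%}")

# Same population at 1:100, with no depopulation at all.
print(f"{share_of_mothers_with_conquerors(100_000, 1_000):.0%}")
```

    At 1:10 this reproduces the 50% figure; at 1:100 the share falls to 5%, which is why the larger ratio only works with the depopulation mentioned.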

  107. David Marjanović says:

    Where’s a reference for the “Basque-looking” Sardinian substrate?

    A book by Eduardo Blasco Ferrer that I don’t have access to. I think there was a list on Wikipedia once, but all I can find now is this passage in the current “Sardinian language” article:

    words such as Sardinian ospile “fresh grazing for cattle” and Basque ozpil; Sardinian arrotzeri “vagabond” and Basque arrotz “stranger”; Sardinian golostiu and Basque gorosti “holly”; Gallurese (Corso-Sardinian) zerru “pig” (with z for [dz]) and Basque zerri (with z for [s]).

    That passage mentions Blasco Ferrer and links to this newspaper article in Italian which adds a few more words, a bunch of toponyms, and a few Roman-age inscriptions.

    Also, in central Sardinia (Barbagia < Barbaria), word-initial /r/ is not allowed, so that Latin r- comes out as arr-, orr-, err-; and /f/ is missing. Both traits are shared with Basque, though admittedly that’s unique only on the scale of western Europe.

  108. Dmitry Pruss says:

    ignoring the word “necessary.”
    ah, OK. thank you LH. So in “necessary connection” I should have read “identical connection in all settings”, rather than “highly relevant connection which works differently, if reproducibly, depending on the settings” 🙂

  109. No, “necessary connection” means language is inherently and automatically connected with genetic makeup, an obviously idiotic idea (see David Eddyshaw’s example of the spread of English).

  110. J.W. Brewer says:

    That a child has a given L1 is obviously a result of nurture rather than nature, i.e. it’s not genetically encoded, but the difficulty is of course that nurture and nature tend to be statistically correlated because most (not all, but most) human beings are raised in early childhood by the same adults from whom they got their DNA. Maybe the best way to think about it is that language is “hereditary” only in a Lamarckian rather than Darwinian way — you typically get the same L1 that is your parents’ primary language (especially if shared by both …) as of the time they become your parents, but whether that is the same language that your parents acquired early in their lives from their own parents depends on whether or not there were intervening environmental factors that could have caused language shift earlier in their lives, with such environmental factors being observably frequent enough that knowing who your great-great-grandparents were is, in many parts of the world, often empirically much less predictive of your L1 than of, e.g., your skin color.

  111. J.W. Brewer says:

    It seems that in fairly modern times there are at least four different common scenarios for language shift such that inherited language will deviate from inherited DNA: 1) A-speakers immigrate to a new location which is dominated by B-speakers and are in the new B-speaking context comparatively low-status or at least not as high status as “conquerors” would be; 2) A-speakers conquer a B-speaking location but initially form a small ruling elite which is numerically outnumbered by the B-speakers (making linguistic assimilation of the conquerors at least a reasonable and historically-attested possibility if not certainty); 3) A-speakers conquer a B-speaking location (or move to a previously-conquered such location under the auspices of the conquerors) with such overwhelming demographic superiority that B-speakers become a minority in the territory they had once dominated; and 4) “nation-building” or “modernization” dynamics especially but not only in a post-colonial situation encourages what had once been a common L2 (often but not always the language of a former imperial power or the regional language of the most powerful-prestigious group within a multilingual polity) to start displacing long-established regional L1’s. How these language shifts will match up with genetic history will probably be different in each genre, and in each genre there is also the wild card that it can occur at least in the early stages either with or without substantial intermarriage or other genetic exchange between members of the previously linguistically-distinct groups.

  112. David Marjanović: the case for a Sardinian substrate genetically related to Basque is quite weak, as was pointed out in a review of Blasco-Ferrer’s work by H.J. Wolf in 2010 (In the Revue de linguistique romane, issue 75, 595-615). The shared phonological features you list, incidentally, are shared with Iberian, which suggests at best that there were several phonologically similar languages in Sardinia and the Iberian peninsula at the time of the Roman conquest, whose phonological similarity may have been a purely areal feature (a point L. Mitxelena first made about Basque and Iberian, I believe). If I recall correctly Wolf also points to a number of very un-Basque-like and un-Iberian-like phonological features of the Sardinian substrate (the presence of an /m/ phoneme, for instance).

    All: One thing which must be stressed about Basque and Celtic is the fact that while Basque contains a massive borrowed Latin/Romance element, it also contains very few if any loanwords from any other Indo-European language, including Celtic. Again, this was something Mitxelena pointed out and which the late Larry Trask contrasted with the many layers of (abundant) Indo-European loanwords in various Uralic languages. And is a major reason why I have trouble with theories claiming that either Celtic or Basque was ever extensively used as a lingua franca along the European Atlantic coast, incidentally.

  113. Dmitry Pruss says:

    inherently and automatically connected
    in math I was taught that this is “necessary and sufficient” rather than just “necessary”? (Necessary conditions there aren’t by themselves automatically leading to the consequences). Not picking any fights here, just trying to explain a gap in my English 🙂

    very few if any loanwords from any other Indo-European language, including Celtic
    presumably despite centuries of living nearby on the left bank of the Ebro. Very interesting. How trustworthy is Koch’s insistence that influences flowed in the opposite direction, from Basque to Celtic?

  114. Minor correction to my comment today: H. J. Wolf’s review was published in 2011, not 2010.

    Dmitry Pruss: I don’t regard it as especially credible: the number of Basque-like non-Indo-European words in Celtic is so modest that to my mind it might just be a number of chance similarities. Beyond vocabulary, and looking at grammar and phonology, in many ways Celtic differs from Basque *more* than other Indo-European languages do. For instance, Basque is an ergative verb-final language, unlike Old Irish or Middle Welsh (both verb-initial, nominative-accusative languages) but very much like (say) Hindi or Pashto; Basque has five vowel phonemes (a,e,i,o,u) unlike any known or reconstructed Celtic language but very much like (say) Modern Greek.

    If you look at Celtic and Basque diachronically things do not get better. For instance Basque wholly lacks grammatical gender involving nouns: all known Celtic languages have preserved it, with Old Irish being quite faithful to the three-gender system (masculine-feminine-neuter) of Late Indo-European. From this point of view Basque, typologically, is much more (say) Armenian-like than Celtic-like.

  115. David Marjanović says:

    I’ve finally read the paper! It’s quite readable. Fig. 1E is a good summary, showing two influxes of steppe-related ancestry, one at the beginning of the Bronze Age from “Central European populations”, one at its end “from Central/Northern Europe”. Both of these are present in the Basque region, which is distinguished from the rest of the Iberian peninsula by lacking the “[a]ncestry related to central/eastern Mediterranean populations” introduced by/as the Romans.

    I am tempted to equate the first with the Lusitanian language (and with “Sorothaptic”, the hypothetical IE language of the Urnfield culture or at least its Iberian part) and the second with Celtic. The second is young enough that Proto-Celtic might even be equated with the La Tène culture as it traditionally was.

    Important quotes from the paper:

    From the Bronze Age (~2200–900 BCE), we increase the available dataset (6, 7, 17) from 7 to 60 individuals and show how ancestry from the Pontic-Caspian steppe (Steppe ancestry) appeared throughout Iberia in this period (Fig. 1, C and D), albeit with less impact in the south (table S13). The earliest evidence is in 14 individuals dated to ~2500–2000 BCE who coexisted with local people without Steppe ancestry (Fig. 2B). These groups lived in close proximity and admixed to form the Bronze Age population after 2000 BCE with ~40% ancestry from incoming groups (Fig. 2B and fig. S6). Y-chromosome turnover was even more pronounced (Fig. 2B), as the lineages common in Copper Age Iberia (I2, G2, and H) were almost completely replaced by one lineage, R1b-M269. These patterns point to a higher contribution of incoming males than females, also supported by a lower proportion of nonlocal ancestry on the X-chromosome (table S14 and fig. S7), a paradigm that can be exemplified by a Bronze Age tomb from Castillejo del Bonete containing a male with Steppe ancestry and a female with ancestry similar to Copper Age Iberians. Although ancient DNA can document that sex-biased admixture occurred, archaeological and anthropological research will be needed to understand the processes that generated it.

    For the Iron Age, we document a consistent trend of increased ancestry related to Northern and Central European populations with respect to the preceding Bronze Age (Figs. 1, C and D, and 2B). The increase was 10 to 19% (95% confidence intervals given here and in the percentages that follow) in 15 individuals along the Mediterranean coast where non-Indo-European Iberian languages were spoken; 11 to 31% in two individuals at the Tartessian site of La Angorrilla in the southwest with uncertain language attribution; and 28 to 43% in three individuals at La Hoya in the north where Indo-European Celtiberian languages were likely spoken (fig. S6 and tables S11 and S12). This trend documents gene flow into Iberia during the Late Bronze Age or Early Iron Age, possibly associated with the introduction of the Urnfield tradition (18). Unlike in Central or Northern Europe, where Steppe ancestry likely marked the introduction of Indo-European languages (12), our results indicate that, in Iberia, increases in Steppe ancestry were not always accompanied by switches to Indo-European languages. This is consistent with the genetic profile of present-day Basques who speak the only non-Indo-European language in Western Europe but overlap genetically with Iron Age populations (Fig. 1D) showing substantial levels of Steppe ancestry.

    La Hoya, BTW, is at the southern fringe of today’s Basque Country, near Araba/Álava.

    Before this work, it was known that the lactase persistence allele at rs4988235, which is present at moderate or high frequencies in most European populations today and is one of the strongest known signals of selection in Europeans (26), occurred at extremely low frequencies in Europe through the Bronze Age (2), raising the question of when it became common. Here we show that in Iberia, the allele continued to occur at low frequency in the Iron Age (fig. S9) and only approached present-day frequencies in the past 2000 years, pointing to recent strong selection.

    P. 60 of the supp. inf. has more detail on the Y haplogroups and mentions that they show the Iberian Bronze-Age males cannot have come from Britain as some had apparently surmised.

    The admixture modeling starts on p. 63, the part on the Bronze and Iron Ages is pp. 65–68, plus fig. S6 on p. 81. Please check it out.

    P. 74:

    In Fig. S9, we show derived allele frequency estimates for four SNPs with functional importance: SNP rs4988235 in LCT responsible for lactase persistence, SNP rs12913832 in HERC2/OCA2 responsible for blue eyes, and SNPs rs16891982 and rs1426654 in SLC45A2 and SLC24A5, respectively, associated with reduced skin pigmentation in Europeans. A striking observation is the complete absence of the lactase persistence allele in Iberia (present at 0.46 frequency in present-day Iberians) until recent historical times, which suggests very recent selection.

    Then comes the undated pre-Neolithic skeleton from Carigüela, which can be dated to the Mesolithic by the amount of Neandertal DNA in its genome (2.22%). “It has been previously shown (4) that Neanderthal ancestry has steadily decreased during the last 45,000 years.”

    Fig. S9 is on p. 84.

  116. Trond Engen says:

    Thanks, Étienne. Koch is working with Cunliffe’s hypothesis of Celtic as the language of the Atlantic Bronze Age. I was going to suggest that the Vasconoid substrate could be from the Beakers in France before Urnfield. But with the substrate out of the equation that’s unnecessary.

    David M.: I’ve finally read the paper! It’s quite readable. Fig. 1E is a good summary, showing two influxes of steppe-related ancestry, one at the beginning of the Bronze Age from “Central European populations”, one at its end “from Central/Northern Europe”. Both of these are present in the Basque region, which is distinguished from the rest of the Iberian peninsula by lacking the “[a]ncestry related to central/eastern Mediterranean populations” introduced by/as the Romans.

    I am tempted to equate the first with the Lusitanian language (and with “Sorothaptic”, the hypothetical IE language of the Urnfield culture or at least its Iberian part) and the second with Celtic. The second is young enough that Proto-Celtic might even be equated with the La Tène culture as it traditionally was.

    Fig. 1E is a brilliant summary. I can’t believe I haven’t seen a bar like that before. If I have a nitpick, it’s that the bar coming from the left should be thinner before the tributaries are added. (One might even go full Napoleon and scale the bar according to total population. But that’s not a nitpick, it would take good population estimates throughout the period.)

    The authors seem to say that the later influx, around 1000 BCE, is the introduction of Urnfield. And I think I agree. Urnfield as traditionally defined arose in Southern Germany and Bohemia around 1300 BCE. Spreading into the Mediterranean peninsulas makes it the probable nursery also of Italic and maybe Illyrian, so Celtic formed only in a part of it — presumably the western part if the “Celtic” burials arrived in the Iberian Peninsula as early as the 8th century BCE. And that’s definitely too early for La Tène.

    This goes into the nature of the iterations of culture coming out of a core region around the Upper Danube. Possibly Bell Beaker, certainly Urnfield, Hallstatt, and La Tène. Each new culture more or less replaced the former throughout the same region in Central and Southern Europe, and I may well imagine that each came with a new koiné. It reminds me of the dynasties of China, now that I think of it.

    Anyway, whatever ran over the Peninsula in the late 3rd millennium, it wasn’t Celtic. I don’t know if it was Vasconian, but it looks significant that what separates the Basque regions from the rest genetically, is their isolation during and after the Roman era, i.e. they are Basque because they escaped latinization, not the other way around.

  117. John Cowan says:

    The descendants of Steppe pastoralists, on the contrary, almost always imposed their languages (IE, Turkic) on the subjugated and mixed populations.

    Definitely not true of Mongols (neither westbound nor eastbound) nor of Manchus. In the first case, nothing; in the second, minimal spread probably attributable to garrisons; in the third case, one garrison (the Xibe) plus essentially complete loss in the original homeland. So the cases we know best are the cases that don’t work.

  118. David Marjanović says:

    Back from March 19th:

    (an added bonus: Carthaginian colonies near today’s Barcelona seem to be genetically similar to the Mycenaeans)

    No, that’s the Greek colony of Emporion, modern Empúries.

    March 30th:

    The cave painters in France/Spain are pre the time of Paleo-Basque? What language did they speak?

    Cro-Magnon is some 28,000 years old, placing it before the Last Glacial Maximum, when the permafrost reached southern Hungary and there don’t seem to have been humans north of the Alps; Altamira dates from its aftermath. From the timespan covered by Altamira comes the skeleton from El Mirón, also in northern Spain. Olalde et al. (2019) make a point of the fact that its genome is notably different from that of the Western Hunter-Gatherers found in the Iberian Mesolithic (like the two brothers from La Braña), though they find other Iberian Mesolithic genomes intermediate between the two, indicating interbreeding. It has long* been thought that the Eastern-Baltic-Scandinavian-Western Hunter-Gatherer continuum formed by post-LGM immigration of Ancient North Eurasians from Siberia or thereabouts; when they reached Spain, they evidently found descendants of the people who had survived the LGM in the Iberian refuge – and in the end swamped them genetically.

    That’s the first population influx in the hypothesis presented by Olalde et al. (2019). It is followed by the Early European Farmers from Anatolia, then by the two peoples with steppe ancestry. Most likely, then, the languages of the cave painters had undergone several replacements in a row before writing was first introduced in the Iberian peninsula.

    In short, they probably spoke something quite unlike anything known today, and we have no clue what it was like.

    * For a few years.

    The shared phonological features you list, incidentally, are shared with Iberian, which suggests at best that there were several phonologically similar languages in Sardinia and the Iberian peninsula at the time of the Roman conquest, whose phonological similarity may have been a purely areal feature (a point L. Mitxelena first made about Basque and Iberian, I believe). If I recall correctly Wolf also points to a number of very un-Basque-like and un-Iberian-like phonological features of the Sardinian substrate (the presence of an /m/ phoneme, for instance).

    Sure. Of course I would expect Basque and Iberian to be distantly related, and by geography I’d expect them to be less distantly related to each other than to Ante-Sardinian; it’s too bad the Iberian inscriptions are so limited and so poorly understood (though they do seem to contain a complete set of strikingly Basque-like numerals, as discussed at some length on Wikipedia). A further complication is that we don’t know if Iberian was in contact with Pre-Basque.

    while Basque contains a massive borrowed Latin/Romance element, it also contains very few if any loanwords from any other Indo-European language, including Celtic.

    True. A few have been identified, and a few more may lurk behind our incomplete knowledge of Gaulish and Celtiberian, but the number is clearly not large, much smaller than the Classical Latin layer alone.

    it looks significant that what separates the Basque regions from the rest genetically, is their isolation during and after the Roman era, i.e. they are Basque because they escaped latinization, not the other way around.

    The Basque region was also exempt from the Atlantic Bronze Age. But linguistically, the keyword is “Aquitanian”: Pre-Basque extended far north of today’s Basque-speaking region, and not as far south or west.

  119. Trond Engen says:

    David M.: No, that’s the Greek colony of Emporion, modern Empúries.

    Yes, I meant to mention that and wish for a similar sampling of Carthaginian colonists. They do say that

    In the southeast, we recovered genomic data from 45 individuals dated between the 3rd and 16th centuries CE. All analyzed individuals fell outside the genetic variation of preceding Iberian Iron Age populations (Fig. 1, C and D, and fig. S3) and harbored ancestry from both Southern European and North African populations (Fig. 2D), as well as additional Levantine-related ancestry that could potentially reflect ancestry from Jewish groups. These results demonstrate that by the Roman period, southern Iberia had experienced a major influx of North African ancestry, probably related to the well-known mobility patterns during the Roman Empire or to the earlier Phoenician-Punic presence; the latter is also supported by the observation of the Phoenician-associated Y-chromosome J2.

  120. Trond Engen says:

    Dmitry Pruss: And it turns out to be a wider Celtic rite, not just Celtiberian. La Tene period Celtic dwellings in Austria also have indoor infant burials, and across much of Europe, cremated remains of younger children are conspicuously lacking, too

    I just became aware of Marianne Hem Eriksen (2017): Don’t all mothers love their children? Deposited infants as animate objects in the Scandinavian Iron Age (link to the final draft on the page), which tells of ritual children’s burials in wetlands and inside buildings throughout the Germanic world from the last centuries BCE until well into the 2nd millennium CE.
