THOSE DARN BIOLOGISTS AGAIN.

I started off this post: “The NY Times has another language story […] and if you’re an aficionado of these things you will have guessed that 1) the story is by the muddled but ever plucky Nicholas Wade…” Well, The NY Times has another language story, “Family Tree of Languages Has Roots in Anatolia, Biologists Say,” and once again it’s by the muddled but ever plucky Nicholas Wade. Fortunately, all you really need to read is the headline. Biologists have no business pronouncing on historical linguistics. And yet they keep doing it!
Caveat. The above grumbling is based on my longstanding annoyance with non-linguists thinking they can do better than linguists at linguistics (and with bad reporting, of course); it has nothing to do with this specific paper, some of whose writers do have linguistic training, and should not be taken as a reflection on the authors. I intend to read the paper, but have not yet done so.
Update. There is a good discussion of this going on at the Log; I agree with the serious doubts expressed by many of the commenters there.

Comments

  1. Bruno van Wayenburg says

    I’m an avid reader, great admirer, and pretty infrequent commenter of LH, and Nicholas Wade’s story may be muddled as many other press stories will be, but your last pronouncement strikes me as quite parochial.
    Why can’t biologists use their insights and tools to help solve and bring new viewpoints to an old and great question -even in principle, aside from any discussion about the merits of those tools?

  2. Trond Engen says

    Not much wrong with David Marjanovic, anyway.
    Stated like that it does lack nuance. But the context is Hat being fed up with a steady stream of people with no linguistic expertise making basic errors when pronouncing on historical linguistics. The point is rather that an evolutionary biologist wanting to make a serious contribution would work with historical linguists and take the trouble to learn the methodology. That would give a headline along the lines of “…, New Cross-Discipline Study Says”.
    There’s not much to say about the validity of the input or the methods based on this article, but not taking into account the known early contact with Uralic is a basic error.

  3. In theory they could, of course, if they took the trouble to immerse themselves sufficiently in the methods of the field and the data already established by linguists. In practice, they just assume they don’t need to because they’re real scientists and know better.
    Yes, I’m being a tad unfair, and doubtless parochial, but I don’t think biologists would react with any better grace if linguists decided to correct and improve the findings of biology.

  4. What a silly thing to say. All the good work in science these days is done at the intersection of disciplines. Banning other people from studying linguistics loses all of that.
    What does it take to be a linguist? There’s one card-carrying trained descriptive linguist on the team (Michael Dunn). I’m employed by the sixth or seventh best linguistics department in the world (at ANU). Others on our team (Russell Gray, Quentin Atkinson & me) have been working on languages for about a decade each.
    Don’t let how the media have sold the paper taint your view – read the original article eh?
    Simon

  5. OK, I am suitably chastened. Like Trond says, I’m fed up with past failures, but I shouldn’t assume no one can do better. I’ll read the paper.

  6. Trond Engen says

    Now it’s getting interesting! Is the paper available anywhere?

  7. Trond Engen says

    And by available, I mean free.

  8. What a silly thing to say. All the good work in science these days is done at the intersection of disciplines.
    That’s a big overstatement, of course, and therefore silly in its own way.

  9. marie-lucie says

    Not much wrong with David Marjanovic, anyway.
    David Marjanovic (where is he now?) is a professional biologist but he is extremely knowledgeable about historical linguistics (like a few people here even though they are not professional linguists).
    Unfortunately, many descriptive linguists (and even more Chomsky disciples) have little acquaintance with historical linguistics. (This is a general comment: I have not read the article yet).

  10. SG: What is your position regarding the wheel problem? I have not read this new article yet, but have not seen any serious rejoinder to this in older publications.
    (The ‘wheel’ problem is this: one can construct a PIE word for ‘wheel’. Wheeled transport appears in the archaeological record about 4000 BC, consistent with the traditional date. If PIE is 3000 years older, as Atkinson et al. argue, why is there no archaeological trace of wheeled transport for these 3000 years? Or, if you think one can’t reconstruct PIE ‘wheel’, what is the linguistic counter-argument?)
    And by the way, I think the arguments about work being done by ‘non-professional’ vs. ‘professional’ historical linguists is kind of silly. Let’s just argue about the content of the papers and avoid the much harder problem of what is going on in the authors’ minds.

  11. I don’t think it’s great strategy to try to keep non-linguists away from theorizing about language. It’s a great advantage for linguistics that everyone knows about language and practically everyone is interested in theorizing about it.
    Judging from the Times article, though, it does seem that Dr. Atkinson’s views are excessively contaminated by tree model thinking. If he ever got around to taking a first course in historical linguistics, he’d surely hear about that newfangled wave model. If a modern biologist can’t get us beyond Linnaean hierarchies, what can he get us beyond?

  12. Full text. I’ve read the paper (it’s quite short), but not with a detailed and skeptical eye.
    I added a few comments to the NYT article correcting specific mistakes in other comments.
    One idea appeared in the comments that I haven’t heard before: that the origin point of the IE-speakers was in the Black Sea basin, back when it was a smaller fresh-water lake before the most recent inundation from the Med about 7600 years ago. There would, of course, be no archaeological evidence of this, but it would account for dispersion to both Anatolia and the steppe more or less simultaneously.

  13. J.W. Brewer says

    JC: so would this mean that the wheel could have been invented in the PIE Urheimat much earlier than supposed but all the evidence thereof is now conveniently lost beneath the waves?

  14. Full text.
    “Subscribe/Join AAAS or Buy Access to This Article to View Full Text.” Oh well! Here’s hoping Simon comes back with a link for all to enjoy, because given his vigorous defense I am quite curious to see what it looked like before the media taint.

  15. J.W. Brewer says

    Refreshing my recollection from the wikiarticle on glottochronology, I see that in an earlier paper Gray & Atkinson are said to have denied that their approach is anything like glottochronology (with its dubious-sounding-to-me carbon-14-type analogies about a constant rate of vocabulary turnover). But then what does this paper do? How is the model constructed w/o plugging in a bunch of unprovable guesses about the rate of vocabulary change and the direction and speed of human migration, and why wouldn’t tweaking those inputs within a range that was still not provably wrong give you a different output?

  16. marie-lucie says

    The problem with glottochronology is the assumption of a constant rate of vocabulary change, a concept borrowed from the rate of carbon-14 decay. There is absolutely no reason why the rate of decay in the remains of dead plants or animals (something discovered through very careful measurements of a physical process, conducted and repeated by a number of scientists) should have anything to do as a concept with the rate of attested vocabulary loss and replacement in human languages: nice guess perhaps, worthy of some consideration, but when put to the test by actually studying vocabulary change in long-attested languages, it failed, sometimes quite spectacularly: compare the rate of replacement of English vocabulary between Old and Middle English with that of German or Swedish within a similar period.
    Any human language is intimately bound up with the life of the society that speaks it, and vocabulary is especially reflective of this life: new inventions, new social mores, abandonment of traditional techniques and of social customs, migration in or out of a country, addition of new immigrants (free or slaves), serious upheavals such as revolution, war, foreign occupation, etc. Such events as contributors to vocabulary change (and sometimes also to other types of change) have no equivalents in the physical changes that happen to organisms after they have ceased to live.
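    The constant-rate assumption discussed above can be made concrete with a small sketch. It uses the classic glottochronology formula t = ln(c) / (2 ln(r)), where c is the fraction of shared cognates and r the assumed retention rate per millennium. The function name and the sample figures are assumptions chosen for illustration; this is the 1950s-style method, not the model used in the paper under discussion.

```python
import math

def glottochronology_years(shared_fraction, retention_per_millennium):
    """Classic constant-rate estimate of time since two languages split.

    t = ln(c) / (2 * ln(r)) millennia, where c is the fraction of a basic
    word list still shared as cognates and r is the assumed fraction of
    that list a single language retains per 1000 years.
    """
    if not 0 < shared_fraction <= 1:
        raise ValueError("shared_fraction must be in (0, 1]")
    if not 0 < retention_per_millennium < 1:
        raise ValueError("retention_per_millennium must be in (0, 1)")
    millennia = math.log(shared_fraction) / (2 * math.log(retention_per_millennium))
    return 1000 * millennia

# Hypothetical pair of languages sharing 60% of a basic word list.
# Only the assumed retention rate changes between the three runs.
for rate in (0.80, 0.86, 0.90):
    years = glottochronology_years(0.60, rate)
    print(f"assumed retention {rate:.2f} -> about {years:.0f} years since the split")
```

    With the same data, the estimated date more than doubles as the assumed retention rate moves from 0.80 to 0.90, which is why a single fixed rate is such a fragile input.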

  17. JWB: Even supposing the ur-wheels are at the bottom of the Black Sea, Atkinson et al.’s published trees show Hittite, Greek-Armenian and Tocharian separate from the rest by already 5300 BC, so you’d expect there’d be some pre- 4000 BC chariots lying around elsewhere.
    JWB and m.-l.: Gray, Atkinson et al.’s approach to dating is better than the naive glottochronology of the 1960s. Their method allows for variability, and attempts to rigorously calculate a range of probable dates based on all available data. Just because it’s statistics doesn’t mean it’s right, of course, but at least it’s a step forward.

  18. marie-lucie says

    Does anyone have access to the article in a way that would allow them to reformat it to make it legible for those who don’t subscribe to Science? Thus far we have to take YM’s word for the improvement in methods, etc.
    As an example, “all available data” does not mean “all the relevant data”, or “all the available data presented in the most relevant manner”.

  19. I haven’t seen this new paper yet. I only saw their older papers, some of which are at Russell Gray’s website.

  20. Simon Greenhill: I did not read your article. And I have no intention of ever reading it, and I would recommend that nobody bother doing so.
    This is because the description in the NY TIMES article is quite enough to show that you are too ignorant of historical linguistics for anything you write on the topic to be worth my time, or indeed anyone’s.
    I refer to the following methodological point: the separation of Romanian from the remaining Romance languages as having taken place in 270 AD, when Dacia ceased to be part of the Roman Empire. You use this as an established datum in order to evaluate the age of Indo-European.
    The problem is that this is NOT an established datum. The issue as to whether the Romanian language directly stems from the Latin of the Roman province of Dacia or from the Latin of a later group of migrants (whose point of origin must have been South of the Danube) has been fiercely debated over several generations. If the latter scenario (a later migration to Dacia) is true, then of course the date of separation between Romanian and its Romance sisters must be later, indeed perhaps MUCH later, than 270 AD.
    Computers are growing ever-more sophisticated, but the “GARBAGE IN, GARBAGE OUT” principle still applies.
    Incidentally, a majority of scholars, including myself, believe that the migration scenario is the likelier one (I would be more than happy to supply readers with relevant scholarly references, should anyone be interested).
    This controversy, by the way, is one which is faithfully presented in quite a number of elementary introductory textbooks on Romance historical linguistics.
    You boast that you are employed by one of the best linguistics departments in the world, that you’ve a descriptive linguist on your team, and that others have worked on linguistic issues for over a decade.
    Frankly, it reflects poorly on you, your department and your colleagues that despite all these impressive credentials you have committed a blunder (which casts serious doubt on your results) which could have been rectified had any of you bothered to consult an undergraduate textbook.
    I have no objections to biologists or other outsiders doing work on linguistics. I am certain that, in like fashion, you would not object to linguists or other outsiders doing work on biology.
    If, however, such outsiders produced “work” on biology whose foundations contained fatal flaws which could have been corrected had an undergraduate textbook on biology been consulted, I for one would certainly not expect you or any other biologist to bother reading such “work”.
    You and your colleagues will therefore, I trust, not take it amiss if in like fashion and for the same reason I choose not to bother reading yours.

  21. I’m no linguist, but it seems to me that the “names for parts of a chariot” argument is a bit dicey. When a culture adopts a new technology, they frequently adopt the terminology of that technology as well. I would expect this to be more likely if the technology came from a related (IE-descended) culture.
    Ireland has a lot of mythology about people fighting on chariots, but some historians claim that chariots were not used at all in Ireland during the period in question (a few centuries BC), although they were used by the continental Celts.
    I’m sure Etienne knows more about Romanian than I do. But I will add this minor point that the Romans occupied England a lot longer than they occupied Dacia, yet all that is left in England is a few placenames, whereas in Dacia they created the national language? If nothing else this illustrates that language does not change at a linear rate.
    I have J.P. Mallory’s In Search of the Indo-Europeans (1989). What do the experts think of that book these days?

  22. Bruno van Wayenburg says

    I have access to the paper, and I’ll send it to commenters here who drop me a line on brunchik@gmail.com.

  23. Trond Engen says
  24. Trond Engen says

    Sorry, here’s the link.

  25. It’s a great advantage for linguistics that everyone knows about language and practically everyone is interested in theorizing about it.
    What?! Everyone thinks they know about language, but only people who have taken a linguistics course or read and truly understood a book by a linguist actually do know, and we’re talking about a tiny minority here. The second part of your sentence is, unfortunately, true.

  26. Maidhc, the whole question is whether the Roman occupation of Dacia created the national language. My impression is that the affirmative is an article of faith in Romania but is now a distinctly minority opinion elsewhere.

  27. it seems to me that the “names for parts of a chariot” argument is a bit dicey. When a culture adopts a new technology, they frequently adopt the terminology of that technology as well
    That assumes that there was reasonably intimate contact, which would be difficult if the Indo-Europeans were already spread over many thousands of kilometres. One would think that there was at least a certain amount of contact between the British and Americans in the 19th-20th centuries, and yet they managed to come up with quite big differences in vocabulary for railways and motor cars (or railroads and automobiles), for instance. How much more difficult would it be to get uniformity in ancient times when the spread of technologies and vocabulary was presumably far slower. (I’m not a historical linguist either).

  28. Dunno if I fully qualify as a darned biologist but I dabble in evolutionary genomics every now and then, and there is a peculiar complicating phenomenon emerging now in genetic chronologies (usually making the resulting trees artifactually more ancient).
    They are just now starting to see population admixtures, especially admixtures with the extinct groups. You see, just like languages, the genes change not just by piecemeal mutation, but also by borrowing from the neighbors. Even a small fraction of Neanderthal admixture would appear as a whopping amount of mutations to a classic tree-builder, thus making the roots of these hypothetical trees extremely old.
    But of course it wasn’t quite a tree with one root and one trunk in the first place; other trunks contributed, and branches rejoined too.
    What makes a difference in genomic analysis now are new techniques (which can recognize fused branches) and new data (on ancient genomes). Linguistics is already far ahead in terms of understanding the old languages (although not the really ancient, pre-literate ones), and it’s already quite capable of recognizing past admixtures / spurts of borrowing (although not in a really quantitative fashion, of course). So it might be possible to correct some fallacies of “plain-tree” glottochronologies in the future. But in the meantime, just keep in mind that whenever we suspect a more active linguistic admixture process in the past, we should also suspect an exaggerated age of the tree root.

  29. maidhc’s question is discussed at this Language Log post.

  30. marie-lucie says

    maidhc: the Romans occupied England a lot longer than they occupied Dacia, yet all that is left in England is a few placenames, whereas in Dacia they created the national language?
    The two situations are quite different.
    In England the Romans were an occupying army, along with some traders, in an island far from the capital of the empire. When the armies left, they were promptly replaced by Germanic peoples who made themselves the masters of most of the island. The Latin language did not survive there, any more than the Celtic language of the population of what is now England: both languages were replaced by the Germanic languages of the newcomers.
    On the other hand, during the time that Dacia was part of the Roman Empire, many veterans of the Roman armies were given lands there to settle on. However, settling a mostly male population in an already populated country is not a recipe for preserving the language of these newcomers. An analogous situation occurred in Normandy, where Vikings were allowed to settle (hence the name of the place), but since most of them had come as single men and found wives among the local women, their children were probably bilingual but after a century or so their descendants ended up speaking the local language (“Norman” French) exclusively. That is why these descendants of a Scandinavian-French genetic mix brought French rather than Norse to England in 1066 and afterwards. (But words of Norse origin had been adopted in Norman French and some of them later became part of Standard French).
    If the Roman veterans had been the only Latin speakers in Dacia, Latin there would probably not have survived beyond a couple of generations except in a few loanwords in the local language. That is (at least one reason) why the hypothesis of a later migration seems more plausible.

  31. “British and Americans … managed to come up with quite big differences in vocabulary for railways and motor cars”: for motor cars that’s not too surprising, is it? They were developed mainly in Germany and France so the British and Americans each developed their own English for them.
    But railways were developed in Britain, so it’s odd that the Americans didn’t copy the terminology when they copied the technology.

  32. Come to think of it, does British terminology about aeroplanes differ much from American? It would seem unfair to the Wright bros if it does.

  33. does British terminology about aeroplanes differ much from American?
    Airplanes.
    Does anyone know why, when planes and cars were being invented, the designers came up with exotic-sounding new names like carburetter and aileron, whereas personal computer technology uses user-friendly descriptive metaphors for its names – search engines, files, software etc.?

  34. “The ‘wheel’ problem is this: one can construct a PIE word for ‘wheel’. Wheeled transport appears in the archaeological record about 4000 BC, consistent with the traditional date. If PIE is 3000 years older, as Atkinson et al. argue, why is there no archaeological trace of wheeled transport for these 3000 years?”
    Weird illogic, YM. How well would an analogy like this convince you:
    One can construct English vocabulary for computers for the late 29th century. If English is 1,500 years older, why is there no archeological evidence for computers for those 1,500 years?
    The only claim you can make from shared vocabulary is that the language was unitary at the time the vocabulary came into use. In other words it says nothing about the history of the language prior to that.
    And even that claim is not absolute. Languages that have separated can still share vocabulary because of borrowing, especially of wander words like technological terms, such as… computers and software. English and High German and Swedish, which clearly have a common ancestor and undisputed linguistic unity, separated well over 1,500 years ago – want to guess what the words for ‘computer’ and ‘software’ are in those languages? And this is not a one-off; the same thing may very well have happened in Muskogean languages with the word for corn.
    These are very basic principles of historical linguistics, and no, you don’t have to be a professional linguist to make these observations – I am a retired army officer and I work in law enforcement now – but you do have to bother to learn them and you do have to make the effort to apply them.
    Someone upthread mentioned that the article’s conclusions were based on an exaggerated reliance on a tree model, with a bitchily ironic reference to a “new-fangled” wave model. (It was proposed in the late 1800s.) Again, let’s turn this around to look at a linguist’s equally clueless application of linguistic principles to biological evolution, claiming that humans’ relative hairlessness resulted from an episode of their ancestors having sex with hippos (lexical borrowing).
    Dmitry’s comment is apposite too. It will be the first time that linguistics has influenced the biological sciences. Grimm’s Law was an important boost to the then new science of paleontology.

  35. “does British terminology about aeroplanes differ much from American?
    Airplanes.”
    Yes, AJP, it was a joke. But it’s considered vulgar to boast that you saw it.

  36. Dmitry writes: You see, just like languages, the genes change not just by piecemeal mutation, but also by borrowing from the neighbors.
    By borrowing what? I suppose you mean borrowing words, but here’s where historical linguists and biologists seem to be on different tacks. For a linguist, the evolution of languages is mostly a story about sound changes, not about words. If enough detail is preserved, borrowed words can be distinguished, and sometimes one can tell when they were borrowed by what sound changes of the borrowing language they reflect.

  37. Would it be vulgar to point out that carburetters are carburetors in the US?

  38. Greg, the Science paper we’re discussing didn’t dwell on phonetic changes. They used a binary change metric (preserved vs. lost cognates) and thus all the changing sounds within cognates would have been tallied as “no change” for the purpose of their study.
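    To illustrate what a binary cognate coding looks like, here is a toy sketch. The four-language data set, the class labels, and the function names are all assumptions made up for this example; it does not reproduce the paper's data or its statistical model. Each column asks "does this language have a word in this cognate class?", so sound changes inside a cognate set never register.

```python
# Hypothetical toy data: for each meaning, the cognate class of each
# language's word. Words in the same class descend from the same etymon,
# however different they sound today.
COGNATE_CLASSES = {
    "two":   {"English": "A", "German": "A", "French": "A", "Spanish": "A"},  # two, zwei, deux, dos
    "water": {"English": "A", "German": "A", "French": "B", "Spanish": "B"},  # water, Wasser, eau, agua
    "dog":   {"English": "A", "German": "B", "French": "B", "Spanish": "C"},  # dog, Hund, chien, perro
}

def binary_matrix(cognate_classes):
    """Turn cognate judgments into one 0/1 row per language."""
    languages = sorted({lang for row in cognate_classes.values() for lang in row})
    columns = sorted({(meaning, cls)
                      for meaning, row in cognate_classes.items()
                      for cls in row.values()})
    matrix = {lang: [1 if cognate_classes[meaning].get(lang) == cls else 0
                     for meaning, cls in columns]
              for lang in languages}
    return languages, columns, matrix

def differing_columns(row_a, row_b):
    """Count columns where one language has the cognate and the other lacks it."""
    return sum(a != b for a, b in zip(row_a, row_b))

languages, columns, matrix = binary_matrix(COGNATE_CLASSES)
for lang in languages:
    print(f"{lang:8s}", matrix[lang])
print("English vs German:", differing_columns(matrix["English"], matrix["German"]))
```

    Note that "two", "zwei", "deux" and "dos" all score 1 in the same column: as far as this coding is concerned, nothing has changed there.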

  39. Can this all be tested (or has it been tested) on a language break-up where there is a continuous record? For example, could the researchers plug in the Romance languages and see if their model reproduces the Roman Empire? I assume there is enough of a record of that entire period that their assumptions and results could be checked, but it’s hard to come up with a wrong answer projecting an untested model onto the unknown.

  40. “Would it be vulgar to point out that carburetters are carburetors in the US?”
    Nope: interesting, I’d say; certainly not vulgar.
    How were they spelled in French/German/Italian or whatsoever tongue they originated in? I’m hoping that it can be argued that if some new word arrives from “foreign” the British automatically give it what they imagine to be a French-like spelling and the Americans an Italian-like. But I don’t suppose it’s so.

  41. Personally (though I’d guess this informs others’ reactions as well) a major element of the aggravation such articles cause me lies in the selectivity of journalists: they’ll report on language when it’s done by biologists, computer scientists, or neurologists, but rarely if ever provide any reportage on the activities of professional linguists. Unless it’s about an endangered language, though even then it’s not certain.
    So yes, I do find it eye-rolling to open Science (or a similar journal) and find people from the biological or physical sciences re-inventing the linguistic wheel again… squarely. But that’s the choice of those journals, which don’t make any pretence at being linguistics journals. What makes me a little bezonkers is that these are pretty consistently the _only_ academic publications about linguistics that will get picked up by the general/lay media.

  42. “By borrowing what? I suppose you mean borrowing words, but here’s where historical linguists and biologists seem to be on different tacks. For a linguist, the evolution of languages is mostly a story about sound changes, not about words.”
    No, Greg. Historical linguistics mostly concerns itself with words and grammatical changes. Sound correspondences between languages are all derived from patterns in lists of words that have been judged to be related, as in descending from the same etyma. Sound changes are only part of the evidence for that, because the semantics of words change and have to be analyzed, if only to determine if two forms that look alike really are the same etymon.
    “If enough detail is preserved, borrowed words can be distinguished, and sometimes one can tell when they were borrowed by what sound changes of the borrowing language they reflect.”
    That is exactly right. It’s a big if, but if that information is there, you can not only often (!) discern loanwords, the directionality of the borrowing (which language donated and which borrowed), when the word was borrowed (because once they have been borrowed they will participate in later sound changes, which can be dated) but also which language at what time was socially dominant or prestigious.
    Example of dating of loanwords – Hungarian has two sets of loanwords from Turkish or some very closely related language and the sets are separated by something like a thousand years, and this is clear because of the way sound changes affected these sets differently.
    Example of changes in dominance – Irish has borrowed a certain number of words from English in the last couple of centuries, but previous to that English was an immigrant minority language in Ireland and the elites spoke Irish, not English. English borrowed words like ‘fond’ and ‘bother’. (And very probably ‘girl’ the rest of yiz, that’s right about the time English was losing its ‘x’ phoneme.)

  43. mollymooly says

    It seems the vaguer a journal’s title, the more prestigious it is. So if your paper is rejected by “Science”, try “Organic Chemistry”, or “Journal of Ethylating Agents”, or “American Journal of Ethylating Agents”, or “Milwaukee Journal of Ethylating Agents”.
    But what about interdisciplinary papers? Is the “Journal of Linguistic Archaeology” less prestigious than the “Journal of Linguistics” or the “Journal of Archaeology”?

  44. Putting dates on borrowed words by analyzing patterns of phonetic changes is a powerful approach, but doesn’t it lose much of its power when the events date too far back to the distant preliterate past? Is it really possible to use this kind of evidence to rule out borrowing of “chariot terminology” @ 5 kya? I could imagine that most of what we’d be able to infer about phonetic shifts would postdate the hypothetical borrowing by a great margin?

  45. Personally (though I’d guess this informs others’ reactions as well) a major element of the aggravation such articles cause me lies in the selectivity of journalists: they’ll report on language when it’s done by biologists, computer scientists, or neurologists, but rarely if ever provide any reportage on the activities of professional linguists.
    Yes! I don’t think the point had occurred to me in that form, but now that you mention it, it makes me a little bezonkers too.

  46. marie-lucie says

    the spread of technical words, or not
    Example 1:
    English: computer
    Spanish: computador(a), ordenador
    French: ordinateur
    Swedish: dator
    These are the ones I know offhand, without searching through dictionaries. None of them is an approximation of the pronunciation of the English word. Astonishingly, the Swedish word is not quite the same as the Norwegian one.
    Example 2:
    A few years ago the French linguist Henriette Walter, who has written several books for a general French-speaking audience, commented on the fact that a technical innovation can spread rapidly but the original word designating it does not necessarily get adopted along with it. Her example was the word for ‘match’ (for lighting fire), an invention from the 19th century. She listed the words from twelve different languages (at the time the EU consisted of twelve countries), and those words were completely different from each other. In that case, the technical innovation consisted of very simple, tiny little everyday objects that every popular culture named in its own way, not elaborate pieces of high technology like computers, which started as a government project never intended for ordinary household use.

  47. Hi all,
    Wow, that spawned a lot of discussion. Some very quick responses.
    1. A preprint of the paper is here.
    2. Re: Borrowing. We used identified cognate sets in basic or core vocabulary because it’s known to have much less borrowing (i.e. body parts, kinship terms, colors, etc). Moreover, the methods we used are really robust to the effect of borrowing and have known loan words removed. We’ve carefully tested the methods using simulations and shown that borrowing is a problem in some cases, but tends to shift the inferred age younger (i.e. borrowing makes languages more similar than they ‘should be’). Either way – it’s not going to support the Kurgan hypothesis.
    We also re-ran the analysis on a small set of extinct languages (which we might expect to have less loans than modern-day languages) and got the same results.
    3. re: uncertainty in the topology – our method integrates over the uncertainty in the tree. It’s not just one tree we used, but thousands. This is probably best seen in this image.
    4. re: quantitative approaches miss the nuances of the complex Indo-European history. Of course, we’re not saying that these methods can reveal everything about IE. They are good, however, for testing distinct hypotheses like the Anatolian vs. Kurgan origins.
    5. @Etienne. You’re bitter. Did I run over your dog or something?
    Simon

  48. Damn, sorry – that was the wrong link for the preprint. Email me (simon AT simon.net.nz) and I’ll send it to you.
    Simon

  49. Bathrobe linked to a Language Log entry titled Horse and wheel in the early history of Indo-European, in which Don Ringe comments on the non-Anatolian PIE word for wheel:
    “Reconstructable form: PIE *kwékwlo-s (masc.), collective *kwekwlé-h2 (→ neut. pl.).
    “Analysis: derived from *kwel- ‘turn’; pattern of derivation (reduplication + zero-grade root + thematic vowel) . . . ”
    ______________
    There’s a curious parallel in the Semitic languages.
    The Hebrew triliteral root g-l-l גלל means to roll. Hebrew גל gal means wave, billow. Hilly Galilee גליל, the ‘rolling region’, may be related.
    Arabic for wheel is ‘agele. Hebrew for wheel is galgal גלגל. Klein says the word is formed from glal גלל “through reduplication of the first two radicals and lit. meaning ‘something round’, ‘something rolled’. . . JAram. gilgila גלגלא, Syr. gigla גיגלא, Mand. girgla גירגלא and gargola גארגולא ( = wheel), JAram. galgla גלגלא, Akka. gaggultu ( = eyeball), all derive from the same base. ”
    The linguistic distance may be infinite; Mesopotamia is definitely but a short chariot ride from the Black Sea.

  50. Jim says: “One can construct English vocabulary for computers for the late 29th century. If English is 1,500 years older, why is there no archeological evidence for computers for those 1,500 years? The only claim you can make from shared vocabulary is that the language was unitary at the time the vocabulary came into use. In other words it says nothing about the history of the language prior to that.”
    If all the English dialects of 2011 started evolving separately and survived, a 29th century linguist may plausibly be able to reconstruct the proto-English of 2011, perhaps including a word for ‘computer’. Based on that evidence alone, one couldn’t say very much about the earlier history of English. If then some 29th century investigator used some statistical method to say that all these dialects diverged in, say, 1500, the same problem would arise — why is there no archaeological evidence for computers until 1960 or so?
    Back to the wheel, I found some comments on this in the Auckland group’s FAQ. That’s not detailed enough for me. If one wants to tell so many linguists they are wrong, that should be bolstered with just as much detail as the arguments that went into forming the hypothesis one is trying to knock down. Let’s see a paper on why the wheel words cannot be reconstructed for PIE (or PIE minus Anatolian), with a full treatment of the original reconstruction, and have it appear in a peer-reviewed journal with opportunity for counter-comments. The Journal of Indo-European Studies does this sort of thing all the time.

  51. To make this less abstract, here’s the entry for the roll/wheel root, from Pokorny’s comparative IE dictionary.

  52. For the variety in terminology on matches, see Wikipedia (check the links to other language Wikipedias).

  53. Very informative 2009 entry @ Languagelog, thanks! From there it appears that the analysis of historic phonetic changes can’t rule out borrowing of the words for wheel and horse into (or out of) proto-Anatolian or proto-Tocharian.
    @ Simon re: Re: Borrowing. We used identified cognate sets in basic or core vocabulary because it’s known to have much less borrowing
    At least one case of significant overestimation of the age of split in the paper seems to be Romani? It appears to be about 4 times older than the history would make us believe? And indeed it underwent extended incubation in the Balkan sprachbund, where all sorts of unrelated languages are known to have crossfertilized each other? Sounds like borrowing effect in action…

  54. marie-lucie says

    The root *kwel or *qwel for ‘roll, curve’, etc, which spawned many derivatives (words for ‘wheel’ being only some of them), is not limited to Indo-European and Semitic, but also occurs in some languages of the Americas which were unacquainted with the wheel until recently. At this point it is not possible to determine the significance of this fact (coincidence? migrations and borrowing many millennia ago? “natural human tendency”? etc etc).

  55. Bill Walderman says

    David Anthony doesn’t just argue that the emergence of the wheel in the archeological record establishes a terminus post quem for PIE unity in the 4th millennium: he traces the archeological record of the domestication of the horse to that era. Without domesticated horses wheels for transport would probably not have come into existence, as they did not in the pre-Columbian New World, where there were no native equids. And then he shows that the word for “wheel” must have been a PIE word, because the words in spatially distant daughter languages show the normal and regular sound correspondences.

  56. To Simon Greenhill:
    When someone points to a fatal flaw in an article of yours, and offers to supply references which could prevent you from making the same fatal blunder in future articles, you are of course free not to ask for said references.
    To rhetorically ask this critic whether he is bitter is indeed a sound move, inasmuch as it shifts attention away from yourself and your colleagues, whose (in)competence is the issue after all.
    However, to supply said critic with further ammunition is normally not considered a stellar move by specialists in academic debate and rhetoric.
    And that is exactly what you have done. Under 2, you spoke of selecting cognate sets from basic vocabulary because basic vocabulary is known to have fewer borrowings. The problem is that a “cognate”, as the word is used in diachronic linguistics, cannot BY DEFINITION be a borrowing. Your statement can only make sense if you are using “cognate” as a synonym for “look-alike”.
    In effect, you and your colleagues claim to be pioneering new methods of examining cognates in Indo-European languages, without in fact knowing what a cognate is.
    Would you bother reading a genetic study by a linguist who did not know what a gene is? Even if such a study were being trumpeted as cutting-edge by the NY Times? I certainly would not expect you to.
    Well, to try and keep this discussion civil, I trust you understand why in my opinion nobody should be expected to read your article either.
    To s/o:
    There have indeed been tests of glottochronological methods on pairs of languages whose history is known, and the results have done more to discredit the method than to make it credible. I believe that one scholar, comparing Spanish and Portuguese (neither of whose histories is in any way unusual, incidentally), calculated that their common ancestor must have been spoken in…1200 BC.

  57. I hasten to add that it shouldn’t be surprising if the Romani is an unusual outlier, with a much faster vocabulary turnover affecting even the words which tend to be persistent cognates elsewhere. So of course the Romani example can’t mean that the other dates are also off by several fold.
    It just means that the borrowing effect is real and measurable, and apparently relatively modest, but it leaves us free to speculate about its possible magnitude.

  58. I am not a linguist, but I’m not sure kinship terms are much less likely to be borrowed than other terms: atta, papa, arbi, familia, etc. Because most Indo-European languages have inherited the same kinship terms, intra-Indo-European borrowings may be invisible unless they involve new words, new meanings, or strong sound shifts which prevent assimilation.

  59. J.W. Brewer says

    I don’t want to entirely repeat comments I made on the LL thread, but will restate here that one question I have about this sort of work that is not only interdisciplinary but published in a non-specialist journal like Science is who the heck Science uses as referees / peer reviewers, and whether they are the sort of people with enough background in linguistics to even notice issues like the dating points about Romany and proto-Romanian noted here and the non-standard classification of Frisian and Polish noted in the LL thread.

  60. There are probably lots of linguists of my generation who have a jaded view of quantitative methods applied to language evolution through having failed to get them to work for us. I had my glottochronology/lexicostatistics period around 1962, as an undergraduate, when I tried them out on the Ge language family, and mostly on the two languages Xerente and Xavante. Using the methods current then, those two languages appeared to have separated several thousand years ago, which was interesting in light of the writings of Curt Nimuendaju, who said the two tribes had lived together as one people almost within living memory. When I got to Brazil and could consult some Xerentes, I experimented by asking about some “basic vocabulary” from Xavante but with sound changes applied to give forms the pronunciation I would predict for Xerente. I did find some new cognates, but of course it was cheating to do glottochronology this way, and even so, the separation was still 700 years. Wildly wrong.
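    For readers who have not met it, the arithmetic behind classic glottochronology is short enough to sketch. What follows is only the textbook Swadesh/Lees formula, t = ln(c) / (2 ln(r)), not the Bayesian method of the Science paper. The function name is made up for illustration, and the retention constants are the commonly quoted ones, which are exactly the assumption critics dispute.

```python
import math

# Commonly quoted retention rates per millennium for the Swadesh lists.
# These constants are an assumption of the classic method, not established fact.
RETENTION_100_WORD_LIST = 0.86
RETENTION_200_WORD_LIST = 0.805


def divergence_millennia(shared_fraction, retention=RETENTION_100_WORD_LIST):
    """Classic estimate of time since two languages split, in millennia.

    shared_fraction: proportion of list items judged cognate (0 < c <= 1).
    retention: assumed share of the list one language keeps per millennium.
    """
    if not 0.0 < shared_fraction <= 1.0:
        raise ValueError("shared_fraction must be in (0, 1]")
    if not 0.0 < retention < 1.0:
        raise ValueError("retention must be in (0, 1)")
    # Both languages lose vocabulary independently, hence the factor of 2.
    return math.log(shared_fraction) / (2.0 * math.log(retention))
```

    Because the date depends on the logarithm of the shared fraction, each cognate that is missed, or borrowed word that is miscounted, moves the estimate, and the further the real retention rate is from the assumed constant, the further off the result.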

  61. Another question:
    This only includes currently-known Indo-European languages. Is it possible to accurately reconstruct these things without accounting for at least some of the practically-unknown Indo-European languages [Lusitanian, Ligurian, Illyrian, Thracian, Dacian, ‘Nordwestblock’? – Albanian is supposed to be related to one of the Balkan ones, but it’s unclear which]?
    The summary graph shows the Slavic languages diverging in the 3rd or 4th century, with the first split separating South Slavic from East and West Slavic, rather than the 6th through 8th centuries. But Slavic loanwords are surprisingly rare in Wulfilan Gothic which was spoken in the middle of that area in the 3rd and 4th centuries. I believe Alanic loanwords are more common. And Dacian ones? And Bastarnian ones if Bastarnian was not another East Germanic language? I don’t know how anyone could begin to identify these. If we had more of Dacian, that might help clear up the origins of the Slavic languages.

  62. @ s/o: “I assume there is enough of a record of that entire period that their assumptions and results could be checked.”
    Actually we have very little written evidence for spoken Latin/Romance for the whole first millennium CE. It was submerged in writing by the language of Virgil and Cicero (1st century BCE).

  63. about admixture with other extinct hominids, those are curious, i remember reading about those and getting an impression that the white race resulted from the interbreeding with neanderthals, while the asian race with another now extinct branch too, found iirc last year or the year before that somewhere in China
    i wonder whether that was a very wrong and stupid impression for me to get, or not, M(oskva)?
    well, people don’t say such things explicitly though, is my impression too browsing those, scientific blogs

  64. read, I could get you references if you want but you’re fairly close already. The publications said that modern Europeans and Asians have, on average, several % of their DNA coming from ancient non-human hominine species (Neanderthals mostly in Europe, but also in Asia; Denisovans mostly in New Guinea but also in Asia and to a lesser extent in Europe).
    Denisovans are a recently discovered species of hominines, as distant from Neanderthals as they are from us, named after Denisova Cave in Eastern Siberia.
    More recently, other scientists suggested that Neanderthal admixture may not exist for real; that instead, both Neanderthals and modern humans inherited the variation from the common ancestor. But the “ancient genetic variability” hypothesis doesn’t explain all the facts … by all accounts, the Neanderthal-like DNA in our genes does indeed look like an admixture, and its age looks right.
    Also recently, there was a discovery of non-human admixture in Africa, too. It is a bit more contentious because there are no ancient genomes of our sister species from Africa yet. In the climate of Africa, ancient DNA has few places where it could survive. So the researchers can tell that certain tracts of DNA look like they have been admixed from a non-human species, but they can’t tell which one.

  65. John Emerson says

    Haven’t read the thread yet, but I’ve seen a bold new computational approach to historical linguistics that missed the close relationship between Portuguese and Spanish. I think that they were computing alphabetical representations of vocabulary items, and that way, “falo” doesn’t seem much like “hablo”.

  66. John Emerson says

    The kurgan vs. Anatolia debate is an old one in archeology, by the way. It’s not biologists vs. the linguists or biologists vs. archeologists. What’s being claimed is that a new method has found new evidence supporting one of the existing theories.

  67. What’s being claimed is that a new method has found new evidence supporting one of the existing theories
    and of the one which seemed to be weaker; and there may be ways to reconcile the two theories; so rather than solving the controversy for good, the new approach just perpetuates it.
    but I’ve seen a bold new computational approach to historical linguistics that missed the close relationship between Portuguese and Spanish. I think that they were computing alphabetical representations of vocabulary items, and that way, “falo” doesn’t seem much like “hablo”
    New, perhaps. My kid did exactly that for his 5th grade Science Fair poster. It scored poorly, but not because the methodology wasn’t appropriate for a 5th grader. Rather, the jury just didn’t consider linguistics to be a science. Like, where is the experiment part, or the specimen collection part? Sheesh, tain’t science-worthy.

  68. J.W. Brewer says

    Having looked a little more at the supporting material, I’m now thinking that a lot depends on how plausible the random-walk/Brownian-motion models they used for the direction and speed of language spread/migration/divergence are. They do say that they tried a few variants (e.g. accounting for different views of how likely or not prehistoric people were to cross large bodies of water) and got similar results. But beyond that, I have no idea how to evaluate the accuracy of the model. What they seem to be saying is that, look, you can run the model a bunch of times and it doesn’t give you a unique reconstructed point of origin – it gives you a hundred different points of origin (no one of which is individually more plausible than any other one), a whole bunch of which are clustered close together in Anatolia and very few of which are in the steppes. That’s certainly an interesting result iff the model is a good one, but of course human history is the result of lots of contingent events which were not in each instance the more probable a priori outcome.

  69. This person ‘Pflaumbaum’, who quoted the very nice star constellation analogy: what’s the story with that name? Isn’t a plumtree eine Pflaumenbaum? I see from my enormous green Pons-Collins dictionary that Pflaume is also an inf. word meaning a dope or twit, or even cunt, but when I google it I find that Pflaumbaum is, possibly, a name used in Germany. Why the shortening, why not at least (one) Pflaumebaum?

  70. marie-lucie says

    JWB: a hundred different points of origin (…), a whole bunch of which are clustered close together in Anatolia and very few of which are in the steppes.
    It is known that a large number of languages were spoken in Anatolia in antiquity. Anatolia is a peninsula with a long coastline and a relatively short land “border” with the Asian continent. It also comprises several regions which have ‘natural boundaries’ (such as those of the kingdom of Hatti), where linguistically distinct populations could coexist while occupying separate territories. In contrast, wide-open regions like the steppe offer little obstacle to population movements or even simply widespread trade, and such regions tend to end up with a single language, or at least few languages, spoken throughout. Could this be a factor in the results obtained by the team?

  71. J.W. Brewer says

    m-l: I have not seen a sufficiently-detailed description of how their model(s) of “random-walk” movement accounted for factors like that (and I quite possibly would lack the technical competence to understand the description anyway) to have any insight into your excellent question. But let me quote what I thought was a worthwhile paragraph from a comment thread at the Dienekes blog:
    “The study’s methods seem to implicitly assume a slow gradualist diffusion model when the reality was probably much more dramatic and punctuated. The archaeological record shows long periods of continuity interrupted by disruption followed by rapid expansion of new cultures often lots of places at once.”
    I am struck by the number of attested IE languages that (w/o even getting into post-1492 worldwide European colonial/imperial expansion) arrived in particular locales via circuitous routes. A modest example is that while Celtic was in present-day France before it was in the British Isles, Breton is not a pushed-to-the-margins relict of the old continental Celtic but the result of reimportation across the Channel. Then consider how 1500 years ago there were one or two Germanic languages spoken in North Africa (Vandalic and maybe Alanic) and how ex ante improbable was the route that took them and their ancestors from the PIE Urheimat to modern-day Tunisia. Finally, consider how a hundred years ago the Romance-speaking population of the Southern Balkans was divided between Vlachs speaking some version of Romanian and Sephardim speaking Ladino, which . . . had arrived there by a very different route. Why should the history of population movements in earlier millennia be any less weird in some of its details?
    I suppose in fairness to the Anatolian-origin hypothesis, the Hittites et al. could have been like the Bretons and returned to an ancient ancestral homeland after a considerable intermediate period living elsewhere rather than having been there continuously.

  72. Isn’t Alanic usually believed to be Indo-Iranian? Also some personal names from the area appear to be Indo-Iranian. Farnobius may be a Goth, or maybe not, but the name may parallel Pharnabazus, with the same root, Farnah-.

  73. Marie-Lucie: the article was written by “scholars” who plainly have no idea what they are doing (as I pointed out above, they clearly do not understand what a “cognate” is), so I wouldn’t try interpreting their results as being related to the geography of Anatolia.
    J.W. Brewer: My compliments on your examples, they are excellent. The spread of Romani in Europe would be another case in point of the spread of Indo-European throughout Europe via indirect routes. It has also long been noted that some words in various Indo-European languages seem to derive from otherwise lost branches of Indo-European, giving us today a small idea of the highly complex linguistic prehistory of Europe.
    One of the reasons why I disbelieve the “Anatolian homeland” scenario, in fact, is because neither in Hittite nor in other Anatolian languages has any scholar pointed to evidence (in the form of place-names or loan words) of there ever having existed a branch of Indo-European other than Anatolian in Anatolia (leaving aside later Indo-European languages such as Phrygian or Greek). This is wholly unexpected if Anatolia is the Urheimat, and thus the place where Indo-European first began to diversify.
    This situation is quite unlike what we find in Brittany, where the Celtic place-names have undergone (Romance) sound changes that are wholly unlike the sound changes of these same elements in Breton (+ Cornish and Welsh) itself. Even if we did not have Welsh and Cornish as comparanda, this split in the historical phonology of Celtic elements in Brittany would point to Breton not being the first Celtic language spoken in Brittany, and thus to its being intrusive in an area where Celtic had earlier been spoken.
    Short of similar findings in Anatolia, it seems simplest to assume that Hittite and its Anatolian sisters were not originally indigenous to Anatolia, and are thus the result of a migration to an area where Indo-European had previously never existed.

  74. thanks, DP(M)!
    i’m glad i haven’t misunderstood the reports’ implications, having the innuman genes doesn’t make people any less human 🙂 or racist
    other specialties people being interested and proposing their own models of research is a thing to welcome, not condemn, it seems to me, in any field of science, the more collaboration the better, no? archeologists finding the artefacts, physicists carbon dating, biologists genotyping, linguists analyzing the linguistic data, somebody surely needs to tie all the findings up and propose a new theory or confirm the existing one, does it matter who is then, biologists or linguists?
    all is connected, to learn more about the subject and just should help out each other with the missing or incorrect parts, not negate each others’ findings or theories just on one hearing imho
    i root for the whatever underdog usually so i’m on the kurgan side of the debate though, nomadic cultures must be were preceding the settled agricultural cultures, so they seem to be like logically the source of diversion and spread of languages, just a speculation of course

  75. marie-lucie says

    Etienne, I don’t think much of their “scholarship” either, but I wondered if the bias towards Anatolia in the results were simply due to the fact that there were more languages (including IE languages) in Anatolia than on the steppe in ancient times, something quite likely related to the g.
    read: I agree that interdisciplinary collaboration is a good thing in general, but the present case is not one of “collaboration” between biologists and historical linguists on an equal basis, but biologists trying to solve a language problem with only minimal knowledge of historical linguistics. It is true that they have one linguist on their team, and he is very competent in his own field but he is not specially trained in historical linguistics. I should add that historical linguistics is not a fashionable specialty in linguistics nowadays, so that many otherwise competent linguists have not had any training in this specialty. Several commenters here, who are not professional linguists, know much more about historical linguistics than the team in question.

  76. ok, their oversight was not to include a historical linguist on their team, but then the study itself wouldn’t have been finished perhaps, then what is better to have a study with however not all correct results, if everything can be all correct, or fail the study from the beginning since their model wouldn’t fit the accepted historical linguistics conventions
    well, i have no any that, “beef’ in the discussion, so hojil xoer tiishee (may both sides win)

  77. hojil xoer tiishee (may both sides win)
    A nice saying!

  78. M-L,
    “the spread of technical words, or not
    Example 1:
    English: computer
    Spanish: computador(a), ordenador
    French: ordinateur
    Swedish: dator
    These are the ones I know offhand, without searching through dictionaries. None of them is an approximation of the pronunciation of the English word. Astonishingly, the Swedish word is not quite the same as the Norwegian one.”
    But those examples make my point. The Swedish term is derived from an English computer term. The Spanish term is a calque. The French term is not only different, but as you know, it is intentionally different, like ‘Fernsprecher’, and if we looked through the lexicons of those two languages from that period, we would find lots of examples of that kind of Langism.
    if you want to see truly independent vocabulary, you find them in Chinese – http://www.mdbg.net/chindict/chindict.php?page=worddict&wdrst=0&wdqb=computer

  79. “If all the English dialects of 2011 started evolving separately and survived, a 29th century linguist may plausibly be able to reconstruct the proto-English of 2011, perhaps including a word for ‘computer’. Based on that evidence alone, one couldn’t say very much about the earlier history of English. If then some 29th century investigator used some statistical method to say that all these dialects diverged in, say, 1500, the same problem would arise — why is there no archaeological evidence for computers until 1960 or so?”
    YM, why would you expect archeological evidence of something from before the time it existed, and what could its absence tell you about the linguistic unity of that language?
    Let’s say we know for a fact that Greek existed as a separate language before the introduction of iron, and that after that Greek and its various dialects used the same word (I don’t know if they do.) Why would we not expect to find evidence of iron throughout the entire period of Greek linguistic unity?
    And here’s how it can get weirder. Take my Muskogean example above. We’re pretty sure from various lines of evidence that these languages split up long before the various groups started growing corn, yet they all have terms that look like they derive from a common ancestral form. What gives? One explanation is that that term refers to some other referent. Another is that corn was used in some minor way in the ancestral culture, maybe as part of an imported religion. But either way, it postdates the period of linguistic unity.
    Speaking of corn, there is a similar puzzle in Uto-Aztecan, and this one has been politicized a bit. They all share some form of the word ‘tamal’ as in tamales, even the ones that didn’t grow corn at all, like the Utes and Paiutes.

  80. Trond Engen says

    That LL thread is a slaughterhouse. I added to it.
    American language families are shallower than Indo-European, allowing loans to spread as calques through visible cognates. This could well have happened in early Indo-European too. It adds some uncertainty to archaeological dating of linguistic branching, but I don’t think it matters much as long as one is aware of it. For such calquing to happen, the languages have to be close, both linguistically and geographically.
    There’s also the inverse problem that innovations may date to a period well before a socio-cultural split. These days there’s a Kentum/Satem divide running through Oslo — no, straight through many a Norwegian home.

  81. @Jim
    The French term is not only different, but as you know, it is intentionally different, like ‘Fernsprecher’…
    For the record, according to Wiktionary:
    [Ordinateur]
    “In its application to computing, it was coined by the professor of philology Jacques Perret in a letter dated 16 April 1955, in response to a request from IBM France, who believed the word calculateur was too restrictive in light of the possibilities of these machines (this is a very rare example of the creation of a neologism authenticated by dated letter.)”
    Moreover:
    “IBM France retint le mot ordinateur et chercha au début à protéger ce nom comme une marque. Mais le mot fut facilement et rapidement adopté par les utilisateurs et IBM France décida au bout de quelques mois de le laisser dans le domaine public”
    …and if we looked through the lexicons of those two languages from that period, we would find lots of examples of that kind of Langism
    Langism?

  82. Just noticed on Alexei Kassian’s blog that he argues against Steppe origin of the IE because, i.a., the cognates supposedly don’t include words for the nomadic, Steppe-specific basic terms such as flora / fauna, yurts / round tents, distances measured in trip days… The ancestral land must have been a forested highland, he says. Any thoughts on the validity of these arguments? BTW I noticed on Language Log that even horses were known as mountain donkeys in the Upper Mesopotamia.

  83. Trond Engen says

    Dmitry: I don’t read Russian, so I can’t add much about the specifics. Those seem to be worthwhile speculations, especially when based on positive evidence. For the negatives it’s hard to avoid the absence of evidence vs. evidence for absence thing. Words for a concept are lost when the concept itself is forgotten. And, anyway, I don’t think anyone suggests that the Protos led a full-blown nomadic life on the Pontic Steppe, rather a form of semi-nomadism with farming in the river valleys and seasonal migrations with animals. This developed over time, differently in different regions, and it became more nomadic in the dryer, eastern regions inhabited by what was to become the Indo-Iranians.

  84. Just another sidebar:
    Sometimes absence of evidence is evidence of absence. For example, let’s say you have the hypothesis that Jimmy Carter was a member of the American Nazi Party. You search in vain for references in the newspapers. You interview his friends, but they can tell you nothing. Somehow, you get a hold of the complete membership rolls of the ANP, but disappointingly, the name you are looking for is just not there.
    What do you conclude? That there is no way to tell if Carter was an American Nazi or not? Or do you conclude, more reasonably, that your original hypothesis was wrong?

  85. Trond Engen says

    JC: Granted. It depends on the completeness of the evidence. Absence from complete membership rolls is very close to definitive evidence of absence. (Yes, I know you were making a point.)
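    The “it depends on the completeness of the evidence” point can be put in numbers with Bayes’ rule. This is a minimal sketch: the function name and the example probabilities are invented for illustration and not taken from anyone in the thread.

```python
def posterior_after_empty_search(prior, p_find_if_true, p_find_if_false=0.0):
    """Probability a hypothesis is still true after a search found nothing.

    prior: belief in the hypothesis before searching.
    p_find_if_true: chance the search turns up evidence if the hypothesis
        is true (how complete the record is).
    p_find_if_false: chance of a false positive if the hypothesis is false.
    """
    miss_if_true = 1.0 - p_find_if_true
    miss_if_false = 1.0 - p_find_if_false
    numerator = miss_if_true * prior
    denominator = numerator + miss_if_false * (1.0 - prior)
    if denominator == 0.0:
        raise ValueError("an empty search is impossible under these inputs")
    return numerator / denominator
```

    With a complete membership roll (p_find_if_true = 1.0) an empty search drives the posterior to zero, while a record that could never have preserved the evidence (p_find_if_true = 0.0) leaves the prior untouched. Reconstructed vocabulary presumably sits somewhere between those extremes, which is why the argument is weaker there.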

  86. Trond: Yes, I certainly went over the top there. But to conclude there is no tooth fairy we need not examine every place where he or she might be hiding.

  87. Trond Engen says

    The confidence fairy, on the other hand…
    I do agree that the “absence of evidence” line is often applied too lightly. Or rather too dismissively. It’s a truism, but the cases it comes handy are the cases that are unlikely to have a clear answer anyway — at best a parsimonious, but essentially preliminary, explanation. If that’s what you meant.
    In the case of speculations about the IE homeland, I didn’t mean it dismissively. What I mean to say is that words and whole semantic fields can be lost because of a change in environments or lifestyle, so actual shared vocabulary has much stronger explanatory value.

  88. “Langism?”
    Jack Lang, the one-man linguistic immune system.

  89. I decided to check the cognacy database from the Bouckaert et al Science paper; in the online supplement they describe how they extracted these cognate lists from the earlier publications, corrected massive errors, and weeded out the borrowings. I hoped to find at least some evidence for or against antiquity of the “Steppe-words”.
    But the resulting database leaves a very depressing impression. It doesn’t have any fauna words apart from two species, dog and louse (and its treatment of dog/bitch/hound or louse/nit variability is pitiful IMVHO, with a decision on what cognacy grouping to assign a language following very capriciously from the choice of just one of the related words). So some Slavic languages go with “pes” male dog, and others with “sobaka” female dog (despite having both forms). Not even to mention that “sobaka” must be a Turkic borrowing … and that so many cultures have tabooed the dogs that it could have caused runaway euphemization.
    No “steppe concepts” in the list, but at least one “marine concept”, Sea/Ocean, which must have been heavily borrowed or calqued by all the landlocked languages. And indeed the database has Turkic and Arabic borrowings (like Ossetian dengiz or Dari bahr) as well as Greek (like Farsi oqyanus). But many attested cognates meaning something lesser than a sea go missing (like Latvian maria / Prussian mary “sea bay”, English marsh, and perhaps even Sanskrit maryada which stands for a limit or a boundary but also for seashores and riverbanks).
    Of course with its 200-odd meanings, even picked haphazardly and with capriciously omitted cognates and unrecognized borrowings, the sheer volume of the database might have mitigated some of its problems. Still we are taught “GIGO”, if you know the acronym. So it’s kind of depressing to watch.

  90. Yup, that’s depressing, and it sounds like GIGO pretty much covers it.

  91. Jack Lang, the one-man linguistic immune system
    Jack Lang was a child when “ordinateur” was coined and was not even born for “Fernsprecher”. So why the term “Langism”?
    There is this other Jack Lang of course.

  92. I’ve been thinking… The paper states that the model is robust to tweaking of variables, e.g. different, eh, conductivity in different regions, but I can’t see how that can be. A change in resistance to movement plainly has to shift the current, and if it doesn’t, in some way or other, then there’s something fishy about the model. Now, the end results may well converge in more or less the same spot anyway, because of the supposed age or something, but I’d like to see that happen before I believe it.

  93. And then there’s this Jack Lang, author of The New York Mets: Twenty-Five Years of Baseball Magic.

  94. Trond Engen says

    Here’s Martin Lewis of GeoCurrents with a thorough takedown. Alexei Drummond replies in the comments.

  95. I’d missed their missing Russia. I had noticed, though, that Anatolia is about the middle of the geographic distribution of Indo-European languages as they mapped them, so I wonder if their model just can’t help walking the Proto-Indo-Europeans back to there? That seems too obvious, though.

  96. “Fresh from the presses”: Peas and lentils against horses and wheels? Mikić 2012 uncovers numerous PIE words for pulses, which readily lends itself to the same hypothesis of PIE arising among agriculturalists rather than Steppe pastoralists … therefore, likely outside of the North Black Sea Steppe belt? Admittedly the maps in the paper place PIE root squarely to the North of Black Sea.
    Given that we also couldn’t exclude borrowing of the famous horse / wheel words ~ 5 kya, and couldn’t identify Steppe wildlife / nomadic lifestyle words in PIE, the idea of the Steppe origins might not be literally true? Although perhaps geographically close to being true, if we include the woodland fringes of the Steppes.
    But can we apply phonetic shift analysis to the agri-words in Mikić 2012? There are numerous roots with distinct phonetic patterns, perhaps some of them are informative enough to date them back to 8-9 kya?

  97. The paper link kept giving me “questionable content” 🙁 Couldn’t even paste the title of the paper

  98. Origin of the Words Denoting Some of the Most Ancient Old World Pulse Crops and Their Diversity in Modern European Languages
    Aleksandar Mikić, Institute of Field and Vegetable Crops, Novi Sad, Serbia
    Abstract
    This preliminary research was aimed at finding the roots in various Eurasian proto-languages directly related to pulses and giving the words denoting the same in modern European languages. Six Proto-Indo-European roots were identified, namely *arnk(’)– (‘a leguminous plant’), *bhabh– (‘field bean’), *eregw[h]– (‘a kernel of leguminous plant’, ‘pea’), *ghArs– (‘a leguminous plant’), *kek– (‘pea’) and *lent– (‘lentil’). No Proto-Uralic root was attested save hypothetically *kac.a (‘pea’), while there were two Proto-Altaic roots, *bo.krV (‘pea’) and *zi.a.bsa (‘lentil’). The Proto-Caucasian root *qi.r’ā denoted pea, while another one, *hōw.(a) (‘bean’, ‘lentil’) and the Proto-Basque root *i.ha-r (‘pea’, ‘bean’, ‘vetch’) could have a common Proto-Sino-Caucasian ancestor, *hVw.V (‘bean’) within the hypothetic Dené-Caucasian language superfamily. The Modern Maltese preserved the memory of two Proto-Semitic roots, *’adas.– (‘lentil’) and *pūl– (‘field bean’). The presented results prove that the most ancient Eurasian pulse crops were well-known and extensively cultivated by the ancestors of all modern European nations. The attested lexicological continuum witnesses the existence of a millennia-long links between the peoples of Eurasia to their mutual benefit. This research is meant to encourage interdisciplinary concerted actions between plant scientists dealing with crop evolution and biodiversity, archaeobotanists and language historians.

  99. The paper link kept giving me “questionable content” 🙁 Couldn’t even paste the title of the paper
    Weird, I posted it with no problem. Don’t know what happened, but I apologize for the vagaries of MT-Blacklist.

  100. Trond Engen says

    I thought the pulses were supposed to be Wanderwürze.
    I wonder, by the way, if pulse is the origin of Scand. pylsa/pølse f. “sausage”, etymology unknown. English pulse is from Lat. puls “thick gruel” and ultimately from Greek poltos “porridge”.
    If puls was used for the gooey of ground meat, pylsa could be a straightforward feminine derivation used for the derived product. English pulse might then have been applied to legumes for their likeness. Or conversely, if Eng. pulse derives straight from Lat. puls, pylsa might have been applied to sausages for their likeness to legumes. Anyway, pølse can mean anything of sausage-shape, like sjøpølse and a vindpølse, so a legume is well within the range.

  101. Pulse came through Old French on its way to Latin, but without significant change.

  102. Trond Engen says

    Thanks, John. I suspected as much, but didn’t find it at CNRTL. But of course it’s in the OED.
    Old French puls f. “porridge” > some Germanic *puls f. “forcemeat” > Scand. pylsa f. “sausage”
    It’s so obvious it hurts. As it must have been to better etymologists than me for a long time. But since that middle stage probably isn’t attested anywhere, they wisely stopped.

  103. I meant, of course, dammit, from Latin.

  104. I figured you meant “to English […] without significant change from Latin”.

  105. Just so.

  106. Trond Engen says

    The semantic gap between Old French puls and English pulse is so wide that I can’t understand how anyone would dare making it in one leap. Is there an older sense of the word in English?

  107. Trond Engen says

    But I meant to jump in to say that GeoCurrents’ takedown is progressing with one post a day. (I should perhaps let Asya Pereltsvaig do her own advertising. She has quoted commenters here in both her posts!)

  108. Boy, she’s really letting them have it. Go, Asya!

  109. Pease porridge hot,
    Pease porridge cold,
    Pease porridge in the pot,
    Nine days old.

  110. Trond Engen says

    Pease porridge, of course. But I see no evidence that OF puls was used for mashed legumes. And even if it were, it’s still a leap from mashed legumes to the wrapping around the unmashed legumes. This is not to say I distrust the etymology — the word is there in a likely source language at a likely time — but it does suppose a couple of intermediate steps. Those can only be inferred, but if we accept the Scand. word as cognate, there’s independent support and a chance to triangulate to a better approximation of the path. What I mean to say is, both the English and the Scand. words are hard to reconcile with OF on their own, but taken together the case looks stronger.

  111. Trond Engen says

    That came out a bit too strong. I meant “I don’t see much more evidence that OF puls was used for mashed legumes than for mashed meat.”

  112. (This is very far out of my field, I studied physics and now do finance, but) I thought we were taking the English “pulse” as a loan, not a cognate of the French? A Saxon kitchen servant cooking “puls” for her Norman overlord would be just as likely to apply the word to the pease as the porridge–that much seems like a small leap. But that still assumes the porridge was made of peas, which I can’t really assume. Although peas are easier to come by than meat.

  113. Trond Engen says

    And I’m a structural engineer, and we discuss this in a thread about non-linguists doing things they shouldn’t.
    Was I unclear? Let me try again. Eng. pulse is taken as a loan from French, and I won’t challenge that. I jump in to say that I think Scand. pylsa might be a loan indirectly from the same source. Later I add that the case for pulse looks stronger if pylsa is taken into account.
    But I was wrong in assuming that pulse primarily designates the whole pod with seeds. That takes power out of my latest addition, and I’m back to the unattested “porridge” -> “stuffing” -> “stuffed intestine”. There’s still a parallel in the semantic widening from “porridge”, but nothing definitive.

  114. Martin W. Lewis and Asya Pereltsvaig of GeoCurrents have now run their takedown series for three weeks. Well worth reading, but that’s not why I asked Hat to reopen the thread. I want to tout something from a comment to a recent post:
    Jaakko Häkkinen, a Finnish linguist at the University of Helsinki, links to his own paper Problems in the method and interpretations of the computational phylogenetics based on linguistic data – An example of wishful thinking: Bouckaert et al. 2012. The paper contains some fresh developments in the study of Uralic that I found very interesting.

  115. marie-lucie says

    Trond, thank you for the link to Jaakko Häkkinen’s article, which I highly recommend.
    The article is quite dense, and it might have helped me to know more about Uralic or Anatolian, but for me the main interest of the article lies in the detailed explanation of the reliability (or not) of the methods of lexical vs phonological comparison, an explanation which is valid for comparison and reconstruction in any languages or families and which touches on some important topics rarely mentioned in historical linguistics textbooks.
    The lexical method is so fraught with errors that I am amazed that it is still being used and even recommended as a primary method of historical study. Common vocabulary is important for reconstructing the environment and the way of life of speakers in the proto-language period, but the words have to be properly reconstructed on the basis of phonological correspondences between their descendants in related languages, something which is much harder to figure out than whether two languages once had a common word for a certain animal, object or concept.
    Among other topics, the article also mentions the problems of locating a “homeland” for the proto-language, since the geographical centre of a linguistic area is not necessarily the cradle of its proto-language.

  116. I second Marie-Lucie’s thanks and her recommendation of the article, which I found very informative, well-argued and clearly written: a rare combination indeed (Oh, and I second Hat’s comment on another thread: Welcome back, Marie-Lucie!).
    It drives yet another stake into the corpse (which should be dust by now –sorry, I just saw an excellent vampire movie again and the metaphor comes naturally to me…) of the biologists’ “work” by pointing out, INTER ALIA, that the Indo-European and Indo-Iranian loanwords in Uralic are incompatible with either an Anatolian homeland or a spread of Indo-Iranian from Anatolia to the Indian subcontinent via Iran.
    It should be stressed, of course, that the biologists are only practicing “the lexical method” loosely: the lexical method requires that only true cognates be used, as Marie-Lucie just pointed out, and as I believe I had pointed out above, the biologists literally do not understand what a cognate is.

  117. Where’s David when the world needs him most? But speaking of corpses, how about that reference to the “Gray School”?

  118. Trond Engen says

    marie-lucie: The article is quite dense, and it might have helped me to know more about Uralic or Anatolian
    It’s dense and takes some knowledge of Uralic for granted, but it gave an instant idea of state-of-the-art Historical Uralics. And, yes, good to have you back!
    Etienne: It drives yet another stake into the corpse (which should be dust by now […]) of the biologists’ “work”
    Yes. I almost wrote, and in fact I think I did someplace else, “The paper is interesting, and not only for yet another takedown …, but…”
    the biologists literally do not understand what a cognate is.
    Today Asya sums up her critique of Quentin Atkinson’s serial founder effect in phoneme inventories. Here’s a line:

    It appears from Atkinson’s writing that he takes phonemic variation to run parallel to genetic variation, thus revealing an egregious lack of understanding of what a phoneme is, and perhaps of what genetic variation is too.

    Source: Source. (Their blogware automatically adds a sourcelink to copied material. Let’s see how it works when I put it between <a>s.)

  119. Trond Engen says

    That worked extraordinarily bad even for me. I won’t even ask for magic. I’ll repeat the whole thing:
    marie-lucie: The article is quite dense, and it might have helped me to know more about Uralic or Anatolian
    It’s dense and takes some knowledge of Uralic for granted, but it gave an instant idea of state-of-the-art Historical Uralics. And, yes, good to have you back!
    Etienne: It drives yet another stake into the corpse (which should be dust by now […]) of the biologists’ “work”
    Yes. I almost wrote, and in fact I think I did someplace else, “The paper is interesting, and not only for yet another takedown …, but…”
    the biologists literally do not understand what a cognate is.
    Today Asya sums up her critique of Quentin Atkinson’s serial founder effect in phoneme inventories. Here’s a line:

    It appears from Atkinson’s writing that he takes phonemic variation to run parallel to genetic variation, thus revealing an egregious lack of understanding of what a phoneme is, and perhaps of what genetic variation is too.

    Source. (Their blogware automatically adds a sourcelink to copied material. Let’s see how it works when I put it between <a>s.)

  120. New link to the “Problems of phylogenetics” article.

    And while I’m at it, this is a test to see if bare < is automatically escaped by WordPress, which if so would mean the end of things like “French je < Latin ego” being wrecked in the comments unless carefully escaped.

  121. Yesss! It works. Just make sure < is delimited by spaces, and all will be well.

  122. If he ever got around to taking a first course in historical linguistics, he’d surely hear about that newfangled wave model.

    This seems to be the right place to link a truly wonderful article, not only because its conclusions are (I think) exactly right, but because of the compelling verbal and graphical way in which they’re expressed. (Deep breath.) “Trees, waves, and linkages: Models of language diversification” by Alexandre François, Chapter 6 of The Routledge Handbook of Historical Linguistics (2014), gives us the linkage model to replace both tree and wave models by subsumption. A linkage is the result of changes that spread as waves (i.e. independently) through different parts of a speech community, after the community has broken up into mutually unintelligible languages. The result is inherently un-treelike, because whereas languages A, C, and E may share one feature, languages A and D may share another, and B, C, D, and F a third. Trees, in fact, are special cases of linkages in which the scope of every change is neatly nested inside another change (or not nested at all) without overlaps.
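
    For the programmers in the audience, the nesting condition can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not anything taken from François’s chapter: each change is reduced to the set of languages it reached, and the function names (overlapping_pairs, is_tree_like) and the second, tree-like data set are made up for the example. A set of changes counts as tree-like when every pair of scopes is either nested or disjoint.

```python
from itertools import combinations


def overlapping_pairs(scopes):
    """Return the pairs of scopes that partially overlap.

    Each scope is the set of languages sharing one innovation.
    Two scopes conflict when they intersect without one
    containing the other.
    """
    conflicts = []
    for s, t in combinations([frozenset(x) for x in scopes], 2):
        nested = s <= t or t <= s
        disjoint = s.isdisjoint(t)
        if not (nested or disjoint):
            conflicts.append((s, t))
    return conflicts


def is_tree_like(scopes):
    """True when every pair of scopes is nested or disjoint."""
    return not overlapping_pairs(scopes)


# The pattern from the paragraph above: A, C, E share one feature,
# A and D another, and B, C, D, F a third.
linkage = [{"A", "C", "E"}, {"A", "D"}, {"B", "C", "D", "F"}]

# A hypothetical family whose changes nest neatly (invented data).
tree = [{"A", "B", "C"}, {"A", "B"}, {"D", "E", "F"}, {"E", "F"}]

if __name__ == "__main__":
    print(is_tree_like(linkage))
    print(is_tree_like(tree))
```

    In the linkage example every pair of scopes shares exactly one language without either containing the other, so no tree can show all three changes at once; in the invented second example each scope sits inside another or apart from it, which is what a family tree silently assumes.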

    François’s argument is that we like trees because they are easy to draw, and therefore we systematically and half-consciously exaggerate the extent to which trees actually fit the data. (Students of textual apparatus have noticed the same problem many times: plays can be organized by act and scene, by line of verse, by speaker, by printed page, and in other ways, and these can overlap.) Historically, the tree model has been strongly associated with the comparative method; the inventor of the wave model was a skeptic. But François summarizes his article better than I can:

    The present chapter will discuss the strengths and weaknesses of cladistic [meaning ‘tree-like’] representations for modelling processes of language diversification, and examine alternative approaches for capturing the genealogy of languages. In section 2, I will first summarise the way in which linguistic trees are typically understood, before examining their underlying assumptions. Section 3 will examine the processes that underlie genealogical relations between languages, and explain why the Tree Model is most often unsuited for representing them. While the Comparative Method must be preserved for its invaluable scientific power, a rigorous application of its principles in situations of linkage in fact disproves the Tree Model, and favours the Wave Model (section 3.2) as a more accurate description of the genealogy of languages.

    Non-cladistic models are needed to represent language relationships, in ways that take into account the common case of linkages and intersecting subgroups. Among existing models, Section 4 will focus on an approach that combines the precision of the Comparative Method with the realism of the Wave Model. This method, labelled Historical Glottometry (Kalyan and François forthcoming), identifies genealogical subgroups in a linkage situation, and assesses their relative strengths based on the distribution of innovations among modern languages. Provided it is applied with the rigour inherent to the Comparative Method, Historical Glottometry should help unravel the genealogical structures of the world’s language families, by acknowledging the role played by linguistic convergence and diffusion in the historical processes of language diversification.

    Read the article. It is less than 30 pages (plus apparatus), and boy, does it deliver.

  123. marie-lucie says

    Alexandre François studies the languages of Vanuatu, one of the South Seas island nations with multiple languages spoken in a small territory. I heard him present his interesting model at a conference last summer, after reading the paper in question.

    Actually, the “tree” model should be more accurately called an “espalier” model (I think I have seen the word in an English language article), because it is two-dimensional while real trees are three-dimensional unless they are “trained” to grow flat against a wall (a method used with some fruit trees). This flat tree model can deal with “vertical” relationships (of past to present stages) but not “horizontal” ones which would require crossovers between non-adjacent nodes (eg A-D rather than just A-B, C-D, etc).

  124. Apparently “espalier” (a word I have never spoken) is pronounced is-PAL-yer in (US) English, which would take me some time to get used to. It reminds me of learning that Laurence Olivier pronounced his own name similarly anglicized, and had to get used to accepting the Frenchified ending other people insisted on.

  125. David Marjanović says

    “Cladistic” doesn’t mean “treelike”. It refers to a method – or several – of doing phylogenetics, long after the assumption that the phylogeny is reasonably treelike has been made. I’ll read the paper, but getting basic terminology wrong does not earn any brownie points.

  126. David Marjanović says

    Right there on the first page: “cladistic (tree-based) representations”. Arrrgh. I’m imagining a Willi Hennig Society mob running after the author with torches & pitchforks.

  127. David Marjanović says

    The other terms from biology are correctly used, except “drift”, which is of no consequence. It’s strange, however, that the author on the one hand cites the original work of Leskien (1876) for the principle of only using shared innovations in phylogenetics on the linguistic side, while on the other hand citing a recent textbook (Page & Holmes 2009) for the same principle on the biological side instead of the original work of Hennig (1950). Hennig is the one who invented all those Greek terms like “synapomorphy” and “symplesiomorphy”.

    Anyway, historical glottometry looks like it holds great promise. Still, fig. 6.5 – while certainly not a tree! – looks much more treelike than I expected; I take from this that in many cases a tree will be a good approximation for many purposes.

    Best title in the references: “Where *R they all? The geography and history of *R loss in Southern Oceanic languages.”

  128. Where *R they all is a beautiful paper, and worth taking a look at.

  129. I noticed that too, but then I realized that clade just means ‘branch’, and that he’s entitled to define technical terms however he wants, especially in a paper making no reference to biology.

  130. Marie-Lucie, I am not sure how much the geometry of trees that you describe is relevant to language or biology, but usually, unless we are talking about real arbors, tree means a logical structure quite apart from any possible geometry.

  131. espalier

    Not really so different from lavalier (microphone), one hung around or near the user’s neck, which is pronounced /ˈlɑvəlir/ ~ /ˈlævəlir/, though ultimately derived from the title of the Duchess de la Vallière.

    loss of *R

    There is a discussion of cases in which certain North-Central Vanuatuan languages have two words with the same Proto-Oceanic etymon, one with /*R/ > zero, the other with /*R/ > /*r/ > whatever the usual outcome of /r/ is in that language. This instantly reminded me of our discussion of six English words all traceable to Latin discus through different paths: dish, desk, disk, discus, dais, disco.

  132. I like band, bend, bind, bond, bund.

  133. Not really so different from lavalier

    Another word I’ve barely heard of and would have no idea how to pronounce.

  134. marie-lucie says

    DO: I am not sure how much the geometry of trees that you describe is relevant to language or biology, but usually, unless we are talking about real arbors, tree means a logical structure quite apart from any possible geometry.

    I realize that, but the tree metaphor (like any metaphor) is inspired by the perceived similarity of an abstract concept or system to a natural one, and strict adherence to the metaphor and its graphic representation may focus the scholar’s attention on some features or phenomena while diverting this attention away from others which could be just as important.

    In historical/comparative/classificatory linguistics, the ancient metaphor of the “family tree” showing the descendance of the most ancient known ancestor is used to represent the “descent” of “related” languages from a common ancestor. But where the tree model of linguistic relationship breaks down is that it assumes that after a “branching” occurs the “branches” not only have no more contact with each other and develop independently (something which is not always true), but that the position of the “branches” on the tree representation accurately represents the relationship between the languages, which in fact might still share some innovations with each other in ways that are incompatible with the strict separation of branches and subbranches on the two-dimensional “tree”.

    This is why other models have been proposed, such as the “wave” model and more recently the more ambitious and comprehensive one put forward by Alexandre François.

  135. band, bend, bind, bond, bund

    You can actually list band twice, because the meaning ‘flat strip’, without reference to tying anything, comes from Anglo-French bende, CF bande, itself of obvious Germanic origin. In ME it was spelled bande, whereas the native word was spelled band, but with the fall of -e the words have merged completely.

  136. marie-lucie says

    JC: What is “CF bande”? la bande meaning ‘flat strip, tape’ is a perfectly good word in all kinds of French, I think. It refers to the kind of flexible strip used for a bandage for instance (another French word), or for ornament or reinforcement on a piece of clothing or upholstery. It also means a tape in a tape-recorder, or a strip in a comic strip (une bande dessinée).

  137. band, bend, bind, bond, bund

    Per AHD: [Hindi band, from Persian, from Middle Persian, from Avestan *banda-, from Old Iranian; see bhendh- in Indo-European roots.]

  138. CF = “Central French”, i.e. Parisian French, a distinction that always has to be made when dealing with early borrowings from French into English, where many words are from Norman French but many are not. Heraldry uses the Norman form bend for the diagonal strip running across a shield. The modern English senses ‘strip of color’ and ‘portion of the radio spectrum, e.g. AM, FM, short wave’ descend from this French sense, whereas all but the most literal sense of native band ‘object for tying with’ have moved to its phonetic variant bond.

  139. David Marjanović says

    he’s entitled to define technical terms however he wants, especially in a paper making no reference to biology

    Well, sure… but that doesn’t make it a good idea. Plenty of confusion has resulted from the fact that evolution is an established technical term for so many entirely different things in different sciences.

  140. Carburet(t)or actually originated in English. The ending -uret, which was originally nominal but could then be verbed, was applied in older English-language chemistry starting around 1790 to form the names of simple two-element compounds, where -ure was and is used in French, reflecting an underlying neo-Latin -uretum, -oretum. Thus a sulphuret was a simple compound of sulfur, notably hydrogen sulfide, which was formerly called sulphuretted hydrogen.

    A carburettor, therefore (carburator, carburetor are typical AmE simplified spellings, of which the second is now standard) is a device for combining oxygen with hydrocarbon fuel. The -uret ending has now been displaced by -ide in languages other than French, and carburetors themselves have mostly been displaced, except in very small and very large internal combustion engines, by fuel injection.

  141. January First-of-May says

    I’d rather expect carburator to be an attempt to re-parse carb-uret-or (possibly already synchronically opaque by then) as **carbur-ator, with the common Latinate suffix -ator added to an imagined stem **carbur-.

    The only Russian spelling I’m aware of, карбюратор, presumably originates from a word spelled carburator; not sure if a German or French origin is more likely.

  142. David Marjanović says

    If not English, then French; German can’t explain the ю.
