Peter Forster and Alfred Toth, two geneticists who know nothing about linguistics, have written a paper, “Toward a phylogenetic chronology of ancient Gaulish, Celtic, and Indo-European,” that purports to redraw the family tree of Indo-European, regroup the Celtic languages, and establish far earlier dates for the break-up of PIE and the split of Insular from Continental Celtic. Fortunately, the world will not have to rewrite its textbooks; Larry Trask has done such a thorough job of trashing their methods in LINGUIST List that if there is any justice the names of the perpetrators will never be heard of again outside their specialty. A sample:

What they do is to appeal to an unexplained and wholly subjective notion of “similarity”. Two items are assigned to the same state if the authors judge them to be similar, but to different states if the authors judge them to be dissimilar. Let’s see what that means in practice.
Latin filia ‘daughter’ and its Spanish descendant hija are assigned to different states, because the authors judge them to be dissimilar. But the Gaulish inflected form teuo- ‘to gods’ and the Scottish Gaelic prepositional phrase do dhiadhan are assigned to the same state, because the authors judge them to be similar. Why are they similar?
Breton forn ‘oven’ is assigned to the same state as Spanish horno, but to a different state from Irish sorn. Italian e ‘and’ is assigned to a different state from its Spanish cognate y, but to the same state as the unrelated Basque . (Spanish y has a positional variant e, but apparently that doesn’t matter.) On the other hand, the Gaulish genitive suffix -i is assigned to the same state as Greek -ou. So, /i/ resembles /u/ but not /e/. How do the authors come by these remarkable insights?
Normally, an overt suffix is counted as different from zero suffix. However, Latin feminine -a is assigned to the same state as French e, even though in Parisian French that orthographic -e is purely decorative, and the suffix is zero.
I could go on in this vein, but you get the idea. There is no rhyme or reason in the assignment of states, and the authors’ procedure is as capricious as it is unexplained.
At this point, the work under discussion abandons the discipline of linguistics altogether, and in fact it ceases to be anything recognizable as serious scholarship. Linguistics cannot be done in terms of subjective notions of similarity. This is the kind of sludge we see in those lurid articles claiming to have reconstructed “Proto-World”, and in those delightful Websites announcing “Latvian—the key to all languages”.

Via Mark Liberman of Language Log, who also links to a credulous review in American Scientist Online which he deconstructs himself. And I should add that the NY Times fell for this nonsense back in July, which is what alerted the LINGUIST List folks to it.


  1. Forster and Toth aren’t the only ones. Ringe, Warnow, and Taylor, also at Penn, have been working on redoing genetic trees with algorithms from genetics research. The problem I had with their research was that they never listed their characters, but F&T did. Whoa! When I read Trask’s posting to the Linguist List I had to chuckle at my worst fears confirmed.

  2. But I don’t understand. Mingrelian is the key to all languages! Not Latvian, which is merely a debased form of Lithuanian. I don’t know where people get these things.
    Surfing the net for names of languages and language groups does indeed take you to some unique places.

  3. Oh yeah. I remember reading that issue of LINGUIST and smirking. It was really vicious, and yet deserved.

  4. Jim, you need to be a little more careful: your short comment makes three important mistakes.
    First, Forster and Toth are not “also at Penn”, but at Cambridge University and the University of Zurich, respectively.
    Second, Ringe/Warnow/Taylor have certainly published the detailed bases of their analysis, both the linguistic characters used and the mathematical and computational issues in the algorithms, most recently in
    Don Ringe, Tandy Warnow & Ann Taylor. Indo-European and Computational Cladistics. Transactions of the Philological Society, 100 (1) pp 59-129 (March 2002), but also in other publications going back to 1995.
    Third, Forster and Toth use a completely different and unrelated set of features, as well as a different phylogeny-inference “algorithm” (one of the problems with their work is that they don’t really have an algorithm, so that their work is oddly empty from a mathematical as well as a linguistic point of view). While you might disagree with the Ringe et al. choices, Trask’s critique is not in any way relevant to their work, as their choices are very much in the style of a rock-ribbed traditional Indo-Europeanist, which is what Ringe is.

  5. He sure is; he studied with the great Warren Cowgill (little remembered because he never wrote a book).

  6. Mark– Well, I put my foot in that one. I’ll take a look at the paper you cited. The two I looked at said there were characteristics but vaguely categorized them but never really listed them. I realize that Ringe linguistic chops, but so did Morris Swadesh. Sorry about the misinformation. I’ll be mroe careful next time

  7. I know I should just cut my losses and keep quiet, but my last comment borders on the unintelligible. Here’s what I meant to say:
    Mark– Well, I put my foot in that one. I’ll take a look at the paper you cited. The two I looked at mentioned characteristics and vaguely categorized them, but never really listed them. I realize that Ringe has linguistic chops, but so did Morris Swadesh. Sorry about the misinformation. I’ll try to be more careful next time.

  8. Steve Long says:

    Exactly the problem. The resulting IE “tree” is constructed entirely based on a rather standard and narrow PIE reconstruction and the outcome is completely expected. It is a picture of Ringe’s vision of how PIE should be reconstructed using a limited number of well-established cognates. Where one is missing, it is marked “lost”, which is a circular assumption. There is no control to compare hypotheses about early borrowing or semantic inconsistency. You could use the same technique to construct Klingon family trees.

  9. Years ago on a histling list (now apparently defunct) it was quite entertaining to see Trask dissect those who claim similarities between Basque and Georgian, Berber or Nahuatl. Glad to see he hasn’t lost his form.

