(MAG)PIE.

November 22, 2011 by languagehat 89 Comments

The etymology of the word pie (in the edible sense) has been something of a mystery; the American Heritage Dictionary cautiously says “Middle English” and leaves it at that. Alison Richards, at NPR’s food blog, takes the occasion of the upcoming Thanksgiving holiday (in the U.S.) to cite the OED’s entry—and links to it in such a way that you can read it even if you’re not a subscriber, which is great. As she says, it “may well derive from the Latin word pica, meaning magpie”; here’s how the OED puts it (in the new third edition):

The dish, which originally consisted of any variety of ingredients, may have been named by association with the bird, either after the bird’s spotted appearance or after its tendency to collect miscellaneous articles. In this context, the similarity between the words haggis n. and haggess n., a name for the magpie, has been pointed out; compare also chewet n.1, a dish of mixed ingredients, and chewet n.2, a name for the chough.
For an alternative etymology < an unattested variant *pis of Anglo-Norman puz and Old French puis pit, well (Middle French puis, French puits; < classical Latin puteus: see pit n.1), and thus an assumption that sense 2 is in fact the original sense, see C. H. Livingston History and Etymology of English “Pie” (1959).
Compare post-classical Latin pia (1303, 1317 in British sources), which is perhaps < English. […]
Compare also post-classical Latin pica pie, pastry (c1310, 1419 in British sources; perhaps identified with classical Latin pīca magpie: see pie n.1).

Ms. Richards expands entertainingly on those suggestions (and links to some other OED entries); she ends her essay: “So as you eat this year’s slice of pumpkin or apple pie, I hope you’ll enjoy the thought that each sweet mouthful of fruit and spice carries the memory of an ancient magpie treasure trove.” I add my own hope that everyone who celebrates the holiday Thursday gets through it without either heartburn or family drama. (Pro tip: Using a butterflied, or spatchcocked, turkey cuts down on cooking time and makes it easier to get all parts to the proper degree of doneness.)

Comments

brett says

November 22, 2011 at 10:06 pm

I would have thought for sure that pie was Proto Indo-European.
John Cowan says

November 22, 2011 at 11:01 pm

I see what you’re doing there.
In fact, inherited words in English in /p-/ are very rare, because they would reflect ancestors in *b-, that rarest of all PIE phonemes. Even path, which is in all the West Germanic languages, turns out to be a loan from Iranian, or just possibly Celtic, but definitely not original.
So rare is *b, indeed, it may not have existed at all. AHD4 lists only two roots. Bak- ‘staff’, giving bacillus, baguette, bail, bailey, baculiform, debacle, imbecile via Latin and bacterium via Latinized Greek. Bel- ‘strong’ gives us (with a prefix) debilitate, debility from Latin, and the o-grade Bolshevik from Russian.
The (minority) glottalic theory holds that what is conventionally written *b, *d, *g were actually ejectives, and the far more common *bh, *dh, *gh were normal voiced stops. In languages with ejectives, the labial ejective is often missing, and the constraint against two of *b, *d, *g appearing in a root makes more sense if they are ejectives: two switches of airstream mechanism in one syllable is hard to execute.
Brian C. says

November 23, 2011 at 6:09 am

Very cute, Brett.
zythophile says

November 23, 2011 at 9:47 am

And, of course, as well as “printer’s pie”, mentioned by Alison Richards, there’s pica (pronounced “pye-ker”), the name formerly given to 12-point type, which, as the OED says, may or may bot be connected to magpies …
AJP Crown says

November 23, 2011 at 12:45 pm

The OED had a bit of a cold.
Trond Engen says

November 23, 2011 at 2:34 pm

The OED had a bit of a cold.
Picachu!
Ø says

November 23, 2011 at 4:23 pm

Yes, we always order the bird spatchcocked. It’s so much fun to say when you go to pick it up at the butcher’s counter. Let me mention, in case anybody missed it, that there was a spatchcock thread.
marie-lucie says

November 23, 2011 at 7:16 pm

I did not know about the association of English pie (the dish) and (mag)pie (the bird), pie in the latter being a borrowing from une pie in French (the initial mag may have been added to differentiate the two words). From the explanations given above, this association seems plausible, even though the bird is not spotted but has large, distinct areas of black and white, like an orca or some types of cows. The dual colouring of the bird is also reflected in French pie applied to the appearance of a horse or cow, and English piebald and pied (piper).
The TLFI says that the root pic in Latin pica ‘magpie’ and picus ‘woodpecker’ is onomatopoeic, deriving from the sound made by the bird. If so, the bird meaning would be the oldest, followed by the meaning ‘mixed colouring’ then (in English) ‘mixed food’.
A link with Latin puteus, French puits, etc is most improbable.
David Marjanović says

November 25, 2011 at 4:43 pm

In fact, inherited words in English in /p-/ are very rare, because they would reflect ancestors in *b-, that rarest of all PIE phonemes. Even path, which is in all the West Germanic languages, turns out to be a loan from Iranian, or just possibly Celtic, but definitely not original.

Consequently, Vennemann has pounced on most or all of the others and tried to explain them as loans from Phoenician or some earlier Semitic/Afroasiatic language. Plough is one of those.

AHD4 lists only two roots. Bak- ‘staff’

…is particularly interesting, because it contains *a, that other rarest of all PIE phonemes that is so rare some think it may not have existed at all.
(Of course, *ā is the third one.)

In languages with ejectives, the labial ejective is often missing

…though… I can’t find my search results anymore, but I once googled for the Georgian letter for the labial ejective and found it to be more common than the other ejectives. To make sure that Google finds words with those letters and not just the isolated letters, I googled for e (the most common letter in several big languages that use the Latin alphabet) and got a number of results that must have included pretty much every page on the Internet. So, apparently, when the labial ejective is present in a language, it’s not automatically rare.

the constraint against two of *b, *d, *g appearing in a root makes more sense if they are ejectives

Yes, but there were several similar constraints in PIE, and there are languages with voice dissimilation – we’ve had that discussion; Kinyarwanda is one of them.

two switches of airstream mechanism in one syllable is hard to execute

I love the Haida word t’ap’at “to break”, though.

mag)pie (the bird), pie in the latter being a borrowing from une pie in French

(Just for comparison, German has Elster.)
marie-lucie says

November 25, 2011 at 7:51 pm

labial ejectives
I have been studying a group of Amerindian languages in which one set has the ejective [p’] while the other set has [m’] in the corresponding words ([p’] in the second set only occurs in borrowings from the first set). Various other correspondences make it clear that the proto-language must have had *[p’] not *[m’], so the second set must have switched from [p’] to [m’], or been acquired by a population which did not have the sound [p’] and replaced it with [m’]. It is not impossible that (pre-)PIE had *[p’] but that it became [m’] then [m] in the daughter languages, or (assuming that PIE was acquired by different populations) that the new learners did not have the sound [p’] and replaced it with what they felt was the closest sound in their language.
Another factor about the relative infrequency of [p’] versus other ejectives is that it involves simultaneous articulations at the extremities of the vocal tract, which are relatively difficult to control at the same time. Thus an original [p’] could lose either the glottalization and become [p] or [b], or the labial articulation and become just the glottal stop, so that [p’] in one language might correspond just to a glottal or vocalic initial. Alternately, [p’] might evolve into a kind of [b]. (These are just general observations, I am not a PIE specialist and cannot take a position for or against any of those possibilities in the actual cases).
the Haida word t’ap’at ‘to break’
The PIE constraint against two consonants of similar articulation in a row applies to roots, and the Haida word may be more complex than just a single root. Also, glottalization is often associated with a semantic element of subjective perception: t’ap’at could have an element of onomatopoeia and the repeated glottalization (causing the “ejective”) could reflect the noise associated with breaking things (languages of the same area also tend to have different words for breaking things of different shapes: long, round, etc, just as English has words more precise than break, such as shatter, split, splinter, etc depending on the nature of the thing broken).
zythophile says

November 26, 2011 at 5:45 am

The OED had a bit of a cold.
No, but I’ve got fat fingers.
Etienne says

November 27, 2011 at 1:58 pm

Marie-Lucie: interesting idea! You’re not the first to entertain it: it has been claimed (by Beekes, I think) that Proto-Indo-European may have undergone a /b/ (whatever its exact phonetic nature) to /m/ sound change immediately before its expansion/break-up.
(Incidentally, would I be correct in guessing that the language family you are talking about above is Penutian?)
One argument in favor of this claim, it seems to me, is the “duality” of proto-Indo-European /m/: in most cases its reconstruction is straightforward, but there are a number of instances/indications of an /m/ – /w/ alternation in Indo-European itself: compare Latin PRIMUS and its Slavic cognates (cf. Russian PERVIJ). Likewise, within Indo-European, the (verbal) first person plural ending with initial /m/ versus the first person dual ending with initial /w/.
I have sometimes wondered whether non-alternating /m/ goes back to /b/, with /m/ and its /w/ allophone/variant being the older /m/ phoneme in Indo-European.
Come to think of it, the glottalic theory would explain quite nicely why this new /m/ did not undergo this alternation: glottalized phonemes are much less liable to lenition/weakening than non-glottalized ones. So perhaps /p’/ weakened to /m’/ first, which was a separate phoneme from an original /m/, which itself was/had been, in some positions, weakened to /w/.
Subsequently /m’/ could lose its glottal stop and merge with the older /m/ phoneme: the distribution of /m/ versus /w/ would then become wholly opaque, and one dialect of Indo-European would generalize /m/ in a given word (PRIMUS), and another dialect /w/ in the same word (PERVIJ). Hmm.
Ah well, short of finding relatives of Indo-European, I doubt the above will ever be proved.
languagehat says

November 27, 2011 at 9:20 pm

What an interesting idea! Once again, I’m very glad I started this blog. It’s far more stimulating than grad school.
marie-lucie says

November 27, 2011 at 11:18 pm

Etienne, I am not up to the latest ideas about PIE, but I am glad to hear that p’ > b ~ m has been suggested.
would I be correct in guessing that the language family you are talking about above is Penutian?
It is one of the component families of Penutian (Tsimshianic). I think that Tsimshianic is on the order of Ibero-Romance, while Penutian is probably on the order of Indo-European.
Etienne says

November 28, 2011 at 1:11 am

Marie-Lucie: so in effect, you’re trying to reconstruct Proto-Tsimshianic, and on that basis you and others will reconstruct Proto-Penutian? Sounds like a very worthwhile project.
Would now be a good time to tell you that I think that I can prove that Proto-Penutian *N- and *M-“first/second person” are borrowed and not inherited elements?
Hat: glad you like my idea. And your blog indeed is so much better than grad school: it’s more like having good coffee and fresh donuts at a conference with some select colleagues, throwing ideas around.
marie-lucie says

November 28, 2011 at 10:55 am

Etienne, agreed on all points. It will take more than just PTsim to reconstruct PPen, but it is a start. It is a pity that the concept of “Penutian” started in California, where the relevant languages have fairly simple phonology, while the phonological complexity generally increases as one goes North. There is work there for generations of historical linguists! About N and M, I would not be surprised, especially since those pronouns also occur in other places, both in America and Polynesia, for instance, but I would be interested to hear about your proof.
Hat, I completely agree with Etienne. Throwing ideas around, though theoretically encouraged, is often unwelcome in a classroom. Also, here everybody shares ideas, contributes both questions and answers, and the range of experience and personalities is a lot wider than just that of a graduate classroom.
Trond Engen says

November 28, 2011 at 1:54 pm

And your blog indeed is so much better than grad school: it’s more like having good coffee and fresh donuts at a conference with some select colleagues, throwing ideas around.
… between other guests shouting half-heard bits and pieces of ideas while throwing coffee and donuts around.
Etienne says

November 28, 2011 at 5:59 pm

Marie-Lucie: I wonder: Is a reconstruction of Proto-Tsimshianic required to reconstruct Proto-Penutian?
After all, Indo-European was reconstructed on the basis of (inter alia) Gothic. Scholars did not wait for a reconstruction of Proto-Germanic to be made available before using Gothic/Germanic data to tackle Indo-European. Likewise, Bloomfield’s 1946 reconstruction of Proto-Algonquian was based upon a mere four Algonquian languages.
New data did later lead to modifications, but without seriously invalidating the foundations of either reconstruction.
So: when can I order my copy of AN INTRODUCTION TO PROTO-PENUTIAN: PHONOLOGY, VOCABULARY AND GRAMMAR (By Marie-Lucie and associates)? A look at your reconstruction of Proto-Penutian personal pronouns/person-markers would be needed for me to present/publish my demonstration that they are borrowed.
On a more somber note, I wouldn’t count on the “next generation of historical linguists” to do any further work on the topic. Everything points to the current generation of historical linguists also being the last. I recently ran into a grad student in “Romance syntax” who thought common Romance syntax could only be explained as UG, since of course he had no idea that Romance languages have a common ancestor or that Romance languages have influenced one another (INSERT RANT HERE).
Finally, I’m not sure I understand what the problem is in the Penutian theory having first emerged in California, where the languages are phonologically simpler. Do you mean that the more Northern languages, with their richer consonant inventories, are more conservative, and hence more transparently related?
Trond Engen: actually, following the other guests’ exchanges makes this blog doubly interesting. I suspect I have learned more here about (for example) Slavic/Eastern European languages, literature and history than I could ever have learned on my own.
marie-lucie says

November 28, 2011 at 7:40 pm

Etienne: I wonder: Is a reconstruction of Proto-Tsimshianic required to reconstruct Proto-Penutian?
In the present state of knowledge, yes. At least it provides a very good start, since Tsimshianic is overall quite conservative (demonstrably so) and reconstruction can be done in some depth. I have been collecting as much data as possible on the other Penutian languages (at least the ones credibly placed in that group – it is not a definitive grouping and you don’t even find it any more in some of the reference works).
when can I order my copy of AN INTRODUCTION TO PROTO-PENUTIAN:…
Don’t hold your breath! First I have to get most of the PTsim done before I can convince most others. But I hope to make enough of a start that others will be able to build on it.
I am not as pessimistic as you are about the future of histling. It is currently at a low point because it is not fashionable, but most people are much more interested in language history than in theoretical games (hence the popularity of the likes of Merritt Ruhlen), and I think there will be a swing of the pendulum in the future.
Do you mean that the more Northern languages, with their richer consonant inventories, are more conservative, and hence more transparently related?
Yes. Not only that, but there are many cases where a single phoneme in a California language corresponds to a cluster (usually resulting from a reduced root) in some of the languages farther North. In Tsimshianic you sometimes find both the full root and the cluster. Of course, identifying the root in a lengthy word means that you need to have a good grip on the morphology, not just the phonology.
The lack of obvious correspondences between North and South languages has been one of the problems among linguists trying to compare Penutian languages, since they expect a one-to-one correspondence between single segments, and some of the correspondences are more complex, some of them quite unexpected. But if you look at PIE comparisons, there are also very unexpected correspondences, such as the initials in (I quote from memory) Latin humus, Russian zemya (or something quite similar), and h ~ z is not a correspondence one would find acceptable or even plausible at the beginning of an investigation since they are phonetically quite different.
(from your earlier post: *N and *M): these pronominal formants for “je” and “tu” are not present (singly or together) in all of the Penutian languages, there are several other bases for the same meanings, often shared with those of other language families.
I have learned more here about (for example) Slavic/Eastern European languages, literature and history than I could ever have learned on my own.
Me too, definitely!
languagehat says

November 28, 2011 at 7:48 pm

Everything points to the current generation of historical linguists also being the last. I recently ran into a grad student in “Romance syntax” who thought common Romance syntax could only be explained as UG, since of course he had no idea that Romance languages have a common ancestor or that Romance languages have influenced one another
I was on the point of shooting myself, but marie-lucie rescued me.
marie-lucie says

November 28, 2011 at 8:04 pm

I am glad you did not act on your impulse, LH. Think before you act! We all love you.
At the present time it is admittedly hard to advise students who want to study historical linguistics, since they will first have to go through a lot of required syntax courses, and there are historical linguists in some departments who would love to teach their specialty (or more of it) but have to teach other things within the prescribed programs. Things are not totally lost though, there are glimmers of hope! Programs do change, although slowly.
John Cowan says

November 29, 2011 at 2:18 pm

I too hope for the eventual resurrection of historical linguistics, perhaps outside the universities.
Etienne: Latin primus is a superlative that remained intact when almost all superlatives were levelled to use -issimus; compare English first, which is also a superlative of unusual form. Others in this group are citimus/proximus ‘nearest’ (cf. E next < nigh), extremus/extimus ‘outermost’, infimus/imus ‘lowest’, intimus ‘innermost’, post(r)emus ‘rearmost’, supremus/summus ‘highest’, ultimus ‘furthest/last’ (cf. E last < late), and the noun bruma ‘winter’ < brevis ‘short’. Now this superlative ending is, the last I heard, an Italo-Celtic innovation. If so, how can it have a Slavic cognate in /w/?
Etienne says

November 29, 2011 at 3:30 pm

Marie-Lucie: I’m afraid I cannot share your optimism on the future of Historical Linguistics within Academia. While it is indeed a field which interests students more than theoretical games do, theoretical games have a huge built-in advantage: they require little to no data/background knowledge.
And the Humanities in general seem to be moving away from direct engagement with data and more into groundless and baseless theorizing, with a strong anti-intellectualism pandering to certain socially approved types of prejudice.
As a consequence, much of the remaining interest in historical linguistics involves very distasteful identity politics, and when historical linguistics runs counter to what students want to believe/feel good about, then this interest quickly turns into hostility. I’ve seen that happen many times.
Finally, in an era of cutbacks, “research” done which does not involve fieldwork or archival research or anything to acquire/refine data will have a massive built-in advantage. It is telling that other data-heavy branches of linguistics, such as typology or dialectology, aren’t faring at all well either.
John Cowan: the problem with claiming that PRIMUS is an old comparative, with a specifically Italo-Celtic ending, is that you’d have to assume that the similarity not just to PERVIJ, but to forms such as Lithuanian PIRMAS (where a majority of the phonemes correspond regularly) is purely coincidental. I prefer to regard all three forms as cognate.
However, since I was looking for an example of the m/w alternation, Lithuanian PIRMAS/ Russian PERVIJ, ignoring the Latin form, would do just fine: I do not believe anyone has denied that those two forms are cognate.
Trond Engen says

November 29, 2011 at 4:01 pm

If we’re about to witness the resurrection here and now, count me in for a front row ticket. If that would include the reconstruction of the proto-language of a middle-sized New World language family and a reshuffle of the PIE consonant system explaining apparent morphological irregularities, I might even consider holding my coffee and donuts.
John Emerson says

November 29, 2011 at 7:39 pm

I recently ran into a grad student in “Romance syntax” who thought common Romance syntax could only be explained as UG, since of course he had no idea that Romance languages have a common ancestor or that Romance languages have influenced one another (INSERT RANT HERE)
Awhile back I saw a sophisticated statistical approach to historical linguistics which tried to redo the traditional, less-sophisticated work in the area. I didn’t follow it very far because it didn’t seem to recognize that Spanish and Portuguese are closely related. Wish I’d saved the ling.
David Marjanović says

November 29, 2011 at 8:34 pm

If historical linguistics dies out, the phylogeneticists among the biologists will reinvent it. Unfortunately they’ll do so pretty much from scratch. I think they’ve already begun.
I had no idea about that duality of PIE /m/. Are there more examples than that superlative suffix? Is PIE /m/ unusually common?

h ~ z is not a correspondence one would find acceptable or even plausible at the beginning of an investigation since they are phonetically quite different

What is more, their common ancestor, [gʲʱ], looks so absurd that lots of people have tried to explain its existence away (by arguing against the reconstruction of voiced aspirated plosives, palatalized velars, or both). Does it exist in any attested language at all? – And yet, as far as I can tell, all these attempts have failed.

However, since I was looking for an example of the m/w alternation, Lithuanian PIRMAS/ Russian PERVIJ, ignoring the Latin form, would do just fine: I do not believe anyone has denied that those two forms are cognate.

…but… …but… …that would require that the m/w or even m/v alternation remained active all the way into Proto-Balto-Slavic! Is that a parsimonious hypothesis?

Marie-Lucie: I wonder: Is a reconstruction of Proto-Tsimshianic required to reconstruct Proto-Penutian?
After all, Indo-European was reconstructed on the basis of (inter alia) Gothic. Scholars did not wait for a reconstruction of Proto-Germanic to be made available before using Gothic/Germanic data to tackle Indo-European. Likewise, Bloomfield’s 1946 reconstruction of Proto-Algonquian was based upon a mere four Algonquian languages.

It all depends on how narrowly you mean “required”. Reconstructing all intermediate nodes greatly increases the chance that you get it right. If you want to pick attested languages instead, you have to pick the “right” ones, and that can go wrong. The people who reconstructed PIE using Gothic more or less treated Gothic as Proto-Germanic because it was so much older than all other attested Germanic languages and had several features that fit their expectations of what Proto-Germanic and/or PIE must have looked like; in hindsight, Gothic really is quite similar to Proto-Germanic, but it was something of a gamble to assume so a priori, and it does have a couple of innovations that all other Germanic or indeed IE languages lack.
Biologists, I should mention, don’t reconstruct their phylogenetic trees top-to-bottom or indeed in any direction at all. Our algorithms reconstruct all nodes at once, as part of a single application of Ockham’s Razor.

If we’re about to witness the resurrection here and now, count me in for a front row ticket. If that would include the reconstruction of the proto-language of a middle-sized New World language family and a reshuffle of the PIE consonant system explaining apparent morphological irregularities, I might even consider holding my coffee and donuts.

Seconded!!!

Ah well, short of finding relatives of Indo-European, I doubt the above will ever be proved.

Well, then somebody should look for relatives of IE. Very few people have ever tried – and they’re all considered eccentrics or cranks, which at least some of them actually are.
In the words of just such a possible crank:

One final point: How “scientific”, how “rigorous” is our discipline when all of the strictest methodologies and some of the most brilliant minds can only solve the EASY problems. Families like Indo-European and Uralic are fairly transparent. Even educated laymen can see the relationships here — indeed this is how it all got started (Sir William Jones). Are our methodologies so flimsy and is our imagination so impoverished that everything falls apart once we reach an arbitrary threshold of 5,000 years B.C. (which just happens to coincide with the most commonly proposed date for Proto-Indo-European)? Where would we be if Biology, for example, were similarly constrained? We can (and must) do better.

http://listserv.linguistlist.org/cgi-bin/wa?A2=ind0010&L=nostratic&D=1&F=&S=&P=3768
Allan Bomhard, 7 Oct 2000
David Marjanović says

November 29, 2011 at 8:46 pm

more or less treated Gothic as Proto-Germanic

Brainstorming:
– August Schleicher’s reconstruction of PIE (1860s, right?) was scarily similar to Sanskrit, all the way to absence of /e o/ and use of *v for [w]. More recent reconstructions have progressively and steadily distanced themselves from Sanskrit.
– Transcriptions of reconstructions of Proto-Germanic still tend to look too much like the way Gothic was written. I’m talking about the use of *b, *d, *g for what must have been [β~v ð ɣ] in most, maybe almost all, environments. That makes Proto-Germanic look as if it had already performed part of the High German consonant shift!
John Emerson says

November 29, 2011 at 8:55 pm

How “scientific”, how “rigorous” is our discipline when all of the strictest methodologies and some of the most brilliant minds can only solve the EASY problems.
Heisenberg said “Quantum physics is easy, but turbulence is impossible”. An argument has been made that the sciences succeeded by finding relatively easy problems, and that the social sciences have been much less successful because the problems are harder.
Are our methodologies so flimsy and is our imagination so impoverished that everything falls apart once we reach an arbitrary threshold of 5,000 years B.C.(which just happens to coincide with the most commonly proposed date for Proto-Indo-European)?
Well, when you run out of data, things get harder. And an argument can also be made that for various reasons, in the same way that the weather is only predictable within a certain window*, maybe beyond a certain threshold language change can’t be known.
* Apparently, weather can be predicted a week or so into the future, and they’re working on long-term climate change, but past a week or two we can only know what the general range of possibilities is for that time of the year.
marie-lucie says

November 29, 2011 at 10:53 pm

While I disagree with most of the methods and results of Nostraticists like Allan Bomhard, I agree with his quotations here. In my opinion, historical linguistics, which had been triumphant in the 19th century, took a dive after Saussure switched from language evolution (diachrony) to the structure of language states (synchrony), and instead of providing a counterbalance to the purely historical approach, synchrony quickly became dominant and still is, while in many cases (at least in the US) diachrony is tolerated like an eccentric old aunt. Actually, Saussure switched his focus because he felt that histling had been ruined by its own success, and scholars were looking at tiny bits of language through a microscope and losing sight of the larger picture. This focus on tiny little bits and pieces is seen for instance in the work of the late Yakov Malkiel, who was famous for finding relationships between obscure words in widely dispersed Indo-European languages: these were ingenious exercises in linguistic virtuosity, perhaps comparable to carving on a pencil lead or a grain of rice, or building a boat in a bottle; they were not the type of work that would advance the discipline.
I don’t mean to denigrate the current historical linguists, many of whom (like Etienne) are doing very good work, since there is so much that is still unknown or poorly explained even in the best-known families, but it is true that most of the people known to be trying to go beyond PIE (to find relatives of this proto-language) are cranks, and some of them unfortunately are foisting worthless stuff on the unsuspecting public (like the “Proto-World” people). The more legitimate practitioners who are interested in the topic often don’t dare to come out for fear that they will be labeled cranks too and ruin their professional reputation. Personally I have less at stake than many, especially now that I am retired, so I don’t risk losing my job over my heretical opinions, but I know that some people think I am a crank myself because I dare to say: “What if?”.
What kind of a scientific approach is one where you cannot propose and test a hypothesis because you have put yourself in the straightjacket of a very narrow, rigid set of methods which you are not allowed to modify regardless of the circumstances, so that you give up before even starting? The great pioneers (Grimm, Rask, Verner, etc) invented and refined their methods as they struggled with the data, and there is no reason why we cannot keep doing it, expanding rather than dismissing the methods.
For instance, with m ~ w ~ b above: many will say that these sounds cannot be acceptable as phonological correspondences because they are phonetically too different (even though they all qualify as labials). End of discussion, end of this research thread. But if you search through the IE languages for more examples of such correspondences, you can find a number of them, eg Latin dative plurals in -b-is, Russian dative plurals in -m- (these are just a few that I remember). These correspondences are not very common, but they are not limited to just a few forms (eg JC’s examples) and cannot simply be dismissed as irrelevant oddities.
In conclusion, I am not convinced that historical methods have reached their pinnacle with the reconstruction of PIE or Proto-Algonquian, and I strongly disagree with the all-or-nothing approach which says (in the approximate words of someone I don’t want to name) “a language relationship is either obvious, with a rich skein of phonological correspondences, or it is forever unknowable”. I think that there is plenty to discover yet, and that the methods worked out by our predecessors have not yielded all their potential, which we can try to fulfill if we don’t limit ourselves to a narrow-minded interpretation of those methods.
Trond Engen says

November 30, 2011 at 7:08 am

From the LinguistList Daily Summary I get the impression that a great deal of the activity in the field is in historical linguistics and related fields. Is it a point that what you call theoretical games can be done in closed circles while historical linguistics gains comparatively more from the free exchange of ideas (informal contributions that will never be acknowledged as formal co-authorship and thus never turn up on the University rankings)? If so, one should see a renaissance of HistLing with the internet.
(The description of the defeat of research-based linguistics to “theoretical games” sounds awfully close to that of evidence-based economics. It’s paradoxical that when a field struggles to achieve theoretical depth, it throws out what _was_ scientific in collecting real data. It’s hard to imagine physics getting anywhere without the fruitful combination of both.)
languagehat says

November 30, 2011 at 10:35 am

Well, when you run out of data, things get harder.
Exactly. While I’m sympathetic to marie-lucie’s open-mindedness and desire to see more innovative work, the problem is exacerbated by what appears to be the division of scholars (and people in general) into those who are inherently conservative and want things nailed down before they’ll accept them and those who are inherently adventurous and are willing to accept suggestive correlations in place of anything resembling proof because of their horror vacui. I’m definitely of the former persuasion, and while I realize I am therefore liable to overlook promising lines of development because they seem too speculative, that is far preferable (in my view) to the haring off after any and all hypotheses that might further the cause of Proto-World that is so common in the latter. I would feel better about attempts to push further back in time if they were carried on by people who were less willing to overlook serious problems with phonetics, semantics, and data in general. (Who cares if this alleged dialect form scribbled down by an untrained visitor to the Himalayas in 1879 has only a vague resemblance to the Tibetan form I want to link it with? It fits my hypothesis and I’m running with it!)
languagehat says

November 30, 2011 at 10:37 am

(Note: Above comment may contain elements of exaggeration and/or caricature. Use with caution.)
Etienne says

November 30, 2011 at 2:26 pm

David: PIRMAS/PERVIJ is not a problem for me, since I do not believe there ever existed a Proto-Balto-Slavic language. However, along with PRIMUS, there is Gothic FRUMA as a good non-Slavic cognate. But I can’t think, off-hand, of other good examples.
Marie-Lucie: actually, no need to go as far as Latin initial /h/ versus Russian /z/ if you’re looking for a sound correspondence that isn’t obvious: just compare French YEUX and Spanish OJOS: no common phoneme, yet they each go straight back to Late Latin */oklos/.
Trond: it would be nice if there were a renaissance of historical linguistics thanks to the internet. But one thing which must be stressed is that many scholars do not publish/circulate their data…except when attacking someone’s proposal, that is. Joseph Greenberg, in an article “In defense of Amerind”, quite properly complained about this.
I agree that it is difficult to maintain the right balance: being open-minded enough to evaluate different claims on the one hand, and being skeptical enough not to accept every claim on the other. Two things that make me mildly skeptical are: 1-The fact that most of the proposed “cognates” are superficially similar, despite the fact that cognates, even in closely related languages, can be quite dissimilar (cf. the Franco-Spanish cognate I gave above), and 2-The fact that the long-rangers seem to know little about the observed effects of language contact. I once saw a list of proposed “cognates” between two language families which, it was claimed, HAD to be related. EVERY English gloss of the various “cognates” was a loanword in English itself, which certainly suggested that a contact explanation might account for the “cognates” between those two language families…
However, I am merely skeptical of what HAS been proposed: like Marie-Lucie, I am convinced that there remains a great deal to be discovered. And, Marie-Lucie, if the establishment stands in the way of your forthcoming INTRODUCTION TO COMPARATIVE PENUTIAN, this blog seems the perfect place, not only for some scholarly feedback, but also to learn how to produce a scholarly SAMIZDAT.
David Marjanović says

November 30, 2011 at 4:11 pm

Heisenberg said “Quantum physics is easy, but turbulence is impossible”.

Turbulence, actually, is just a brute-force problem. You need to throw heavy supercomputers at it. The math is stupendously complex – but it is known. The difficulties are only practical.
Similarly, phylogenetics in biology becomes a brute-force problem once you have assembled your data matrix. Accordingly, the data matrices have been growing for decades. In 1950, Willi Hennig had to calculate the shortest trees by hand…

An argument has been made that the sciences succeeded by finding relatively easy problems, and that the social sciences have been much less successful because the problems are harder.

Phylogenetics is confronted with extremely similar problems in linguistics and in biology. To claim one is a social science while the other is not strikes me as very superficial. The effects of borrowing (extremely common in linguistics) aren’t all that different from those of convergence (extremely common in biology).

Well, when you run out of data, things get harder.

Absolutely. The trick is you run out of data gradually. Why should it be all-or-nothing so that IE is uncontestable but Nostratic is untestable?

While I’m sympathetic to marie-lucie’s open-mindedness and desire to see more innovative work, the problem is exacerbated by what appears to be the division of scholars (and people in general) into those who are inherently conservative and want things nailed down before they’ll accept them and those who are inherently adventurous and are willing to accept suggestive correlations in place of anything resembling proof because of their horror vacui.

As a biologist, this kind of absolutist thinking gives me a culture shock. What happened to science theory? What happened to apportioning one’s strength of belief to the strength of the evidence?
Why do historical linguists still use the word “proof”, like mathematicians, logicians, creationists and nobody else? What happened to “evidence”? What happened to science theory?

(Note: Above comment may contain elements of exaggeration and/or caricature. Use with caution.)

That’s not the problem. The problem is you’re overlooking the diversity. Bomhard and the Moscow School Nostraticists are very different phenomena from each other*, and both use very different (and, well, more) methods than the Proto-World people who don’t do much more than say “ooh, shiny, look at this!”.
*…and reconstruct mutually incomprehensible Proto-Nostratics, grmpf. OK, maybe I exaggerate; my point is that Bomhard assumes glottalic IE while the Moscow School doesn’t.

I do not believe there ever existed a Proto-Balto-Slavic language

Oh. What do you propose instead? What are the closest relatives of Baltic and Slavic?

EVERY English gloss of the various “cognates” was a loanword in English itself

Ha! Awesome!
marie-lucie says

November 30, 2011 at 6:44 pm

Etienne, yes, Fr YEUX among Sp OJOS and It OCCHI is not what one would expect! but h ~ z applies to a number of examples which are otherwise compatible, which is why it can be considered a valid correspondence in spite of the unusual phonetics.
LH: when I say What if? I mean that one should be able to entertain a hypothesis and look for evidence for and against it, instead of dismissing it out of hand “because it could be wrong”: if X “could be wrong”, then perhaps it “could be right”, so how do you decide one way or the other? what type of evidence could be supporting the hypothesis? what could lead you to dismiss it? then you explore the various options and see where they lead. You learn from negative as well as from positive results. Some of the options will only lead to dead ends, or be indecisive, but if the hypothesis (or at least parts of it) is valid, other options will be found to converge with already known points, adding support to the hypothesis and suggesting further avenues of research, some of which may be quite surprising. Sure it would be nice to get everything nailed down, but that is not possible with a promising but complex subject: look at the PIE controversies still remaining after two centuries of research, which do not mean that the PIE hypothesis should be rejected in toto! There is no harm in asking What if? as long as you do not take the question for an answer, and you work through the data systematically and impartially, without skewing them to fit the answer you would like to get.
What I object to are rigid, unreflective attitudes such as: “you can’t do X because Eminent Scholar Y said you should not do so”, which could be good advice for a beginner unaware of the potential pitfalls of doing X, but not necessarily for a person with many years of experience.
David Marjanović says

November 30, 2011 at 8:37 pm

On another thread, MOCKBA just informed me that the Slavic word for “apple”, /jabl/- followed by what looks like a diminutive suffix and then either -/a/ (feminine) or -/o/ (neuter) depending on the language, may be cognate with apple, the initial /j/ being due to the general Slavic aversion to initial /a/. If that is true, and if the vowel correspondences fit (I don’t know), we have an attestation of what would be PIE *abel- outside of Germanic and Celtic.
Containing /a/ and /b/ (OK, maybe we can explain the /a/ away by postulating *h2, but that leaves the /b/), this root would almost have to be a loan. But, according to something I once read somewhere, some workers connect it to the Greek and Latin mal- (long /a/ in Greek, I forgot about Latin, and I forgot if the Latin is a loan from Greek; by “Greek” I mean Doric, Attic/Ionic has eta). If that’s true (again, I can’t even tell if the vowels fit, and I won’t look it up at half past 2 at night), we have another *b ~ *m alternation…
On the other hand:

Latin dative[/ablative] plurals in -b-is

The traditional explanation is that this is cognate with English by. If so, the common ancestor had *bh, not *b.
David Marjanović says

November 30, 2011 at 8:41 pm

Obviously, Germanic could have borrowed apple from Celtic, like it did with so many other words. But Slavic would have had to borrow it from Germanic immediately afterwards, before Germanic underwent Grimm’s Law. Borrowing it directly from Celtic may have been geographically feasible at a similarly early time.
Etienne says

November 30, 2011 at 11:17 pm

Marie-Lucie: having recently presented at my very first Americanist conference and been told my theory was wrong because the alpha male of the field disliked it, all I can say to your plea against rigidly following any orthodoxy is: HEAR HEAR!!
David: I don’t see why Baltic should be especially closely related to any single other branch: it has commonalities with Slavic as well as with Germanic, certainly. I believe Eric P. Hamp thought he had found evidence that Baltic had a particularly close relationship with Albanian.
And as far as I know it is universally accepted that Latin -BUS is cognate with Homeric Greek Instrumental-marking -PHI(N) and Sanskrit Instrumental plural -BHIS and dative plural -BHYAS, all of which consistently point to */bh/. As for “Apple”, Beekes reconstructs *H2eb-ol as the source of the Germanic, Baltic and Slavic words, but says nothing about Latin or Greek forms, which suggests he regards them as unrelated. But I freely admit your attempts at connecting them are very attractive.
John Cowan says

December 1, 2011 at 12:30 am

David M: If historical linguistics dies out, the phylogeneticists among the biologists will reinvent it. Unfortunately they’ll do so pretty much from scratch. I think they’ve already begun.
Indeed they have, and that’s a problem. Phylogenetic tree reconstruction works in biology because of two sets of facts linguistics does not have: the overall relatedness of all life (we don’t know that all languages are related), and access to DNA. When biologists try to reconstruct trees, they do things like assuming that two and erku must be different states of the ‘two’ character because they don’t look similar. It is only the comparative method that can tell us that the ‘two’ character is useless in Indo-European because there is in fact only one phenotype for it. Ringe & Co. get this right, and so far they seem to be the only ones who do.
Marie-Lucie: I have heard it said that Dixon controls everything to do with Australian languages: go against him, and your research effort is squeezed to death.
Etienne: the phylogenetic evidence isn’t that strong (or we would have figured out IE branching by traditional methods) but it’s fairly univocal: the traditional but contested Italo-Celtic and Balto-Slavic nodes were confirmed, as well as Indo-Iranian (which as far as I know has never been seriously objected to).
languagehat says

December 1, 2011 at 8:42 am

when I say What if? I mean that one should be able to entertain a hypothesis and look for evidence for and against it, instead of dismissing it out of hand “because it could be wrong”
Yes, I agree, but since I haven’t been involved with the field for decades and can’t independently judge most of the proposed relationships, all I have to go on is what I perceive to be the scholarly rigor of the person proposing them. If I get a sense that there is what I consider a proper attitude involved—an honest attempt to evaluate all evidence, whether it supports one’s idea or not, and a willingness to say “I have my doubts about this link, so it shouldn’t be used to support further hypotheses”—then I’m much more willing to go along with the proposed ideas. But my experience is that most of the people attracted to long-range hypotheses are not terribly interested in rigor and tend to seize on anything that looks promising and not ask too many questions. If you or Etienne (or, say, Don Ringe) propose a connection, I’m likely to take it very seriously because I know you’re careful about these things.
David Marjanović says

December 1, 2011 at 9:15 am

Phylogenetic tree reconstruction works in biology because of two sets of facts linguistics does not have: the overall relatedness of all life (we don’t know that all languages are related),

1) Isolated language( familie)s should, for instance, fall in very different places depending on what other languages are in the dataset.
2) The evidence for all languages being related is much weaker than the evidence that all organisms are related. But so what? Is there any evidence to the contrary? Monogenesis is the most parsimonious hypothesis. It should be our starting point.

and access to DNA.

LOL! You just overlooked practically all of paleontology! DNA makes it much easier to generate a data matrix (with morphology, that can be an entire PhD thesis), but that’s it. Compared to morphology it comes with its own set of advantages and disadvantages.

When biologists try to reconstruct trees, they do things like assuming that two and erku must be different states of the ‘two’ character because they don’t look similar.

If that’s a phonological character, they are two different states of the same character, because they’re homologous.
If it’s a lexical character, you’re wrong, too – whether two features could be homologous is a scientific question; lots of work has to go into it. Why would biologists ignore regular sound correspondences any more than they ignore the common pattern of bones in the paired extremities of vertebrates?
Really, the only differences between cladistics and the good old comparative method are that cladistics explicitly counts the assumptions each phylogenetic hypothesis ( = tree) requires, where “each” means “each of thousands to billions”.

or we would have figured out IE branching by traditional methods

How many people actually tried? I get the impression most IEists assume or used to assume that IE is a polytomy, with all major branches separating from each other at the same time. (Stark contrast to the Uralicists, who all seem to assume strict dichotomous branching.)
Thanks for the link to the CPHL Project. Free pdfs! Whee! 🙂
David Marjanović says

December 1, 2011 at 9:20 am

must be different states of the ‘two’ character

Something similar is wrong with the famous Nature paper – I’ll get to it, but I have to run now.
John Cowan says

December 1, 2011 at 1:53 pm

Sorry, David, big fat editing error on my part. I meant “When biologists try to reconstruct language trees”, they do things like this.
Why would biologists ignore regular sound correspondences any more than they ignore the common pattern of bones in the paired extremities of vertebrates?
Purely out of ignorance and intellectual imperialism. I wouldn’t be astonished to hear of Malayic as a sister group to Italic, on the basis of Malay dua ‘two’, though the same people would laugh to scorn any attempt to associate birds with flies because they both have only two wings. (Okay, I exaggerate, but not much.)
In my reference to ‘two’, I meant the lexical character. In fact it is particularly mentioned as a useless character in the CPHL papers because there has been no lexical replacement of it in their IE language sample (or anywhere else in IE that I know of).
As for the polytomous IE tree, that just reflects explicit lack of knowledge: “We don’t know how they branched, so we’ll draw this ten-branched tree. Or maybe we’ll separate Anatolian from the rest and put in Indo-Iranian as a node, since nobody can hate on us very much for those.” But yes, very many people have tried to provide dichotomous trees for IE, and no two of them have agreed, so the whole subject has been stuffed under the rug. Consequently, when someone like Ringe comes along with a better (but unfamiliar) methodology that looks (but isn’t) much like what some quacks have been doing, they too are quietly considered quacks and stuffed under the same rug.
This is what happens to a scientific discipline, I fear, when it’s not widely practiced: the old Turks control everything, and the only rewards are for toeing the line. Instead of pursuing new ideas openly as a young scientist, you have to wait till your retirement to publish (as Marie-Lucie has had to do). No one will listen to you then either, but at least they can’t take away your pension.
David Marjanović says

December 1, 2011 at 2:48 pm

I mean Atkinson & Gray (2003): Language-tree divergence times support the Anatolian theory of Indo-European origin, Nature 426, 435 – 439. That paper treats the presence or absence of each cognate set as a separate character, even when different cognate sets share a meaning. Such cases must be coded as a single character with more than 2 states. Failure to do so will not only inflate some support values of the tree, but also greatly inflate the results of the subsequent molecular-dating analysis. I don’t know why the reviewers failed to notice; probably, like most reviewers, they simply didn’t care about the supplementary information. <headdesk>
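To make the coding issue concrete, here is a minimal sketch of the two ways of turning cognate judgments into characters. The languages, meanings and cognate-class labels are invented for illustration, and the function names are hypothetical; this is not the coding script of any published analysis.

```python
from collections import defaultdict

# Hypothetical toy data: for each meaning, the cognate class each language uses.
# Language names and class labels are invented for illustration only.
COGNATE_CLASSES = {
    "hand": {"LangA": "c1", "LangB": "c1", "LangC": "c2", "LangD": "c3"},
    "water": {"LangA": "c1", "LangB": "c2", "LangC": "c2", "LangD": "c2"},
}


def binary_coding(data):
    """One presence/absence character per cognate set.

    Cognate sets that share a meaning become separate columns, so those
    columns are not independent of each other.
    """
    matrix = defaultdict(dict)
    for meaning, classes in sorted(data.items()):
        for cset in sorted(set(classes.values())):
            for lang, used in classes.items():
                matrix[lang][f"{meaning}:{cset}"] = 1 if used == cset else 0
    return dict(matrix)


def multistate_coding(data):
    """One character per meaning; the state is the cognate class used."""
    matrix = defaultdict(dict)
    for meaning, classes in sorted(data.items()):
        for lang, used in classes.items():
            matrix[lang][meaning] = used
    return dict(matrix)


if __name__ == "__main__":
    binary = binary_coding(COGNATE_CLASSES)
    multi = multistate_coding(COGNATE_CLASSES)
    # Same two meanings: 5 binary characters versus 2 multistate characters.
    print(len(binary["LangA"]), len(multi["LangA"]))
```

In the binary version a single lexical replacement in "hand" flips two columns at once (one set is lost, another gained), which is the kind of correlation between characters that the multistate coding avoids.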

I meant “When biologists try to reconstruct language trees”, they do things like this.

That’s exactly how I understood it. Even Atkinson & Gray didn’t do that. Remember, they’re phylogeneticists, not pheneticists.

I wouldn’t be astonished to hear of Malayic as a sister group to Italic, on the basis of Malay dua ‘two’

That’s a single character. The very point of doing phylogenetics as a science is to get away from the old scenarios that were based on declaring one or a few characters “important” and deliberately ignoring all the rest. Data matrices for phylogenetic analysis tend to have hundreds or (if the data are molecular) thousands of characters.
If you add data, the signal will add up, while the noise – which is random – will cancel itself out. Just make sure you don’t have correlated characters in your dataset (Atkinson & Gray did).

In my reference to ‘two’, I meant the lexical character. In fact it is particularly mentioned as a useless character in the CPHL papers because there has been no lexical replacement of it in their IE language sample (or anywhere else in IE that I know of).

Agreed: presence/absence of this cognate set is useless – parsimony-uninformative – within IE.
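For readers unfamiliar with the term, a rough sketch of the usual textbook test: a character is parsimony-informative only if at least two of its states each occur in at least two taxa. The function name and the toy state lists below are assumptions made up for illustration.

```python
from collections import Counter


def is_parsimony_informative(states):
    """Return True if at least two states each occur in at least two taxa.

    `states` holds one state per taxon; None marks missing data.
    """
    counts = Counter(s for s in states if s is not None)
    shared_states = [s for s, n in counts.items() if n >= 2]
    return len(shared_states) >= 2


if __name__ == "__main__":
    # Every language has the same cognate set for the meaning (as with 'two' in IE).
    print(is_parsimony_informative(["c1", "c1", "c1", "c1"]))
    # Two cognate sets, each shared by two languages.
    print(is_parsimony_informative(["c1", "c1", "c2", "c2"]))
```

A character on which all sampled languages agree costs zero changes on every possible tree, so it cannot favour one tree over another.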

As for the polytomous IE tree, that just reflects explicit lack of knowledge: “We don’t know how they branched, so we’ll draw this ten-branched tree. Or maybe we’ll separate Anatolian from the rest and put in Indo-Iranian as a node, since nobody can hate on us very much for those.”

I’ve often encountered statements to the effect that a particular root must have been present in PIE because it’s present in 3 or 4 of the 10 branches. I’ve rarely seen a mention of the possibility that such cases could be innovations of a branch that includes those 3 or 4 branches but not most or all others. That’s what I mean. The explicit lack of knowledge acquired a life of its own and turned into a vicious circle.

But yes, very many people have tried to provide dichotomous trees for IE, and no two of them have agreed, so the whole subject has been stuffed under the rug. Consequently, when someone like Ringe comes along with a better (but unfamiliar) methodology that looks (but isn’t) much like what some quacks have been doing, they too are quietly considered quacks and stuffed under the same rug.
This is what happens to a scientific discipline, I fear, when it’s not widely practiced: the old Turks control everything, and the only rewards are for toeing the line. Instead of pursuing new ideas openly as a young scientist, you have to wait till your retirement to publish (as Marie-Lucie has had to do). No one will listen to you then either, but at least they can’t take away your pension.

Quoted for truth.
Bathrobe says

December 1, 2011 at 10:07 pm

I took one little course in historical linguistics at university and didn’t give it much effort because I thought it was irrelevant and fuddy-duddy compared with GG. Now I find I can’t follow what Etienne and marie-lucie are talking about, and I regret having closed my mind.
The older I get, the more I realise how important history is to just about everything. Of course, the sad thing about history is that it’s easily taken over by bigots and cranks (and governments) who use it to justify their own positions. But to repeat: I very much regret not taking historical linguistics seriously when I had a chance to learn something from a very erudite lecturer.
marie-lucie says

December 1, 2011 at 11:23 pm

When I first expressed an interest in linguistics, I was told to read the textbook by Charles Hockett (this was around 1962). I zipped through chapters on phonetics, phonology, morphology and syntax (pre-Chomsky, the latter was still rudimentary), but drew a blank on historical linguistics, which looked extremely difficult. I had no idea it would become my main interest some time later.
I have not looked at the book in decades, but in hindsight I think that Hockett, not being a historical linguist, did not know how to present the major concepts and sample data in a manner intelligible to beginners, and packed the chapter with diagrams and charts of correspondences which made the topic forbidding to readers like me who were on their own. I find that a lot of linguistics textbooks still suffer from a similar problem, with historical chapters apparently completely different in their approach from the chapters on the core technical areas. Of course, there has to be some difference between the presentation of synchrony and diachrony, but the two should be aspects of the same thing, not totally different. I wonder if descriptive biology (if there is such a thing) is presented as totally different from evolutionary biology.
marie-lucie says

December 1, 2011 at 11:37 pm

DM: I’ve often encountered statements to the effect that a particular root must have been present in PIE because it’s present in 3 or 4 of the 10 branches.
I think that this type of statement is backward: reconstructionists should not assign a root to PIE if it is not present in at least 3 or 4 of the 10 branches, and these branches must not all be adjacent to each other but include some from both East and West, otherwise the resemblances could be used to later diffusion (from innovation or borrowing) in one particular area. This sort of misunderstanding stemming from ill-digested principles and inadequate training (eg a single one-semester course) is unfortunately widespread.
Etienne says

December 2, 2011 at 12:19 am

Marie-Lucie, Bathrobe: my first Historical Linguistics class was taught by a soon-to-retire professor who was such a ghastly teacher that many students who might have taken an interest in the field turned their backs on it.
In retrospect I understand him a little better: I imagine that someone who has seen his chosen field lose so much ground in his lifetime thought that by the time interest in the field would start again he would no longer be around to see it, so why bother teaching it decently? It makes a sad kind of sense, but compounds the problem (of the decline of historical linguistics) of course…
David, Marie-Lucie: it was Antoine Meillet who proposed (In his INTRODUCTION À L’ÉTUDE COMPARÉE DES LANGUES INDO-EUROPÉENNES, I believe) that a root had to be present in at least three non-contiguous branches of Indo-European for it to legitimately be reconstructed back to Proto-Indo-European. Importantly, the root couldn’t just be a “look-alike”: phonologically it had to conform to known correspondences of Indo-European phonemes in whatever branches it was found.
As for this principle being misunderstood…a certain “Africanist” actually quoted this “three-branch” rule of Meillet’s in order to prove that all African languages derive from Ancient Egyptian: if he found a vague similarity between a given Ancient Egyptian word/morpheme and its semantic equivalent (loosely defined) in more than three African languages, well, that proved his thesis.
David Marjanović says

December 2, 2011 at 6:58 am

I wonder if descriptive biology (if there is such a thing) is presented as totally different from evolutionary biology.

There is no such thing. “Everything is the way it is because it got that way”; “nothing in biology makes sense except in the light of evolution”. Not every paper that describes a new species or a new anatomical feature will go to great lengths to reconstruct the evolutionary history that is implied, but it will always imply it.
Can’t find a citation for the quotation that turns all “why” questions into “how” questions, but it’s by J. B. S. Haldane, famous for embryology and the “inordinate fondness for beetles” quote.
Dobzhansky, Th. 1973. Nothing in biology makes sense except in the light of evolution. The American Biology Teacher 35: 125–129.
Gould, G. C. & MacFadden, B. J. 2002. Gigantism, dwarfism [argh], and Cope’s rule: “Nothing in evolution makes sense without a phylogeny”. Journal of Vertebrate Paleontology 22 (supplement to issue 3 – abstracts of the 2002 meeting of the Society for Vertebrate Paleontology): 60A.
Gould, G. C. & MacFadden, B. J. 2004. Gigantism, dwarfism [argh again], and Cope’s rule: “Nothing in evolution makes sense without a phylogeny”. Bulletin of the American Museum of Natural History 285: 219–237.

Importantly, the root couldn’t just be a “look-alike”: phonologically it had to conform to known correspondences of Indo-European phonemes in whatever branches it was found.

Of course; maybe I should have mentioned this. What I mean is that if a root isn’t attested in Anatolian, you cannot reconstruct it as present in what is usually called PIE.
(…unless you accept some kind of Nostratic and find it elsewhere in Nostratic, with the regular sound correspondences that you accept. In that case, you’re allowed to propose that it was lost in Anatolian or that it was present and is simply not attested in what little material we have of Anatolian. Otherwise, that would be vain speculation – not wrong, but untestable.)
Similarly, and I find this very important, if a root – or the result of a sound shift, or a grammatical innovation, whatever – is present in Germanic, Greek, Baltic and Sanskrit but not elsewhere, you can’t even safely reconstruct it all the way back to the other proto-language that is sometimes called PIE, because if, for example, the admittedly strange-looking tree at the CPHL website is right, that root is an innovation of the non-Italo-Celtic branch until further notice. Indeed, such innovations are pretty much the only source of data we can have for reconstructing IE intrarelationships.
David Marjanović says

December 2, 2011 at 7:11 am

Descriptions of a new dinosaur very often contain a phylogenetic analysis to determine its position in the tree.
David Marjanović says

December 2, 2011 at 7:33 am

…What did use to happen a lot was that molecular biologists made grand pronouncements about evolution even though they had no idea of the fossil record or of macroscopic anatomy. But even this is improving. Quite impressive collaboration papers have been published recently.
marie-lucie says

December 2, 2011 at 8:37 am

m-l: these branches must not all be adjacent to each other but include some from both East and West, otherwise the resemblances could be used to later diffusion
oops, I meant could be due to later diffusion …
DM: “nothing in biology makes sense except in the light of evolution”
It used to be the same in much language study also, prior to the spectacular rise of synchrony.
John Cowan says

December 2, 2011 at 5:49 pm

Hat, don’t shut this thread down if you possibly can, even though it’s being spammed. Important conversation here.
David:
For the record, Haldane didn’t actually say the deliciously ironic “inordinate fondness”; some anonymous person improved his actual remarks, which were made on a variety of occasions. He did, however, mention both beetles and stars.
[I]f a root […] is present in Germanic, Greek, Baltic and Sanskrit but not elsewhere, you can’t even safely reconstruct it all the way back to the other proto-language that is sometimes called PIE, because […] that root is an innovation of the non-Italo-Celtic branch until further notice.
Well, logically there are three possibilities: that it is a non-Italo-Celtic innovation, that its loss is an Italo-Celtic innovation, or that its absence from Italo-Celtic is an accident of the data. Historical linguists generally assume the third case by convention unless there is definite evidence otherwise, given how fragmentary the record actually is. In particular, with the entire Anatolian branch extinct and the record so lacking, PIE roots that don’t appear in it are generally not demoted a level, though logically they should be. “The most rapidly changing language in Indo-European is PIE.”
Trond Engen says

December 2, 2011 at 7:19 pm

As an utter layman my impression is admittedly superficial, but I don’t share this feeling that there’s only one way to reconstruct PIE. A general cautiousness is always expressed, but it doesn’t play out exactly the same way in any two linguists. What I get is that every major player in the field has his* own preferred reconstructions, his own inferences from grammatical innovations and his own hypothesis for a family tree. They may all still be kings of their own academic hill and hostile to innovative newcomers, I wouldn’t know, but as long as they differ from one another there’s enough interesting nuance for someone like me to get an idea of what is known, what is inferred, and where the intriguing cracks are.
*) No ‘hers’ among the major players, unfortunately. That’s Penutian. But I do have some hope for Jóhanna Barðdal’s work with grammar.
John Cowan says

December 2, 2011 at 10:33 pm

Oh, and dwarfism is right. Dwarves is an analogy, not a homology, though it’s an analogy that has caught on, thanks to Tolkien, for the fantasy race (but not for short humans, who are still called dwarfs).
John Cowan says

December 3, 2011 at 1:29 am

A comment from the Cluj thread by Joel of Far Outliers, also quoted for truth:
I spent a frustrating postdoc year in Romania in 1983-84 trying to get Romanian perspectives on the Balkan Sprachbund and substrate effects more generally. (My dissertation was on Sprachbund effects–including wholesale word order changes–among Austronesian languages in New Guinea, the only place you can find verb-final AN languages.)
My Romanian advisor was a timid and unhelpful Albanian specialist. (And he was disappointed that I was no Eric Hamp.) No one wanted to talk much about Slavic influence or, God forbid, Slavic substrates. In fact, some seemed to believe the Balkan Sprachbund idea was part of a German-inspired plot to justify taking over the whole Balkan Peninsula.
All research in historical linguistics seemed to have an irredentist agenda. The putative Dacian substrate was useful because it antedated any Hungarian presence in Transylvania. And the putative Illyro-Thracian substrate was useful because it antedated any Slavic presence in the Balkans. I remember wading through a book of some professor’s dry etymological articles, only to find conclusions that took care to note that the presence of Romanian place names in then-Yugoslavia (Mt. Durmitor was one such, IIRC) proved that Romanian speakers had been there first. Well, good for them. I don’t care.
It was a lousy year for research but a fun year for language-learning.
Bathrobe says

December 3, 2011 at 7:11 pm

an analogy that has caught on, thanks to Tolkien
Just wondering if this is really Tolkien’s influence. I’ve been saying ‘Snow White and the seven dwarves’ since well before I read Tolkien. Although this is obviously not an argument against Tolkien’s influence, it seems to me more likely that what Tolkien did was make the spelling ‘dwarves’ respectable. For instance, I also say ‘rooves’ but write ‘roofs’, because ‘rooves’ is frowned on orthographically.
David Marjanović says

December 3, 2011 at 7:14 pm

Oh, and dwarfism is right.

The term nanism exists; I don’t know why Gould & MacFadden didn’t use it, especially given the fact that they also use gigantism instead of the, alas, also existing term giantism.

Dwarves is an analogy, not a homology

I know. German Zwerg makes it pretty obvious that dwarf is a word with [x]-to-[f] shift which has had its spelling changed, unlike laugh.
One of the two shaft homonyms is another such case. German distinguishes Schaft “shaft of a tool, arrow, spear, tall boot…” from Schacht “deep vertical hole”.
The z is fascinating for another reason. The High German consonant shift turned dw into tw, and, in Middle High German, this tw was sent through the High German consonant shift again, yielding zw and complete absence of tw in native morphemes in the modern language.* I forgot where I read that, but it was probably linked to from this blog – we once discussed the etymology of dwarf and the manifold spellings of its ancestors (featuring yogh and such).
* And loans apparently got reinterpreted. I think Quark “cottage cheese” (a word used in much of Germany) and perhaps Quargel “milk-based abomination from somewhere in Austria” are from Slavic, such as Polish twaróg “cottage cheese”; if so, [tv] was reinterpreted as the [kv] that occurs in native words. That would be reminiscent of how the Bohemian royal dynasty of Luxembourg became Lucemburský, foreign [ks] being reinterpreted as native [ts].
David Marjanović says

December 3, 2011 at 7:25 pm

foreign [ks] being reinterpreted as native [ts]

For an example of the opposite direction but the same reasons, see et cetera in English. Though I suppose the frequent misspelling ect. contributes, even if only by way of a reinforcing circle.
John Cowan says

December 4, 2011 at 12:16 am

Not that dw is doing so well in Low Germanic either. In English, we have only dwarf, dwell, dwindle in the standard language. Dwell has undergone semantic shift from ‘lead astray’ > ‘deceive’ > ‘delay’ > ‘linger’ (as in dwell (up)on) > ‘make a home’.
In Scots and the dialects, the OED adds dwa(l)m ‘swoon’, dwang ‘reinforcing timber between joists or struts’, dway ‘deadly nightshade’, dwile ‘floor-cloth, mop’, dwine ‘pine, waste away’ (of which dwindle was originally a sort of diminutive) as still current. Modern times have given us dwimmerlaik ‘thing of sorcery’ (Tolkien again, last appearing around 1400 as demerlayk, demorlayk in the sense ‘sorcery’) and dweeb ‘nerd’ of unknown origin, possibly connected with feeb.
In a rather smaller Dutch dictionary, I find dwaal ‘wander’, dwaas ‘fool(ish)’, dwalen ‘err’, dwang ‘coercion’ (cf. English dwang above), dwarrelen ‘swirl’, dwars ‘transverse’, dwaze ‘foolish’, dweep-, for which I have no gloss but seems to have to do with fanaticism, dweil ‘cloth’, dwerg ‘dwarf’, dwingen ‘compel’. A few more than in English (and with a great many more productive compounds) but surely not many.
David Marjanović says

April 12, 2018 at 5:15 am

Not Haldane. “Everything is the way it is because it got that way” is by D’Arcy W. Thompson in his huge, and hugely speculative (as it turned out 90 years later), book On Growth and Form (1917).
j. says

April 12, 2018 at 9:54 pm

Surely we can round that to an even century by now, or was there some particular related development around 2007?

On /dw/ in English, the proper names Dwayne and Dwight seem to be stable and well-known enough.
Y says

April 12, 2018 at 10:26 pm

Dwayne had its heyday in the 1960s and 1970s, followed briefly by Duwayne/Dewayne, arguably motivated by dw- avoidance.
David Marjanović says

April 13, 2018 at 5:35 am

was there some particular related development around 2007?

Yes, the science of development genetics really took off in the early 2000s.

Dwight is well known only because of Eisenhower. It’s really not common.

Duwayne/Dewayne

Also Duane.
Lars (the original one) says

April 13, 2018 at 5:57 am

Du dwars/dwalen/… — some of these represent original /tw-/; Danish has tværs and dvæle, for instance.

Modern German has quer = ‘crosswise’ which, pace 2011 David, seems to represent internal /tv-/ > /kv-/ instead of /tsv-/ as in Zwerg. And was borrowed as E queer. The things you learn.
John Cowan says

April 13, 2018 at 6:42 am

Du dwars/dwalen/… — some of these represent original /tw-/; Danish has tværs and dvæle, for instance.

English thwart shows that the word originally began with /θw/, which became /tw/ in North Germanic and /dw/ in West Germanic (except English/Scots).
juha says

April 13, 2018 at 6:54 am

Dwight is well known only because of Eisenhower. It’s really not common.

Dwight Bolinger.
Brett says

April 13, 2018 at 7:18 am

Dwight peaked in the 1950s at about 500 per million boys and has dropped off precipitously since then.
Lars (the original one) says

April 13, 2018 at 7:21 am

/θw/ — unvoiced, at least. But the doublet thwart and queer in English is something you couldn’t invent, only reality could come up with it!
languagehat says

April 13, 2018 at 7:25 am

Also Duane.

Monosyllabic, at least for me.
David Marjanović says

April 13, 2018 at 8:45 am

seems to represent […] /tv-/ > /kv-/ instead of /tsv-/ as in Zwerg

Correct, and that makes it a doublet with Zwerchfell “diaphragm” (between chest cavity and abdominal cavity). I wonder if MHG tw- was so rare that it was eliminated not by a regular sound change or two, but by each word, one after another, joining a more common lexical set until there weren’t any left. This seems to have happened or be happening to the CURE vowel in various Englishes, as recently discussed on LLog starting here.
John Cowan says

April 25, 2019 at 9:52 am

dweep-, for which I have no gloss but seems to have to do with fanaticism

Found it: dwepen v. ‘idolize, rave, be fanatical’. Alas, there is no etymological information in WNT, and Wikt doesn’t have the word, but WNT does show that zinneloos zijn ‘be senseless’ is the oldest definition available to them. The KNAP Etymologiebank (one of those Dutch-as-misspelled-English things, like januari and februari) says the origin is obscure, with the only known cognate East Frisian id. It also points to PIE dheubh- in its variant form dhuebh > doof, but I think that is what you call drawing a bow at a venture, missing the venture, and hitting the king (Rufus of England, specifically).
David Marjanović says

April 25, 2019 at 5:41 pm

Dutch-as-misspelled-English

In this case it’s perfectly spelled German. 🙂

dhuebh > doof

Given the High German cognate taub and the English deaf, shouldn’t we simply start from an o-grade *dhoubh-?
Trond Engen says

April 25, 2019 at 6:04 pm

Seems right. Norw. dauv, now mostly in the compound dauvhørt “hard of hearing”. In the meaning “deaf” it’s all but replaced by Dano-Norwegian døv.
PlasticPaddy says

April 26, 2019 at 9:05 am

Re dwine, this has a Dutch cognate embedded in verdwijnen (to disappear). In German this appears to have been replaced (by unrelated verschwinden, verduften etc.).
David Marjanović says

April 26, 2019 at 6:13 pm

Yes. (Verschwinden is general, verduften is regional and more like “piss off”. Schwinden alone is “to diminish”; tuberculosis as “consumption” used to be Schwindsucht.)
Stu Clayton says

April 26, 2019 at 10:15 pm

Geschwind, Geschwindigkeit don’t plug in here, according to the details in Grimm. I thought maybe a leetle beet, but no.
David Marjanović says

July 12, 2019 at 4:56 am

The term nanism exists; I don’t know why Gould & MacFadden didn’t use it, especially given the fact that they also use gigantism instead of the, alas, also existing term giantism.

They were trying to distinguish phenomena in evolution from the pathologies called nanism and, strangely enough, giantism in English-speaking medicine.
David Marjanović says

July 6, 2022 at 6:30 pm

Schwinden alone is “to diminish”

“Dwindle” is what I was looking for.
John Cowan says

September 12, 2022 at 8:20 am

In Scots and the dialects

Under dw- the DSL has some great examples and some particularly perverse spelling oddities: dwmlawit ‘laid before the court’, anyone? It’s ON dómlagðe, preterite of dómleggja, but is usually spelled more rationally with dum-. But the only two words the DSL really adds, each with just a few examples, are dwerch and dworce, which will need no glosses for the careful readers of this page.

I marvel again at the doublet status of thwart/queer.
M says

September 12, 2022 at 11:59 am

@ David Marjanović “the Bohemian royal dynasty of Luxembourg became Lucemburský, foreign [ks] being reinterpreted as native [ts].”

Czech Lucemburský (where c represents /c/) and related words are likely to derive from Lëtzebuergesch (where tz represents /c/) and related words.

Luxembourgish and related words with x (representing /ks/) are likely to derive from Medieval Latin Luxemburgensis, which is derived from Lëtzebuergesch.

If so, the sound change occurred in Medieval Latin rather than Czech.
January First-of-May says

September 12, 2022 at 7:54 pm

In the (very extensive, though still incomplete) Wordle list of five-letter words, the dw- section contains dwaal, dwale, dwalm, dwams, dwang, dwaum, dweeb, dwile, and dwine, which accounts for all the five-letter English words in John Cowan’s comment (dwaum is yet another alternate spelling of dwa(l)m).

The odd one out is dwaal “an absent-minded state”, apparently borrowed from Afrikaans, which makes it a direct descendant of Dutch dwaal “wander” from the same comment.

Wiktionary adds dweet “drunk tweet”, dwelf “dwarf-elf hybrid”, and dword “(programming) double word, usually = 4 bytes”, plus (outside of the five-letter category) dweomer, dwimmer “magic” (the root of dwimmerlaik, which itself is presumably too nonce for Wiktionary), a bunch of derived forms, compounds, abbreviations, and proper names, two playful misspellings of “dragon”, and the mineral “dwornikite”.
(In addition, there are apparently [at least] two famous people surnamed Dworkin, namely Ronald Dworkin the legal philosopher and Andrea Dworkin the radical feminist, whose supporters are apparently called Dworkinians and Dworkinites respectively.)
Brett says

September 12, 2022 at 9:21 pm

Dweomer, previously.

(Also discussed a bit in that thread was the fact that Lloyd Alexander had translated Sartre, which came up again just now.)
John Cowan says

January 24, 2023 at 1:25 pm

There’s another example of /kv~tsv/ confusion in Wikt s.v. quartz < NHG Quarz, but in Middle East High German it took two forms, quarz and zwarc. The ultimate origin is West Slavic: Polish twardy ‘hard’, Czech tvrdý ‘id.’ < Proto-Slavic *tvьrdъ, so this shows /tv/ going down both paths within German and then the /tsv/ path being eliminated in favor of /kv/. H/t Indo-Europeanist student Daphne Preston-Kendall.
David Marjanović says

January 24, 2023 at 2:34 pm

I had no idea.

The final [s] doesn’t make sense, though.
John Cowan says

January 2, 2024 at 10:42 am

dword “(programming) double word,

That really doesn’t belong here except orthographically, as it is pronounced “DEE-word”. At least, I have never heard any other pronunciation. It’s mostly a Windows thing nowadays.
J Pystynen says

January 2, 2024 at 2:47 pm

On second reading of the thread: an interesting bird name that I was expecting might have been brought up here but wasn’t, is treepie for a group of Southeast Asian corvids (Crypsirininae). Perhaps this shows that while plain pie for Pica is obsolete in modern English, magpie may remain analyzable regardless just on phonotactic grounds? (Or then, as is entirely possible, the name might also derive from someone with awareness of Middle English or therearound.)
David Eddyshaw says

January 3, 2025 at 2:26 pm

For instance, with m ~ w ~ b above: many will say that these sounds cannot be acceptable as phonological correspondences because they are phonetically too different (even though they all qualify as labials)

Nõotre màawó “leaf” corresponds regularly in all respects to Gulmancema fàagū “leaf.” The proto-Oti-Volta initial was *v, which was regularly devoiced in Gurma, and which became *b in Nõotre in the first instance; the further shift *b -> m before a nasal vowel is regular in Nõotre too. (POV vowel nasalisation is lost in Gulmancema.)
