How to Make a Linguistic Theory.

October 26, 2018 by languagehat 307 Comments

Via Bathrobe, who says “Apparently written by Ken Miner,” How to Make a Linguistic Theory, by Metalleus:

Assemble a judicious amount of grammar, preferably English grammar since you’re aiming at readers of English. (If you feel there might be a market for linguistic theories written in Cebuano, by all means, give it your best shot.) Be sure to include passive constructions, accusative-with-infinitive constructions, and constructions with front-shifting. Leave everything else to future research (don’t worry, you’ll never have to actually do it).

Set up two levels of linguistic representation; call them Level 1 and Level 2, or even better, Level Alpha and Level Beta. This is to divide your explicanda into two conceptual domains so you can let one explain the other. Leave these levels and all constructs supporting them undefined; these will be your Theoretical Primes. Define everything else, however, not only as rigorously as possible but using as many symbols from the predicate calculus as you can understand.

Be sure to leave undefined the notion mu. Now make mu a unit at both undefined levels. For each mu use ordinary English spelling, but in upper case letters on one level, and in lower case letters on the other. Use abbreviations with upper case; for example, ERG, PRO, +ITAL for ergative, pronominal, borrowed from Italian.

From this point on you need a graphics expert. Draw guitar strings (don’t call them that, of course) from units on one level to units on the other level. Count and classify the various arrangements of strings you need for the amount of grammar you began with; then pronounce all other logically possible arrangements of strings forbidden by Universal Constraints. Give each constraint a handy name, such as The Adjustable Bridge Constraint, or The Open-String Pull-Off Constraint. Always capitalize and use the with constraints.

At this point it will be proper, though not absolutely necessary, to bung in a bit of data from other languages. Since ultimately theories like yours can be constructed only by trained linguists who speak natively the languages they are examining, frankly, the Second Coming will be upon us well before you’ll really have to think seriously about other languages. […]

I’ll send you to the link for the exciting conclusion. I do love linguistics snark!

Comments

J.W. Brewer says

October 26, 2018 at 8:18 pm

Arguably more subtle than some of Miner’s other contributions to the field, such as http://specgram.com/CDoLP/06.miner.definitions.html ?

Miner apparently became an Esperantist at an early age, which strikes me as consistent with either having a very good sense of humor or having no sense of humor at all, excluding only the middle ground between those extremes.
January First-of-May says

October 26, 2018 at 9:23 pm

If you feel there might be a market for linguistic theories written in Cebuano, by all means, give it your best shot.

Ironically enough, Cebuano is currently the language with the second largest version of Wikipedia, as measured by article count (the first largest is still English, though only by less than 7%).

The culprit is one Lars Sverker Johansson, who wrote a program (known as Lsjbot) for making loads of Wikipedia articles, with contribution from his wife Smiley for the use of Cebuano in particular – it being her native language.
(The third largest version, for much the same reason, is the Swedish one – Swedish being of course the native language of Lars himself.)
languagehat says

October 26, 2018 at 9:31 pm

Fascinating!
SFReader says

October 26, 2018 at 11:56 pm

You know there are people who read Playboy for articles.

I am like that – I read linguistic books for examples from various exotic languages.

Literally – I don’t read anything else in them, since I wouldn’t understand anything anyway…
Stu Clayton says

October 27, 2018 at 1:14 am

Lingueclecticism.
AJP Crown says

October 27, 2018 at 3:38 am

I often read Swedish Wikipedia articles when there’s no Norwegian version. I’m usually trying to find the colloquial English name(s) for wildflowers in Norway.
languagehat says

October 27, 2018 at 8:25 am

I can save you time — we call them all “flowers.”
AJP Crown says

October 27, 2018 at 9:19 am

Another city boy. “Yellow things” is a popular local name.
JJM says

October 27, 2018 at 11:39 am

The unfathomable language of modern linguistics is what drove me away from it. That and two other tendencies: those in the field seem increasingly prescriptivist about their descriptivism (ha!) and there’s far too much politics infused in their discussions now (don’t get me started on sociolinguistics!).

Let me leave things right there with this quote from a linguist, none other than Dr John McWhorter:

“Another reason for the gulf between the public’s and linguists’ perceptions of what language is like and how it ought to be is the ever-increasing gulf between academia and the public it was intended to serve. This is unfortunately as typical of linguistics as it is of other fields. Whereas a century ago, even academic papers on linguistics were often accessible to the general reader, today’s linguistics is couched in a dense jargon impenetrable to anyone outside the field (or often, subfield), published in journals unknown outside of linguistics, and summarized in staggeringly expensive books sold only to university libraries […] Like modern literature professors, historians, philosophers, and anthropologists, modern linguists talk almost exclusively to themselves…”

Mario Pei – where are you now that we need you?
D.O. says

October 27, 2018 at 12:10 pm

Well, without commenting on the quality or politics of linguistics research (of which I know nothing), it is unreasonable to require the bleeding edge research in any advanced science to be directly accessible to non-experts. There are popularizers to serve the need. I think we discussed it someplace: should active researchers write popular articles/books on the side, or should professional popular science writers (like Martin Gardner) take the load? And maybe both.
John Cowan says

October 27, 2018 at 12:12 pm

We have some fine popularizers right here at this blog. In any case, some subfields of linguistics are far less impenetrable than DiaChom, like dialectology and typology.
David Marjanović says

October 27, 2018 at 12:25 pm

Lies! Lies upon lies! Constraints in Optimality Theory are named in large and small capitals, without spaces, boldface or italics.
bulbul says

October 27, 2018 at 12:33 pm

This Ken Miner sounds a lot like Paul Postal…

JJM,

the McWhorter quote is just another proof that McWhorter is full of shit.
First, the existence of a jargon impenetrable to outsiders is a fact of every specialized human endeavor. Would he raise the same objection to mathematics, medicine or even the culinary arts? The complaint about journals unknown outside of linguistics is just a corollary to this: guess what, human knowledge has grown and therefore become very specialized.
Second, a century ago, the field of linguistics barely existed; I will happily defer here to any expert on the subject, but back then, most of what we would call linguistics was lumped with philology or philosophy and, more importantly, many of the concepts we take for granted today did not even exist. Just to give you an example from my area of expertise, the information structure concepts like topic and focus were essentially invented by Vilém Mathesius in the early 1900s. He did not do so ex nihilo, since he based his theory on earlier works by Weil and von der Gabelentz (who both were more philosophers than linguists), but still.
Third, the notion of a “general reader” is bullshit. The kind of people who read academic papers were – and are – typically well educated and back then, such education invariably included the study of Greek and Latin and thus increased familiarity with the fundamentals of contemporary linguistic terminology. These days, even well-educated people are incapable of grasping such fundamental concepts as the difference between orthography and sound, let alone understand more complex ones.* And I wager that even those people would find such works as Sweet’s 1890 A Primer of Phonetics full of impenetrable jargon.

The real problem with modern linguistics is that it is largely dominated by one particular paradigm (lampooned in the satirical piece that is the subject of this post) which is, not to put too fine a point on it, full of shit. Not all shit, mind you, just full of it, everywhere, from the terminology it uses which seeks to emulate that of mathematics and physics, to the very core of it, both of which CHANGE ALL THE FUCKING TIME, both in time, as well as between individual users. Hell, even Universal Grammar is now out the window. Funnily enough, here McWhorter is a little bit right, but directs his critique at the wrong people: it’s not all linguists who talk almost exclusively amongst themselves, far from it – we talk to a lot of people from a lot of other fields. The problem is that generativists or ‘theoretical linguists’ or whatthefuckever they call themselves these days only talk to each other in a language only they understand and they are rarely interested in talking to linguists. And this, in turn, betrays the fact that unlike the rest of us, they are only interested in their own theories, not language(s) as such.

* I am not complaining about how education is not what it used to be and kids these days are dumb af, just pointing out the difference in the curricula of schools back then and now.
languagehat says

October 27, 2018 at 12:49 pm

I stand with my co-curmudgeon bulbul.
bulbul says

October 27, 2018 at 1:11 pm

I am actually slightly less curmudgeonly these days, but McWhorter really irks me. And this particular example is quite egregious, because a) it is from a book that actually makes a good and very important point and b) the passage contains this bit (omitted by JJM):

Gone are the days when linguists and other academics regularly offered the public books like Mario Pei’s The Story of Language and his many others, Robert A. Hall’s Leave Your Language Alone, Frederick Bodmer’s The Loom of Language, and Charlton Laird’s The Miracle of Language in lucid and engaging prose for people to read on the train or before bed.

The book was published in 2001, so, say, 2 years after Pinker’s “Words and Rules”, 7 years after Pinker’s “The Language Instinct”, 3 years after “Language Myths”, 6 years after “The Cambridge Encyclopedia of the English Language”, 21 years after Lakoff’s “Metaphors We Live By”, 14 years after Lakoff’s “Women, Fire, and Dangerous Things” and a host of other books on linguistics aimed at the, um, general reader. So McWhorter is either ignorant or – more likely – bullshitting in the traditional huckster manner (“There is nothing that even comes close to this book, buy it, recommend it to everyone!”). In either case, ugh.
Etienne says

October 27, 2018 at 3:33 pm

Bulbul, Hat-

I partly (dis)agree with you both.

The real problem with modern linguistics, to my mind, is NOT “that it is largely dominated by one particular paradigm”: its real problem is that a majority of so-called linguists no longer care about languages at all: all they care about is some paradigm or other (typically the one which was the “theoretical basis” for their dissertation); actual languages (to the extent that any are at all relevant) are simply selectively mined as sources of (occasionally accurate) data to shore up said paradigm.

A history of twentieth-century linguistics, I think, would need to trace the transition between the early twentieth century, when linguistic descriptivism reigned supreme and theories were simply tools to be used for the purpose of describing languages, and the late twentieth century, when theory reigned supreme and linguistic data were simply irrelevant unless they were of use to some theoretician.

Thus, while I agree with Bulbul that part of the reason why linguistics is more opaque to the outsider than was the case a century ago is indeed because our knowledge has grown a great deal, I think that another part of the story relates to the ever-growing irrelevance of linguistic data to linguists today: it is more difficult to describe linguistics to outsiders than before because mainstream linguistics is basically theory-driven now, and thus is less understandable to outsiders (who naturally know nothing about different theoretical fads –err, I mean schools– within the field) than more data-driven subfields, such as historical linguistics, dialectology or typology.

And this seems typical across the humanities: I have lost track of the number of historians or archeologists I have spoken to who quite simply did not have the knowledge/data I was looking for, but who were VERY conscious of the various theoretical currents within their field. Perhaps as an outsider I missed some nuance(s) or other, but the theoretical disputes in these other fields likewise seemed unrelated to anything so hopelessly…GAUCHE as actual, verifiable…facts (gasp! That word again!).
languagehat says

October 27, 2018 at 3:57 pm

You’re not disagreeing with me at all (I can’t speak for Herr Doktor bulbul) — I absolutely agree about the overemphasis on theory and undervaluing of facts these days. Of course, theorizing is easy and fun, whereas facts are stubborn things.
SFReader says

October 27, 2018 at 4:03 pm

I often wondered what the geographers do nowadays after everything on Earth has been discovered.

Perhaps they just invent and debate new theories.

What else is left for them to do anyway?
David Eddyshaw says

October 27, 2018 at 4:07 pm

You know there are people who read Playboy for articles.
I am like that – I read linguistic books for examples from various exotic languages.

SFReader has nailed the problem: the bits of theoretical linguistics which float free from actual language facts are basically pornography.

I wonder if anyone has yet done the neurophysiological studies proving that the same parts of the brain are active in engagement with both? (Perhaps they have, and the results have been suppressed …)

[I note that the link bulbul gives to a discussion about NC not actually believing in UG any more, references an interview with the linguist Jessica Coon in … Playboy.]
bulbul says

October 27, 2018 at 5:53 pm

Etienne,

I think that another part of the story relates to the ever-growing irrelevance of linguistic data to linguists today
There’s a bunch of problems with this statement, starting with the definition of “linguists” and “today”. The thing is, it’s the generativists who abhorred empiricism from the outset, relying on pure introspection (as to why and wherefore, that’s another discussion). But there have always been those who collected data and even let their theory be informed by this – the late Ken Hale and his study of Warlpiri is one early example and as Martin Haspelmath notes, many generativists actually do descriptive work, except they always subordinate it to their theory. And recently, they have even taken to using corpus data, although mainly for diachronic work. There are those to whom data, or indeed languages as such are irrelevant, but there the question is whether they even are linguists rather than… something else. A comparison has been made here to the distinction between theoretical physics (= generativists) and other types of physics. I find such comparisons dubious, but then again, in light of the recent developments in theoretical physics, it might even be apt, in that both are full of shit.
So no, I don’t think your statement regarding “ever-growing irrelevance of linguistic data to linguists” is true, in fact with the rise in computational linguistics et al, the converse may even be true. Of course, this discussion is meaningless unless we come up with hard numbers. So what do you want to measure and how? Hopefully not Lingbuzz, because that is a cesspool of generativist ramblings. (Full disclosure: there is a paper there by me and a bunch of other papers by people I respect and like)

it is more difficult to describe linguistics to outsiders than before because mainstream linguistics is basically theory-driven now,
Again, what do you mean by “mainstream linguistics”? You seem to juxtapose “mainstream linguistics” with “more data-driven subfields”. This sort of framing reminds me of how generativists equate their subfield – the so-called “theoretical linguistics”, pardon me while I barf – with the field as a whole. Please don’t make the same mistake. Look at the catalogue of Language Science Press, for example, which books there would fall into the mainstream and which are outside of it?
And as for the difficulty of describing linguistics to outsiders, again, it depends on the outsider and what type of linguistics are we talking about. Phonetics, phonology, historical linguistics and dialectology are quite easy for your average person to grasp, if you come up with a few examples. Morphology can be a little more difficult, syntax is, admittedly, tricky, but that also depends on how you describe it. Computational linguistics is also quite easy, except people ALWAYS bring up Google Translate, and discourse analysis and speech acts can be fun to explain to lawyers, as I recently found out. Of course, dumb motherfuckers be dumb motherfuckers.
John Cowan says

October 27, 2018 at 7:19 pm

Geography is what We Computer Programmers call a cross-cutting concern: it covers everything from planetary science to ecology to oceanography (a field in which by no means has everything been discovered) to meteorology to economics to demography to religion to transportation to urban planning to sociology when viewed spread out in space. In the same way, history is the cross-cutting concern that views things when spread out in time: everything that varies with time has a history, and everything that varies with location “has” (though the idiom is not usual) a geography, and indeed there are subtypes of geographers who study all the things I have mentioned above (which come from the WP article) and many more.
AntC says

October 27, 2018 at 7:46 pm

geographers … What else is left for them to do anyway?

https://xkcd.com/2061/

https://xkcd.com/2058/
David Marjanović says

October 27, 2018 at 8:57 pm

Yay, Hossenfelder! 🙂

I often wondered what the geographers do nowadays after everything on Earth has been discovered.

Mostly, geography has moved over to the humanities and studies why people in different places live in (however subtly) different ways.

…which is why it was a rather silly decision by the University of Vienna to move it into the geology building lo these 20 years ago.
David Marjanović says

October 27, 2018 at 9:02 pm

2061: 840 ppm CO₂ is scary. We haven’t had that much since the Paleocene-Eocene Thermal Maximum, which is exactly what it sounds like and happened 55 million years ago.

2058: Bits of mantle sometimes end up on the surface when a midocean ridge meets a subduction zone. Some kinds of lava come from very deep down. The rest is seismics.
Bathrobe says

October 27, 2018 at 9:03 pm

In Open-Mindedness, a defender of GG makes this point:

This program now stretches over about 60 years. In that time GGs have made many many discoveries about the structure of particular NLs (e.g. Irish has complementizers that signal whether a wh has moved from its domain, English lowers affixes onto verbs while French raises verbs to affixes, Chinese leaves their whs in place while English moves them to the front of the clause, etc.)

I was flabbergasted. I don’t know anything about Irish, but number 2 just seems weird, an artifact of the theory not of language. But here it is: The Verb Raising Parameter. It still doesn’t make any sense.

As for Chinese, all I can say is duh!
languagehat says

October 27, 2018 at 9:11 pm

The Ptolemaic system now stretches over many centuries. In that time astronomers have made many many discoveries about deferents and epicycles. Let me tell you the latest about the equant point!
mollymooly says

October 27, 2018 at 9:44 pm

Geographers were interdisciplinary before it was cool
David Marjanović says

October 27, 2018 at 9:54 pm

The Verb Raising Parameter.

tl;dr

I tried, though, and gained the impression that “English lowers affixes onto verbs while French raises verbs to affixes” is a confusion of synchrony with diachrony.

As for Chinese, all I can say is duh!

Yeah, that one is rather glaringly obvious to any theory-free learner. “Discovery” is not the right term; GG didn’t even columbus it.
Bathrobe says

October 27, 2018 at 10:32 pm

Substitute “theoretical linguist” and “field linguist” for “geographer” and “explorer” respectively, and this little exchange seems eerily apt for modern linguistics:

“I am a geographer,” the old gentleman said to him.

“What is a geographer?” asked the little prince.

“A geographer is a scholar who knows the location of all the seas, rivers, towns, mountains, and deserts.”

“That is very interesting,” said the little prince. “Here at last is a man who has a real profession!” And he cast a look around him at the planet of the geographer. It was the most magnificent and stately planet he had ever seen.

“Your planet is very beautiful,” he said. “Has it any oceans?”

“I couldn’t tell you,” said the geographer.

“Ah!” The little prince was disappointed. “Has it any mountains?”

“I couldn’t tell you,” said the geographer.

“And towns, and rivers, and deserts?”

“I couldn’t tell you that, either.”

“But you are a geographer!”

“Exactly,” the geographer said. “But I am not an explorer. I haven’t a single explorer on my planet. It is not the geographer who goes out to count the towns, the rivers, the mountains, the seas, the oceans, and the deserts. The geographer is much too important to go loafing about. He does not leave his desk. But he receives explorers in his study. He asks them questions, and he notes down what they recall of their travels.”
Norvin says

October 27, 2018 at 11:07 pm

“English lowers affixes to verbs, while French raises verbs to affixes” can be thought of as an observation about the distribution of adverbs in the two languages; looking just at sentences without auxiliaries in them, English allows you to put adverbs between the subject and the verb, but not between the verb and the object, while in French it’s the other way around (so you can say “John often speaks French” but not “John speaks often French”, unless you are speaking French).
Bathrobe says

October 27, 2018 at 11:34 pm

Perhaps this is more relevant:

https://gawron.sdsu.edu/syntax/course_core/new_slides/9.1-Headmovment.pdf

I am still trying to figure out whether playing with tree diagrams makes explanatory sense. Or is it just playing round with tree diagrams…
David Eddyshaw says

October 27, 2018 at 11:54 pm

In fairness to the article linked by David M on the “Verb Raising Parameter”, the footnotes are pretty candid; while this was presumably not the intention, they actually reveal pretty clearly the extent to which the main article depends on cherry-picking its data (e.g. “For some reason, negation cannot participate in negative inversion in Danish, perhaps because it cannot bear prosodic stress.” “We do not consider verb-final languages like German or Dutch.”)
John Cowan says

October 28, 2018 at 1:10 am

In any case, adverbs can and do appear between a verb and its object in English. Here’s the New York Times quoting Tony Blair, certainly a native speaker, on the police killing of a Brazilian electrician mistaken for a terrorist:

“We are all desperately sorry for the death of an innocent person and I understand entirely the feelings of the young man’s family, but we also have to understand the police are doing their job in very, very difficult circumstances,” Blair said.

I have found a number of other examples of understand entirely + NP; many of them are from non-native speakers, but some like this one are not.
ktschwarz says

October 28, 2018 at 1:17 am

What the geographers do nowadays after everything on Earth has been discovered: Well, things change and new things appear: new cities, highways, river channels, seamounts.

Likewise, when geodesists agreed on the best-fit ellipsoid for the Earth as a whole in 1984, it wasn’t the end of geodesy. (The slight difference between WGS84 and previous datums is why your phone reports longitude 0°0’5″ West at the Greenwich Observatory, instead of zero.) Since then they’ve been watching the Earth’s crust move, since structures on land and continental shelf need coordinates in datums tied to their continental plates. Australia, the fastest-moving continent at about 7cm/year, has already defined a new datum to replace the one it defined in 1994. The North American Datum of 1983 is due for replacement in 2022.

(Linguistic note: “datums” is the plural of geodetic “datum”.)
Norvin says

October 28, 2018 at 1:45 am

John Cowan–right, I was oversimplifying. Examples like the one you’ve found are referred to in the literature as ‘heavy NP shift’, and seem to be possible only with sufficiently large objects: you can say “I understand entirely the feelings of the young man’s family”, but not “I understand entirely them”–and for me, at least, even something like “I understand entirely their feelings” is pretty bad. Nothing similar happening in French, where you can say the equivalent of ‘I speak often French’, which is not natural in English.
Norvin says

October 28, 2018 at 1:49 am

David Eddyshaw: I think it’s unreasonable to regard “We do not consider verb-final languages like German or Dutch” as cherry-picking, if you look at it in context. What they’re doing is discussing a claim that verbs move to structurally higher positions in the sentence. In languages in which the verb is not final, this will have the consequence, sometimes, of moving verbs past adverbs, and so you’ll get to see the effects of the movement. In languages in which the verb is final, the movement wouldn’t have any effect on the word order, and would be difficult to detect. They’re just declaring their intention to concentrate on the cases where the facts are easy to figure out, and to leave the difficult cases aside for now.
bulbul says

October 28, 2018 at 3:34 am

Norvin,

you say:
I think it’s unreasonable to regard “We do not consider verb-final languages like German or Dutch” as cherry-picking, if you look at it in context.

And then like six lines later:
They’re just declaring their intention to concentrate on the cases where the facts are easy to figure out, and to leave the difficult cases aside for now.

So if I understand your argument correctly, they take what is easily accessible and ignore the rest for now, kinda like one would take, oh I don’t know, the cherries from the top of the cake?

“English lowers affixes to verbs, while French raises verbs to affixes” can be thought of as an observation about the distribution of adverbs in the two languages;
Well, sure. But since when is the generative enterprise concerned with statistics?
bulbul says

October 28, 2018 at 3:55 am

Bathrobe reminded me of this list of purported achievements of generative syntax. To be fair, some of them are quite interesting, although – as pointed out above – they don’t even qualify as discoveries even in the Columbian sense. E.g.

Grammatical Subject [There is a distinction between grammatical subject and thematically highest argument (though traditional subject diagnostics may decompose even further)]: Chomsky (1965)

This is something Vilém Mathesius described in 1939, except he did so in Czech, so, you know, doesn’t count.

Others might make some sense if one were willing to wade through the jargon, e.g. “Principle C [an R-expression can’t be bound by (systematically corefer with) a c-commanding pronoun]”; I, for one, have better things to do with my life and I also have a suspicion that if they really thought about it, they would end up with just straight up dependency syntax.

The worst part, however, is all the hedging bullshit. Remember, these are people who wanted to put linguistics on the same footing as mathematics (Chomsky 1965), people who set out to search for Principles and Parameters. And what do they have to show for it? Bullshit like this (emphasis and square bracket comments mine):

– All SOV languages allow a degree of word order freedom (scrambling) [Well, how much and under what conditions?]
– Unbounded dependencies preserve case, agreement, and binding configurations, and do not (normally) feed A-positions [So when do they?]
– It is relatively difficult to embed head-final projections in head-initial ones… [What does ‘difficult’ even mean here, under what specific conditions?]
– High (preverbal) subjects are more difficult to extract than low (often postverbal) subjects in a class of cases. [See above]
bulbul says

October 28, 2018 at 4:27 am

John,

In any case, adverbs can and do appear between a verb and its object in English.
Oh don’t be silly, it’s quite obvious this is just some performance error.
Bathrobe says

October 28, 2018 at 4:31 am

verbs move to structurally higher positions in the sentence

This only makes any sense if you (1) accept tree diagrams as the ideal model for representing human language and (2) accept a certain vintage of Chomskyan theory as the ideal model for tree diagrams. If either of these is shown to be false, the whole enterprise collapses.

Since Norvin seems to be pretty familiar with GG and trees, I would be interested to hear his views (with supporting evidence) on the suitability of trees (or phrase markers) for representing human language and, in particular, the suitability of this particular model of tree diagram. And please, no ‘heavy NP shifts’. That is precisely what is being parodied in the linked piece. The people here are not neophytes eager to embrace the latest theory; they are people who know a lot about language but are sceptical of theories and constructs that do not seem to be grounded in actual language.
AJP Crown says

October 28, 2018 at 5:13 am

The Loom of Language
Now we have a title, someone can begin writing a biography of our host. But in English; verb-final languages like German or Dutch we do not consider.
January First-of-May says

October 28, 2018 at 8:22 am

Examples like the one you’ve found are referred to in the literature as ‘heavy NP shift’, and seem to be possible only with sufficiently large objects

Presumably Roman Jakobson was being abnormally free with English word order (possibly due to contamination from his own native Russian?) when he uttered his famous statement I do respect very much the elephant…
David Eddyshaw says

October 28, 2018 at 9:02 am

Norvin is basically quite right about heavy shift as a thing. It’s by no means confined to English, come to that. I don’t know how the GG people fit it into their system, though Norvin may well do.

I do respect very much the elephant does strike me as unlikely to be the utterance of a native speaker, at least in the context of Jakobson’s original remark; it would be OK if it went I do respect very much the elephant which realises that it is not qualified to be a professor of zoology just because it’s an animal, which would be heavy shift, of course.

I would also concede that Norvin has a point re “verb-final languages like German or Dutch” at least in principle; it is indeed reasonable scientific practice to start with the presumably simple. But this awarding of yourself a raincheck means that you do have an obligation to then go on to find out whether your principles deduced from simple (not to say “toy”) cases are actually robust in the face of more complex data. Has this been done? (Not a rhetorical question – I’d like to know.)

Having played nice, I feel entitled to snark: the verb-raising article says:

We know of no rich agreement languages with tense lowering in the morphology.

To which the immediate reply has to be “exactly how many languages have you looked at? and from how many historically unrelated families? and have you actively looked for counterexamples?” “We know of no ….” just doesn’t cut it.
bulbul says

October 28, 2018 at 9:10 am

Just out of curiosity, I picked up some of the books McWhorter speaks so highly of and oh boy. “The Loom of Language”, for example, contains a chapter titled “The Diseases of Language” which contains this passage (p. 414 of the 1949 edition):

The earliest recorded form of Slavonic is Old Bulgarian, into which two Greek missionaries, Kyrillos and Methodos, both from Salonika, translated the Gospels in the middle of the ninth century. This Bible language, also called Church Slavonic, became the official language of the Greek Orthodox Church. It still is. Since the art of writing was then the exclusive privilege of the priest-scribe class, Church Slavonic also became the secular medium of literature. … As a hangover from their church-ridden past, citizens of the U.S.S.R. still stick to “Kyrilliza”, a modified form of the Greek alphabet … once current in Byzantium. The Poles and Slovaks – but not the Serbs or Bulgarians – are free from this cultural handicap.

And then later on the same page:

Meanwhile in Russia, as elsewhere, Slavonic languages constitute a fossil group from the grammatical standpoint. They preserve archaic traits matched only by those of the Baltic group. Noun-flexion, always a reliable index of linguistic progress, is not the least of these. Slavonic languages carry on a case system as complicated as that of Latin or Greek, Bulgarian alone has freed itself from this incubus.

Holy shit.
J.W. Brewer says

October 28, 2018 at 9:24 am

1. FWIW, The Loom of Language is one of the books (found by me lying around the relevant section of a not-super-sophisticated-or-up-to-date U.S. public library in the late ’70’s) that helped pique my early interest in linguistics, and the same could well have been true of McWhorter, who is the same age as me and grew up barely 30 miles away from me. That it may be erroneous in matters of detail and/or reflect a prior scholarly consensus which was no longer the scholarly consensus even as of the now-long-ago date of its publication is just par for the course for popularizing works.

2. I try to get this thread off on the right foot with a remark about Esperantists, but everyone just wants to pile on Chomskyans again…
AntC says

October 28, 2018 at 9:28 am

This program now stretches over about 60 years. In that time GGs have made many many discoveries about …

Apologies for a dumb question/dragging this discussion (too arcane for me) back to the beginning of that 60 years:

Mary is eager to please. vs
Mary is easy to please.

Is IIRC a (nearly) minimal pair illustrating why we need to differentiate deep structure vs the surface syntax you might get from a Markov model or phrase-structure grammar. (Mary being the agent or the patient of the pleasing.)

Is Chomsky’s claim (in Syntactic Structures) correct that systems of grammar at the time couldn’t characterise the difference as syntactical?

Without field workers being attuned to such differences as syntactical rather than lexical: would they merely have produced more clumsy descriptive grammars; or would they fail to observe certain regularities in their subject languages?
David Eddyshaw says

October 28, 2018 at 9:31 am

The following assumption seems to be key to the whole argument.

in languages like French, the future tense is synthetic, yet semantically equivalent to its analytic English counterpart.

(Bolding mine.)
This is straightforwardly false.

@JWB:

I approve of your (doomed) thread-directing effort.
Those Esperantists, eh?! No wonder they couldn’t create a viable auxiliary language, when Universal Grammar had not yet even been discovered!
SFReader says

October 28, 2018 at 9:39 am

Keep hearing about this McWhorter fellow a lot on this blog, so finally looked him up on WP.

It turns out he wrote many books, but the only one I’ve read is “A Grammar of Saramaccan Creole”.

Very nice grammar by the way – lots of glossed examples, natural sentences (not made up) and the language is beyond fascinating – it’s unbelievable how utterly alien a variety of English can become in just three centuries.

Don’t remember anything else in the book – I just tend to skip all theoretical explanations anyways, because any important piece of grammar can be figured out from examples, surely.
David Eddyshaw says

October 28, 2018 at 10:15 am

McWhorter is most definitely on the side of the angels (as is anybody, in my book, who has written a decent descriptive grammar.) His polemical stuff, as bulbul rightly says, is unfortunately somewhat scattershot, but it’s not that there isn’t anything to be polemical about. Just have a look at any issue of Language and reflect on the fact that one of the first editors was Edward Sapir.

The theoretical bits in McWhorter’s Saramaccan grammar are actually well worth reading; it’s the sort of (indispensable) theorising that searches for understanding of the phenomena. (In particular, the tone system of Saramaccan is agreeably strange, and I don’t think it’s actually possible to describe it properly without a good bit of theorising about what tone systems can be like.) Nobody could accuse McWhorter of picking his data to fit a cherished set of theoretical preconceptions (the unforgivable sin of any scientific endeavour.)
David Marjanović says

October 28, 2018 at 10:27 am

the article linked by David M on the “Verb Raising Parameter”

That was Bathrobe just above my comment.

Here’s the New York Times quoting Tony Blair, certainly a native speaker, on the police killing of a Brazilian electrician mistaken for a terrorist:

That looks to me like it wasn’t planned: Blair decided to insert entirely after he had already said understand. I predict that the intonation was more like I understand – entirely! – the feelings. With a shorter object, I agree, he would have waited till after the object: I understand them entirely.

Holy shit.

Slightly worse than the German translation I read long ago, which turned “incubus” into “ballast”.

I found the book quite interesting in that it gave a broad overview about language families I had barely heard of, and likewise about typological phenomena and their possible relationships. But, yes, facepalm-inducing judgments everywhere.

Is Chomsky’s claim (in Syntactic Structures) correct that systems of grammar at the time couldn’t characterise the difference as syntactical?

I wonder if there are ergative-absolutive languages where the agent of “is eager to please” goes into the ergative. Neither syntactical nor lexical, then, but morphological?
David Eddyshaw says

October 28, 2018 at 10:47 am

One doesn’t need to get very exotic in looking for languages which construe “eager to please” differently from “easy to please”; in Latin, as in English, the constructions and meaning depend on the adjective in question.
From Gildersleeve and Lodge:

Puer studiosus est legendi. “The boy is zealous of reading.”
Aqua nitrosa utilis est bibendo. “Alkaline water is good for drinking.”

Chomsky’s “discovery” is surely just an artefact of his a priori doctrine that syntax is autonomous.
David Eddyshaw says

October 28, 2018 at 11:08 am

The altogether-awesome Cambridge Grammar of the English Language discusses stuff relevant to the “eager to please”/“easy to please” thing (pp. 1268ff), pointing out that retrieving the subject of the subjectless non-finite “to please” is a special case of anaphora, and is usually determined by semantic inference, except in the case of “control” (e.g. “this made it easy to understand”) which is defined syntactically, though “strongly motivated by the semantics.”
Bathrobe says

October 28, 2018 at 11:08 am

I assume the spiel on Mary’s pleasing and being pleased was the same as that on the passive. The strict structuralist approach according to Chomsky was “just the facts, please, we don’t do meanings and we approach all grammar as a straight distribution exercise”. So:

John pleased Mary (insert tree diagram)

Mary was pleased by John (insert tree diagram)

were treated as totally unrelated structures with no intimation that the two sentences (not John and Mary) are intimately related. Indeed, Chomsky argued that both are derived (via transformation) from the same deep structure. I’m not sure how revolutionary this was given that Harris had already proposed something like transformations — and traditional grammar maintained there was a connection all along.

Incidentally, I continued reading Bulbul’s link to the paper on “openmindedness”, a practice that Norbert regarded as very bad if it made people doubt the truths arrived at by generative grammar. It was almost scary to read.
David Eddyshaw says

October 28, 2018 at 11:34 am

Upon mature reflection (huh!) it occurs to me that I have probably neatly missed Chomsky’s actual point, which was presumably (I haven’t read Syntactic Structures, and you can’t make me) that “easy to please” vs “eager to please” is not in fact a counterexample to his doctrine that syntax is autonomous but can be accommodated within it if you accept all his exciting transformational paraphernalia (since jettisoned by the man himself.)

This affects me rather like the elaborate treatment of the Latinate elements of English in The Sound Pattern of English, which sets up a really rather wonderful set of abstract underlying forms while scrupulously avoiding any mention of English orthography and how all literate speakers have been accustomed since childhood to render it into sound. It’s really deeply impressive, but why would you ever want to do that, when a much more natural explanation is sitting there right in front of your eyes?
languagehat says

October 28, 2018 at 11:39 am

Gotta make a name for yourself, get tenure, épater le bourgeois and écraser l’infâme, and make yourself one of the world’s best-known scholars. Then, my God how the money rolls in!
David Eddyshaw says

October 28, 2018 at 12:13 pm

@Bathrobe: too damn right about “scary.”

However, I think we’re dealing with an anti-Chomskyan infiltrator tasked with undermining the Chomskyans’ intellectual credibility:

I don’t much care about language. I care about FL and it’s structure. That’s what GG studies.

and

So, yes Scandinavian is interesting. I doubt the data as described, but I may be wrong. But, whatever the correct description, it will only make the problem harder, so I suggest for theoretical purposes we ignore it for now.
Lars (the original one) says

October 28, 2018 at 1:19 pm

@David Eddyshaw, I am not sure what difference “control” is making there: as far as I can see, “This made Tom eager to please” vs “This made Tom easy to please” still depend wholly on semantics to figure out what role Tom has in the pleasing. (I realize it’s not your claim, but I do not have access to Huddleston and Pullum (now cheaper in hardcover than on Kindle!).)
Rodger C says

October 28, 2018 at 1:36 pm

Even that odd obiter dictum from Bodmer doesn’t approach, for sheer infuriating stupidity, Charlton Laird’s repeated expressions of contempt for the speakers of Spanish. And yet I also learned from The Miracle of Language a lot of neat stuff that encouraged further learning (along with a number of odd bits of misinformation, such as that the Old English w-letter was called wen–was his OE teacher a Southerner?).
SFReader says

October 28, 2018 at 2:01 pm

Given Bodmer’s opinion of Kyrilliza, I wonder what he had to say about Chinese characters or Arabic script.
David Eddyshaw says

October 28, 2018 at 2:13 pm

@Ur-Lars:

My fault, not CGEL’s, for sheer stupid carelessness. I just mechanically copied the wrong example (which isn’t an instance of “control” anyway.) I don’t think control is actually relevant for the constructions with adjectives at all; CGEL’s examples are e.g.

Kim wants to enter the competition/Kim wants me to enter the competition.

It just comes up in their discussion of the criteria for identifying the “missing subject” of the subjectless non-finite, as the exception to their statement that those are basically semantic.

CGEL is well worth the money. It’s even cheaper than the paperback of Miss MacIntosh, My Darling.
David Marjanović says

October 28, 2018 at 3:13 pm

A list of purported achievements of generative grammar in the same thread. Alas, I don’t understand them the way they’re presented.
bulbul says

October 28, 2018 at 3:19 pm

Regarding McWhorter: he really is excellent when he writes about stuff he knows, like in the aforementioned “A Grammar of Saramaccan Creole”, or in “The Missing Spanish Creoles: Recovering the Birth of Plantation Contact Languages” or even in “Defining Creole”, although that one is somewhat problematic, but within the usual confines of scholarly debate. That McWhorter is fine. McWhorter the pundit, on the other hand, really is an idiot. Consider this load of garbage, which features the following:

Arabic, again, isn’t easy, and Russian, spoken by countless millions, is so horrifically complex that part of me always wonders whether it is an elaborate hoax.

I hate to repeat myself, but ugh.
bulbul says

October 28, 2018 at 3:24 pm

David,

the list you link to is where the one I linked to was born 🙂

SFReader,

Bodmer spends a lot of time on Chinese and seems very well informed. He actually speaks highly of the Chinese script, noting – as many have – its advantages for a language with a lot of homophones and a lot of varieties. No similar comment is offered on Arabic.
David Eddyshaw says

October 28, 2018 at 3:43 pm

Interestingly, the #1 “Achievement”, “Island Effects”, is the very one where our correspondent advocates simply ignoring counterevidence from the Scandinavian languages.
Lars (the original one) says

October 28, 2018 at 3:45 pm

Hossenfelder — what is that all about? Sure, theoretical physics regularly throws up very pretty (symmetrical, group theory based, minimal) theories of everything that is (and more), but from what I see they spend just as much time shooting them down in flames by finding actual facts of nature that prove them wrong. Which is not entirely unlike the exact opposite of what we accuse GG of.
bulbul says

October 28, 2018 at 3:47 pm

Re David M’s analysis of Blair’s “understand entirely”: I concur. I went looking for the exact same construction and found 2:

“I understand entirely the concept you’re saying.”
COCA – spoken text, NPR Science; the speaker is identified as Charles Arntzen.

“They discussed between themselves what the best approach would be to allow the Admiral to understand entirely every significance contained in their report…”
Araneum Anglicum Maius (a web corpus, the site this is taken from is dead)
January First-of-May says

October 28, 2018 at 4:04 pm

Russian? Horrifically complex? Seriously?

It’s got nothing on Latin and Greek, and little on Finnish. And that’s only if we limit the comparison to European languages.
languagehat says

October 28, 2018 at 4:07 pm

Yeah, he should try Georgian on for size sometime.
Etienne says

October 28, 2018 at 4:16 pm

Bulbul (on your 5:53 comment):

Alas, hatred of empiricism/data is a disease which is most assuredly not confined to generativism within linguistics: this discussion right here at the hattery-

http://languagehat.com/those-darn-biologists-again/

-shows all too clearly that allergy to reality, within linguistics, is unconnected to the influence of the GGG (=Groovy generativist guru, as a fellow student of mine liked to call him). As the thread discussion should make clear, the mathematical model used to solve a problem (the spread of Indo-European) is more important to the “researchers” than whether their results actually correspond to historical reality.

I have seen the same thing in variationist sociolinguistics: while Labov is a fine scholar, a vast number of his followers have built careers upon arcane debates involving the interpretation of animal entr-err, I mean, of variable features in some language/dialect or other, the study of which had all too clearly become an end in itself, whatever its original (alleged?) purpose might have been. That is to say, any data that has NOT been recorded/studied by a variationist is treated as simply irrelevant: here again, what had once been a tool to study (some aspect of) language has become an object of study in and of itself.

I used to think that this complete empirical disconnect in various branches of linguistics was due to the fact that grant writers deliberately kept things vague in their project description in order to keep the gravy –err, I mean research funds– flowing: if it was made explicit in the grant form that such-and-such a finding would demonstrate the vacuity of the research being funded, well, that would make the funding far more liable to be cut than a project whose description does not include any clear criteria whereby it could be shown to be false.

And while I still believe the above is part of it, I have met and spoken with so many otherwise intelligent and perceptive scholars who had so thoroughly internalized the principle that facts are irrelevant, who quite literally could not understand how data could prove a theory false, that it is now clear to me that there is a much deeper dynamic at work.

As I have already mentioned here at Casa Hat, I’m a big admirer of the work of John Michael Greer, and he has repeatedly hammered the point that declining civilizations (such as ours) become ever-more enamored of the abstract over the concrete: he discusses this here-

https://www.resilience.org/stories/2008-10-16/flight-abstraction/

-and I suspect that linguists and indeed scholars, in general, are like economists: they have come to regard systematic engagement with facts as something beneath them, as something incompatible with a serious science.
TR says

October 28, 2018 at 4:26 pm

What I wish someone would explain to me is what the ontological status of trees, movement and so on is supposed to be. Are they supposed to stand for actual entities and processes in the brain? If so, what’s the evidence for that? If not, what is their explanatory value? There doesn’t seem to be a consensus on the answer even among generativists: I once heard Mark Hale say that of course movement is a metaphor because otherwise you could ask “How fast is it moving?” (he didn’t explain what it was a metaphor for); but when I asked Chris Golston this question he said of course it’s a real process in the brain and there’d be no point in studying it if it weren’t.
Brett says

October 28, 2018 at 4:30 pm

I resisted the urge last night to write something snarky about Sabine Hossenfelder. Because she is a very nice person as an individual, but as a physicist, she has also proved to be completely immune to irony. That she has written a book about this topic just further illustrates the point.
bulbul says

October 28, 2018 at 4:33 pm

Etienne,

the discussion … shows all too clearly that allergy to reality, within linguistics, is unconnected to the influence of the GGG
Well, sure. But you said “mainstream linguistics is basically theory-driven now” and this is what I objected to, and I still do. Unless you want to describe the kind of pseudo-computational* approach described in the discussion you refer to as mainstream linguistics, my point still stands. I, for one, would not even describe it as linguistics.

* I say “pseudo-” because any computational approach that does not take into account the old “garbage in, garbage out” principle is just wanking over code.
bulbul says

October 28, 2018 at 4:34 pm

Brett,

care to elaborate? Because insistence that someone mounting a critique of something just doesn’t understand irony/get the joke is a pretty common bullshit argument. Hic Rhodus, hic salta.
David Eddyshaw says

October 28, 2018 at 5:03 pm

what the ontological status of trees, movement and so on is supposed to be

It’s a significant symptom in itself that initiates seem either to think this is not a legitimate question at all, or that they are so evidently right that a physical basis (of some sort) for it all must inevitably appear. One day. Because it just must. And if it doesn’t, that’s just because biology is in a primitive state compared to generative grammar and just hasn’t caught up yet. We’re going to need a bigger microscope …

I suppose if you have committed yourself to constructing something so abstract that mere data from actual languages (or biology, for that matter) can be airily dismissed if they cause problems with the beautiful theory, you’re likely to have a pretty Platonist outlook. Or, to be less elevated, SFReader is correct and this is just pornography for the mind.
Brett says

October 28, 2018 at 5:16 pm

@bulbul: That’s not what I meant by “immune to irony.” In fact, I don’t think I have ever heard it used that way. How this expression is interpreted probably depends on what the reader’s default meaning of “it” is. Your interpretation seems more natural with the typical British use of “irony” to mean “sarcasm” or “satire.”

What I meant was that Hossenfelder has been a disciple and protege of Lee Smolin, involved in harsh criticisms of some theoretical programs, with no better results (or reason to expect better future results) from their own preferred approach. Smolin got some popular press for talking about how too many theoretical physicists were wasting time doing complicated calculations in theories that were just incremental extensions of the standard model of particle physics. Instead, he and others in his circle advocated more abstract theorizing, aimed at guessing the deepest principles underlying the physical world. Hossenfelder was part of this, including the group’s advocacy for Loop Quantum Gravity (which was supposed to be more physically grounded, yet somehow also more profoundly new, than string theory). In fact, both theories are equally divorced from anything realistically predictive.

One time, years ago, she posted something incorrect as a comment on a science blog, and I posted a couple paragraphs as a rejoinder, explaining why her expression for the velocity was wrong. She then wrote an entire paper, confirming my rebuttal with explicit calculations, yet somehow arguing at the end that her original interpretational viewpoint, which had initially led her to the wrong answer, was still correct, all the while criticizing string theory as untestable maundering.
David Eddyshaw says

October 28, 2018 at 6:06 pm

Actually, the overwhelming impression of the List of Achievements is “Sixty years – and that’s all you’ve got?”

I’m interested in #8: “Functional Material Doesn’t Incorporate [Higher functional structure such as determiners and complementizers doesn’t incorporate into superordinate lexical heads]”, because in Oti-Volta languages demonstrative determiners regularly compound with their lexical heads. I suspect that the (inaccessible) thesis cited as a source for this uses “incorporate” in some special technical sense which would make this quite irrelevant, but it caught my eye anyway. I strongly suspect on first principles that the author did not take much account of Oti-Volta data in framing this universal rule, however. Could be wrong …

Ultimately, I suppose it would mean nothing more than that Oti-Volta speakers share with Scandinavians the distinction of not speaking Human. It could be the result of all those Viking expeditions up the Niger river.
Etienne says

October 28, 2018 at 6:46 pm

Bulbul: I totally agree that the work discussed here at the Hattery which I linked to is pseudo-linguistics. But it is no less a specimen of pseudo-linguistics than most generative or variationist sociolinguistic “work”. And inasmuch as it was being trumpeted by the NEW YORK TIMES it is “mainstream linguistics”, in visibility at any rate. And unfortunately I have seen conference talks (including a couple of invited/plenary talks) applying mathematics to linguistics that were about as ill-informed, so I’m afraid I must treat it as a typical specimen of the genre.

On McWhorter as a pundit: I think that, as a native speaker of English whose area of specialty within linguistics is pidgin + creole languages, it is unsurprising that languages with plenty of inflectional morphology (you know, such as Arabic or Russian…) strike him as unusually complicated -and, let’s face it, as an Arabic scholar who is a native speaker of a Slavic language your bias, when it comes to evaluating the complexity of Arabic and Russian, is pretty much a mirror image of his.

Speaking of which, I’m curious: how difficult is Russian from the vantage point of a Slovak L1 speaker? The stress system must be a major hurdle: are there others?
ktschwarz says

October 28, 2018 at 7:15 pm

Re adverbs coming in between verbs and their objects, no need to look far: there’s one right up top in Miner’s post itself, “trained linguists who speak natively the languages they are examining”. That nobody pointed this one out until now is perhaps evidence that the construction really does sound natural. My naive belief that adverbs can’t go there has been shattered. Thanks to David Eddyshaw for that, and I agree with his judgments on “I do respect very much the elephant”.
David Marjanović says

October 28, 2018 at 7:18 pm

Bodmer spends a lot of time on Chinese and seems very well informed. He actually speaks highly of the Chinese script, noting – as many have – its advantages for a language with a lot of homophones and a lot of varieties.

The unusual number of homophones is an artefact of pronouncing Classical Chinese as modern Mandarin. The varieties are no better covered by the Chinese script than the Tibetan varieties are covered by the Tibetan orthography… not that I’d recommend the latter (basically like French, but on both ends of every syllable) as a model for anyone to emulate.

The German translation I read presents an early (1950s) draft of Pinyin. Is that already in the original? (The translation contains other substantive changes, like replacing the chapter about German with one about English.)
David Marjanović says

October 28, 2018 at 7:40 pm

From Greer’s Flight to Abstraction:

Vico’s argument is complex and difficult to summarize, but one of its core themes – the one whose relevance to the present struck me most forcefully that night in Las Vegas – is the role of abstraction. A wide range of social phenomena, Vico pointed out, focus entirely on specific concrete realities in the early days of a culture, and evolve toward abstraction over the lifespan of the culture. Law codes start out as lists of rules for specific cases, and broaden into statements of principles covering infinite variation in practice; words leave behind concrete meanings – how many people nowadays recall that the verb “understand” once meant literally “to stand under,” in the sense of upholding or supporting something? – and take on ever more nuanced meanings; religion begins in the shattering impact of the numinous on individual lives, and diffuses into elegant theological notions disconnected from the realities of human experience.

Sure, such things happen, but so does the opposite, particularly perhaps in language. Take thing. Sure, “piece of solid matter” is a rather abstract meaning, but it’s a lot more concrete than the previous meanings of “affair” and “subject of discussion”.
January First-of-May says

October 28, 2018 at 7:41 pm

The varieties are kind of covered in that you can kind of write (Mandarin or) Cantonese or Hakka or whatever in CJK ideographs (as Unicode calls them), the same way you can kind of write Japanese or Korean or Vietnamese in them.
Granted, when read, it’s going to result in some darn stilted Vietnamese, Korean, Japanese, Hakka, or Cantonese. (And probably even Mandarin, for that matter.) But it’s possible, and people have actually done that (in all of those cases, IIRC).

As for the homophones, they’re indeed more of a Classical Chinese problem than a modern Chinese problem – most of them are really obscure characters, and/or really obscure meanings of characters that usually mean something else.
(I’m actually more confused about how the French manage to deal with their homophones.)
Trond Engen says

October 28, 2018 at 8:03 pm

David Eddyshaw: I’m interested in #8: “Functional Material Doesn’t Incorporate [Higher functional structure such as determiners and complementizers doesn’t incorporate into superordinate lexical heads]”, because in Oti-Volta languages demonstrative determiners regularly compound with their lexical heads. I suspect that the (inaccessible) thesis cited as a source for this uses “incorporate” in some special technical sense which would make this quite irrelevant, but it caught my eye anyway. I strongly suspect on first principles that the author did not take much account of Oti-Volta data in framing this universal rule, however. Could be wrong …

Isn’t the development of the definite article in Scandinavian a blatant case of incorporating a determiner or complementizer into a lexical head?

Ultimately, I suppose it would mean nothing more than that Oti-Volta speakers share with Scandinavians the distinction of not speaking Human.

Oh, maybe that’s your point. I’m afraid I didn’t bother to read the list.

It could be the result of all those Viking expeditions up the Niger river.

Not the Niger. The Congo.
Brett says

October 28, 2018 at 8:15 pm

It occurs to me that the proponents of generative grammar might actually add a great deal to our understanding of linguistic structure if they were willing to look at questions as statistical in nature. The doctrinaire notion that logical structure ought to provide absolute rules is a real problem. It reminds me of Karl Popper’s difficulty in grasping that many scientific questions were not going to be tested with experiments that would give clear-cut yes-or-no answers. Popper initially rejected many important scientific theories (including quantum mechanics), because they only talked about the probabilities of outcomes, rather than giving decisive criteria for whether an event will or will not happen.

If the generative grammarians were interested in looking at real quantitative usage data, it could be very interesting. By looking at corpus data, a thesis such as: “It is relatively difficult to embed head-final projections in head-initial ones” (quoted from above) could be put to an empirical test. How “difficult” this is could be quantified by comparing how commonly different constructions appear in real language. Strict logical connections between constructions could be replaced with fuzzy, probabilistic logics. The extent to which a passive construction like, “Mary was pleased by John,” is governed by the same rules as, “John pleased Mary,” is an empirical question; and the answer ultimately ought to tell us something about the heuristic operations used by our brains to determine meaning.
David Eddyshaw says

October 28, 2018 at 8:16 pm

@Trond:

You Scandinavians have already been excluded from the family of Speakers of Language because you don’t do Island Effects properly. Incorporating determiners is just the last straw. What is with you people?

It would appear that the Niger-Congo languages as a whole are Scandinavian.

@Brett:

That is a very interesting idea. A lot of typology is like that, after all. It’s just not true that languages can’t be OVS, for example, but it still surely must mean something that so very few are.
Martin Haspelmath has written something along these lines, I think: somebody actually linked to it recently but I’ve forgotten who exactly (and I have to go to bed now in honour of GMT.)
Bathrobe says

October 28, 2018 at 8:55 pm

Norvin hasn’t turned up to answer my question about trees or even comment on questions of their ontological status. And unless I’m mistaken, he appears to be the author of an entire book about them.

It is my impression that the structuralists used to talk about structure a lot but didn’t necessarily formalise it in tree diagrams. It was the generativists who went in for trees in a big way. Trees took on a life of their own, as it were, and ever since the favourite generativist sport has been manipulating them this way and that (like genetics gone wild) or, since Chomsky realised that that wouldn’t do, pruning them this way and that (like genetics gone bonsai) to explain how language works.

I note that bulbul concurred with the notion that “English lowers affixes to verbs, while French raises verbs to affixes”. I’m still puzzled as to how such a notion can be supported by reference to anything but the manipulation of trees. I’m sceptical but open to persuasion.
minus273 says

October 28, 2018 at 10:13 pm

I’m still puzzled as to how such a notion can be supported by reference to anything but the manipulation of trees.

Probably in conjunction with the notion that there is a universal linear order of sentence constituents.
AntC says

October 28, 2018 at 10:22 pm

Norvin hasn’t turned up to answer my question about trees or even comment on questions of their ontological status.

Yes, I’ve laboured through the discussion since last night (my time), hoping to see an answer. (All fascinating, even without answers: thank you Hatters.)

In discussing artificial grammars for computer languages, we tend to use anthropomorphic terms to model the semantics (‘remember’, ‘know’, ‘hide’, ‘point to’), but there’s no suggestion the models correspond to anything inside brains — human or electronic.

I’d also like to understand the ontological status of claims like Arguments #9

Null subjects [Many languages allow pronouns to be unpronounced in certain positions under certain conditions. Where possible, these pronouns act much like overt pronouns …]

Again there’s the weasel words “under certain conditions”/”where possible”/”much like”, which makes this claim untestable.

I can understand ‘contracted’ or ‘elided’ elements. I kinda understand claims like: many languages omit the verb to be in simple present. [Test that as: if no finite verb, insert to be, to make it match the pre-determined tree.] But “unpronounced” something “act … like”? I.e. with no phonological residue whatsoever and yet mentally present? Is that the allegation?

This sounds too much like the man I met upon the stair. I wish I wish he’d act elsewhere. “act” like how?
John Cowan says

October 28, 2018 at 10:31 pm

there’s one right up top in Miner’s post itself

Yes, well, everybody knows that linguists Know Too Much to be good informants. That is, everybody concerned with facts in linguistics knows it: for the others, it seems they are themselves the best possible informant, since they are concerned solely with what’s possible, and if they can say it, it is obviously possible.

Take thing

The original meaning is ‘meeting, assembly’, which is admittedly not as concrete as man or horse or river, but far more so than any of the later meanings.

Not the Niger. The Congo.

Historical linguists routinely conflate these rivers, speaking of the Niger-Congo. And the Japanese in Victorian times not only conflated them but placed them on the continent of Asia: “Yes, I like to see a tiger / From the Congo or the Niger / And especially when lashing of his tail!”

how such a notion can be supported by reference to anything but the manipulation of trees

It is metaphorical, and the metaphor can only be grasped as such in connection with the metaphorical discussion of trees, including the idea of heavy vs. light branches. But at the more concrete level, it is just a statement about how you can’t respect in English very much the elephant, and as such is true or false accordingly.
Bathrobe says

October 28, 2018 at 11:03 pm

@ David Eddyshaw

However, I think we’re dealing with an anti-Chomskyan infiltrator tasked with undermining the Chomskyans’ intellectual credibility:

Norbert now runs that blog (see latest postings).
Bathrobe says

October 28, 2018 at 11:27 pm

It is metaphorical, and the metaphor can only be grasped as such in connection with the metaphorical discussion of trees, including the idea of heavy vs. light branches. But at the more concrete level, it is just a statement about how you can’t respect in English very much the elephant, and as such is true or false accordingly.

Everyone knows that word order is different between English and French. That is not controversial. What Norbert is claiming is that GG has discovered the secret that lies behind this word order. That is, “English lowers affixes to verbs, while French raises verbs to affixes”. That is the specific formulation that makes GG different from all the rest and entitles GG to claim this as a discovery.
tangent says

October 29, 2018 at 1:50 am

Syntacticians are a numerical minority of linguists, I’ll say boldly extrapolating from a couple of departments. Why is it that they’re 98% of the story of “linguistics, how is it nowadays”? (The other 2% is those sociolinguists who hate proper English.)

For popular discussion, my theory is that “grammar” is a concept people already have, so its incongruity with the field the syntacticians have made, that is fun or enraging or otherwise rakes in the clicks. Whereas people in general don’t know phonetics is a thing and would take much explaining what it’s for. (Actually much easier to explain than what syntax is for!)

For discussion among cognoscenti — I don’t know, what’s your theory?

(Just that P-field linguistics is doing its job and unremarkable to point at? My opinion is showing.)
SFReader says

October 29, 2018 at 2:16 am

homophones

A few years ago an English teacher in Utah was fired for writing a blog post about homophones, because his boss thought he was promoting homosexuality.
Bathrobe says

October 29, 2018 at 3:08 am

For popular discussion, my theory is that “grammar” is a concept people already have, so its incongruity with the field the syntacticians have made, that is fun or enraging or otherwise rakes in the clicks.

I don’t really think that syntacticians rake in the clicks. The general public doesn’t care about syntax, unless it concerns the ‘prescriptivism/descriptivism’ debate. And such peevers devote a disproportionate amount of attention to spelling and punctuation.

People have peeves about pronunciation, too, but most of them wouldn’t know what a phoneme was if it bit them on the nose.

I suspect that syntax is interesting to people because it impinges on issues of how we write, a strong marker of civilised behaviour. People take an interest in pronunciation for similar reasons, but because people are familiar with a wide range of accents they are tolerant of regional differences — less so class differences.
David Marjanović says

October 29, 2018 at 12:23 pm

I’m actually more confused about how do the French manage to deal with their homophones.

Many of them are singular vs. plural nouns, distinguished in writing by an -s that hardly ever surfaces in pronunciation. Here it’s easiest for me to think that spoken French has completely outsourced noun plural marking to the article, and children learn a whole new grammar when they enter school. In other words, such singular/plural pairs are homophones in written French, but simply the same word in spoken French.

Evidence for this is richly provided by this blog in French, where the -s of noun and adjective plurals is very often, and seemingly randomly, omitted, despite the scarcity of other typos and absence, as far as I can tell, of other grammatical mistakes. The author is an accomplished comparative linguist, fluent in English where I’ve never noticed him to drop -s, as well as in a bunch of scarier languages, so it’s not like he has trouble with the concept in the abstract. And yet, when he types fast while thinking in his native language, the plural marker doesn’t always make it in.

Isn’t the development of the definite article in Scandinavian a blatant case of incorporating a determiner or complementizer into a lexical head?

The claim is about a demonstrative determiner: not “the”, but “this”. I, for one, don’t know of a language that compounds nouns with demonstratives that have not clearly bleached out to articles first.

But “unpronounced” something “act … like”? I.e. with no phonological residue whatsoever and yet mentally present? Is that the allegation?

Why not? That would be an ordinary zero morpheme, just restated for syntax instead of morphology.

I can present an example other than the usual pronouns. In my dialect, and in Viennese mesolect, the m. sg. acc. article den (unless stressed for use as the demonstrative pronoun) generally surfaces as /ɪn/ or /n/ (with place assimilation to surrounding consonants). Yet, neither of these forms ever occurs directly after in (“into”). The only “form” “found” there is total absence. Given the frequency and distribution of long consonants elsewhere, you’d expect in den to surface at least as */ɪnː/, but it never does. No other article can disappear like that, not even optionally; it’s just the masculine singular accusative. The only explanation I can come up with is that my introspection is correct and speakers have the article in mind but don’t pronounce it right after in.

(OK, I could just write “m. acc.” because, like elsewhere in German, the genders aren’t distinguished in the plural. And the dat. pl., Standard den, has been replaced by the acc. pl., die.)

Syntacticians are a numerical minority of linguists, I’ll say boldly extrapolating from a couple of departments. Why is it that they’re 98% of the story of “linguistics, how is it nowadays”?

Brainstorming:

1) English grammar is mostly syntax, so that’s the most visible part of grammar in English, today’s global academic language.
2) The MIT is famous enough to be noticed by the media, and presumably has a competent press office.
3) The aim of GG is nothing less than to explain language, including but not limited to explaining why some features are (supposedly) seen in all languages and some in none. The claim “we’ve reached the point where we can do that now” is definitely newsworthy, and it’s easy enough to explain to the public the way I just did.
4) The parts that are not easy to explain to the public look rather mathematical, i.e. rigorous and sciency.
5) Self-fulfilling prophecy: department heads and funding agencies thinking “this is the exciting new direction of research, so that’s what we need to do/finance now”.
January First-of-May says

October 29, 2018 at 12:40 pm

I, for one, don’t know of a language that compounds nouns with demonstratives that have not clearly bleached out to articles first.

…how did Bulgarian articles develop again?
David Marjanović says

October 29, 2018 at 12:58 pm

Postposed demonstrative pronouns becoming articles and then fusing to the nouns? Probably that’s not even testable, because this fusion is such an ill-defined drawn-out process. That’s why I wrote “clearly”.

There doesn’t seem to be a language where “clearly” fused demonstratives retain that meaning and have not become articles.
David Eddyshaw says

October 29, 2018 at 1:56 pm

I, for one, don’t know of a language that compounds nouns with demonstratives that have not clearly bleached out to articles first.

Kusaal, and indeed all its close relatives, unequivocally do exactly that. Adjectives and demonstratives compound with a preceding noun which is reduced to a bound form. This is completely regular, obscured only by the fact that a minority of such bound forms have been remodelled segmentally (never tonally) on the basis of the singular.

biig “child”, biis “children”
nid “person”, nidib “people”

bikanga “this child”, bibamma “these children”, ninkanga “this person”, ninbamma “these people.”

The forms bi and nin cannot stand alone without a following adjective, noun or pronominal. The same forms are used as the first element of head-final compounds ninkuud “person-killer”, i.e. “murderer.”

The article is a quite different word: postposed la, which doesn’t inflect and is preceded by unbound sg/pl forms.
Trond Engen says

October 29, 2018 at 2:11 pm

The claim sounds to me like a corollary of the old observation that semantic bleaching goes before cliticization.

I think the common (but not the only) explanation for Scandinavian definite articles is that they developed from demonstratives used as relativizers. hús, hitt (svá) ek byggði > huset (som) eg bygde. I agree that it’s not the same as complementizers, but it’s not articles either.
David Marjanović says

October 29, 2018 at 3:23 pm

Adjectives, too? Amazing.

I agree that it’s not the same as complementizers, but it’s not articles either.

The PIE relative pronoun participates in the definite adjective endings of Balto-Slavic (which have undergone various meaning shifts at least on the Slavic side since then).
David Eddyshaw says

October 29, 2018 at 3:49 pm

Yup, adjectives too:

bisung “good child”, bisuma “good children”, ninsung “good person”, ninsuma “good people”

and moreover

bisungkanga “this good child”, bisumbamma “these good children”, etc etc

The sg/pl suffixes in Oti-Volta come in pairs, more or less, for count nouns, and in most Oti-Volta languages these go with grammatical genders, with different 3rd person pronouns obviously etymologically connected with the suffixes. Historically, the adjective stem was basically infixed between a noun stem and its class suffix, but this doesn’t work as a synchronic description for most of the languages (Kusaal has actually abandoned grammatical gender altogether, though there are scattered fossil forms still lying about as isolated bits of morphology.)
David Eddyshaw says

October 29, 2018 at 4:20 pm

Regular compounding of adjectives and their head nouns is found outside Oti-Volta, for example in the Gurunsi branch of Gur, (seen in Jonathan Brindle’s grammar of Chakali) and even in Supyire (part of the Senufo family, not now usually included in Gur. Robert Carlson wrote a grammar of it in the Mouton Grammar Library series.) I haven’t found any languages outside Oti-Volta that do the same thing with demonstratives, but unlike the GG people I am not in a position to declare that there aren’t any without even needing to look.
languagehat says

October 29, 2018 at 4:52 pm

I envy you your knowledge of Kusaal; what an interesting language! (And what a useful source of counterexamples to claimed universals!)
David Eddyshaw says

October 29, 2018 at 4:57 pm

I think I was misremembering when I said that somebody had linked to Martin Haspelmath’s paper on parametric versus functional explanations; anyway, here it is:

https://zenodo.org/record/1252267
David Eddyshaw says

October 29, 2018 at 5:10 pm

It didn’t actually occur to me that Kusaal evidence refutes this “universal” because I wasn’t aware it had been claimed as one; it is actually an odd feature, now I think of it, but it’s so much of a piece with how Kusaal and its relatives use nominal compounding all over the place where familiar languages have separate words* that the strangeness of having freely formed compounds meaning “this tree” or whatever never really struck me. I should read more typology.

Kusaal also refutes what has sometimes been claimed as a universal, viz that internally headed relative clauses are only found in SOV languages. But that one seems pretty wobbly anyway.

Haspelmath’s paper is about linguistic universals, by the way. I was going to go back and edit my post to say so but I didn’t want to consign it to Moderation Purgatory.

*”Word” is a slippery concept in this context. I ought to come clean and say that in the Masterwork I actually treat noun combining forms as bound words rather than word fragments, so tikanga “this tree” (so written in traditional orthography) is regarded as two “words” ti-kanga, but I don’t think it makes a fundamental difference: the point is that they’re bound. You can’t say ti for “tree” without a following noun, adjective or pronominal, only tiig.
David Eddyshaw says

October 29, 2018 at 6:18 pm

(Sorry for the logorrhoea. I’m just excited. I’m conscious that this is something of a niche interest.)
Thinking about it, in fact, the undeniable weirdness of compounds meaning things like “this tree” is a good argument in favour of my existing contention in the Great Tractate that nominal combining forms should be regarded as (grammatical) words. I shall duly incorporate it. So to speak.

It’s still a counterexample to the supposed universal. Though to help any passing GG-men wanting to explain it away, I can say that combining forms used as NP heads are notably more prone to segmental remodelling after the singular form than combining forms used as NP dependents, with an increasing trend to remodelling actually traceable in texts over the past forty years, so there is a reasonable argument that the language has repented of its lawless ways and is at least trying to conform to the Universal. Which is interesting in itself.
Etienne says

October 29, 2018 at 6:44 pm

David Eddyshaw: Err, what on earth are you apologizing for? Logorrhea, say you? Pshaw! This is F-A-S-C-I-N-A-T-I-N-G! I am certain that I speak for many Hattics when I say, when it comes to your introducing us to, and giving us examples of, Kusaal grammar: More, more, I’m still not satisfied!
(With apologies to Tom Lehrer lovers).

Here’s a chunk of a comment of mine I made here about three years ago (“How the Cherokee language has adapted”) which is quite relevant to the present thread (more relevant than where I first wrote it, in fact):

“A related story which some dwellers of the Hattery may find amusing and informative: I was chatting and drinking with a scholar of Athabaskan languages once, and after a few drinks we both found that we had a common distaste of data-free theoretical linguistics, leading said Athabaskanist to tell me that the frustrating thing about being an Athabaskanist in a Department of Linguistics dominated by theoreticians was the knowledge that data from various Athabaskan languages could easily knock down ALL the competing theoretical schools within the Department…indeed, after yet a few more drinks, we came up with a great idea for a monograph in cultural anthropology: a comparative study of Shamans in traditional Athabaskan-speaking communities versus theoretical linguists: which of the two groups is the more impermeable to empirical evidence? I think you may guess what we thought the conclusion of such a study would be…”
AntC says

October 29, 2018 at 7:02 pm

What Etienne said.
TR says

October 29, 2018 at 7:23 pm

Thirded.
languagehat says

October 29, 2018 at 7:25 pm

I’m conscious that this is something of a niche interest.

Not around here it’s not!
David Eddyshaw says

October 29, 2018 at 7:46 pm

Thanks, all. I feel pretty comprehensively validated. You will only have yourselves to blame for what follows …
There is a reason for the old Akkadian proverb: “Never validate an Eddyshaw.” (ə de vivre will be able to provide the original; I paraphrase.)
Bathrobe says

October 29, 2018 at 7:56 pm

we came up with a great idea for a monograph in cultural anthropology: a comparative study of Shamans in traditional Athabaskan-speaking communities versus theoretical linguists

A paper setting out data from various Athabaskan languages that knocked down ALL the competing theoretical schools would have been more apposite. So where is the paper?
January First-of-May says

October 29, 2018 at 7:58 pm

I’m conscious that this is something of a niche interest.

To be fair, so is Language Hat itself, and when filtered through “people likely to show up at Language Hat” your niche becomes proportionally a lot larger.

(I try to recommend Language Hat to everyone I meet online who seems to be interested in linguistics, but this seems to almost never work.)

A paper setting out data from various Athabaskan languages that knocked down ALL the competing theoretical schools would have been more apposite. So where is the paper?

Well, to be fair, who do you think is going to pay for its publication? Nikolai Marr?
David Marjanović says

October 29, 2018 at 8:08 pm

“Never validate an Eddyshaw.”

Nobody will confuse you with Ea-Nasir. Please carry on. 🙂
David Eddyshaw says

October 29, 2018 at 8:12 pm

Sanskrit has

https://en.wikipedia.org/wiki/Tatpurusha

now I think of it; but then Sanskrit (properly so called) is a sort of exuberant conlang anyway, so I’m not sure how relevant that is. It doesn’t seem to be a productive formation with pronouns anyway. I don’t know nearly enough about Sanskrit to do more than throw the matter open for consideration …
Trond Engen says

October 29, 2018 at 8:27 pm

David E.: biig “child”, biis “children”
nid “person”, nidib “people”

bikanga “this child”, bibamma “these children”, ninkanga “this person”, ninbamma “these people.”

And later: bisung “good child”, bisuma “good children”, ninsung “good person”, ninsuma “good people”

and moreover

bisungkanga “this good child”, bisumbamma “these good children”, etc etc

In the language of Ofrãse, spoken a bit further north, the demonstrative is bound to the head noun in a way that’s parallel to the article. Numbers and adjectives are also compounded.

ãfã “child”, dezãfã “children”, døzãfã “two children”
lãfã “the child”, lezãfã “the children”, ledøzãfã “both children”
sãfã “this child”, sezãfã “these children”, sedøzãfã “these two children”

bõnãfã “good child”, debõzãfã “good children”, døbõzãfã “two good children”
ləbõnãfã “the good child”, lebõzãfã “the good children”, ledøbõzãfã “both good children”
səbõnãfã “this good child”, sebõzãfã “these good children”, sedøbõzãfã “these two good children”.

Unlike Kusaal, the Ofrãse bound forms are marked by an initial consonant.
David Eddyshaw says

October 29, 2018 at 8:35 pm

Ofrãse is evidently, like Niger-Congo, part of Macro-Scandinavian.
David Marjanović says

October 29, 2018 at 8:43 pm

sãfã “this child”

Alas, cet enfant ~ sɛtãfã.

Numbers and adjectives are also compounded.

How can you tell they’re compounded?
Trond Engen says

October 29, 2018 at 8:57 pm

David M.: Alas, cet enfant ~ sɛtãfã

Yes, stupid error. I don’t think it matters, though. It’s still a demonstrative.

How can you tell they’re compounded?

How can David E. tell for Kusaal? Prosodic units and separate bound forms of the head word?

David E.:

tat-puruṣa = “that-man” in the sense of “that person’s man”. (genitive)

Compound with demonstrative: Norw. arch. hinmannen “(lit.) the-other-man; (euphemistic) the devil”.

(I googled to check if it’s used in Danish. Seemed so, with a few dozen hits, but all of them are quite funny misspellings for ‘hinanden’ “each other”. (“My husband and I have known the devil for 8 years.”, “We were seven people walking El Camino together, helping the devil across rivers and streams.”))

(I still don’t get what the compounds with demonstratives thing is about, though, so I’m just making noise.)
John Cowan says

October 29, 2018 at 9:01 pm

He can tell they are compounded because that is what Theory predicts, of course.
Bathrobe says

October 29, 2018 at 9:23 pm

@ Trond Engen

And you missed sɛtãfãsi and sɛtãfãla.
languagehat says

October 29, 2018 at 9:29 pm

I think this thread will benefit from the wisdom of Kozma Prutkov:

Рассуждай токмо о том, о чем понятия твои тебе сие дозволяют. Так: не зная законов языка ирокезского, можешь ли ты делать такое суждение по сему предмету, которое не было бы неосновательно и глупо?

Reason only about that about which your understanding permits you to reason. Thus, not knowing the laws of the Iroquois language, can you form an opinion concerning that subject which would not be unfounded and foolish?
January First-of-May says

October 29, 2018 at 9:42 pm

This is so incredibly fitting that I actually had to google whether it is actually from Kozma Prutkov.

(It is.)
languagehat says

October 29, 2018 at 9:47 pm

Never doubt the wisdom of Kozma Prutkov.
dainichi says

October 29, 2018 at 9:53 pm

In Standard Japanese, あの人 /anohito/ (“that person”) is usually pronounced with a pitch drop after あの /ano/ (“that”), whereas regular pronunciation rules would predict no pitch drop. I think that constitutes some sort of integration of a demonstrative.
David Eddyshaw says

October 29, 2018 at 10:02 pm

How can David E. tell for Kusaal?

First of all, I have to say that I can’t come up with a way of demarcating compounds by stress in Kusaal. That may well be largely because I don’t have an adequate theory of Kusaal stress; I’ve actually recently changed my presentation in the grammar to make this more explicit (partly because stress is mostly significant in practice in that it affects the realisation of tonemes, and I realised I could still account for the tonal phenomena while factoring out most of the detail of stress, including, to my joy, all the stuff that I wasn’t really clear about.) Stress (probably) doesn’t systematically differ in its allocation between syntactically bound and unbound words; among other things, this has led me to see a bit late in the day that I was using the term “clitic” in a somewhat question-begging way (and I’ve mended my ways.)

However, the first elements of nominal compounds show distinctive tonal behaviour, and morphologically although there has been a fair bit of remodelling, it remains transparent that they are basically bare stems, without the sg/pl “noun class suffixes” which are needed to make a nominal word which is capable of standing before pause. Significantly, the same combining form appears (in general) whether the form is a head or a dependent: so from biig “child”

bikanga “this child”, bipielig “white child”

but also

bikuud “killer of children”, bifuug “a children’s shirt” (i.e suitable for children, as opposed to biig fuug “a shirt belonging to a child”)

The matter is slightly complicated by the fact that combining forms used as heads are more prone to segmental remodelling, but that is a relatively marginal thing, confined to a few words which have very short potentially ambiguous combining forms (not that infrequent in general, because combining forms not only lack flexions but are, on top of that, subject to the pervasive Kusaal deletion of word-final underlying vowel morae in most but not all contexts which does so much to make Kusaal morphophonemics more complicated.)

(Just to make life more complicated still, Kusaal perfective finite verb forms behave in a lot of ways from a phonological standpoint both segmentally and tonally like words bound to the right. But they aren’t … )

I do (as I said above) regard combining forms as “words” rather than word fragments (which is how the traditional orthography writes them), but in my view it really makes no great difference what term you use: the distributional facts are the same. I can (and have) come up with several different “theories” for this and other facts of Kusaal grammar which are totally equivalent in terms of what they predict; I can see no earthly reason to prefer one over another other than descriptive clarity, and if I’m feeling pretentious that day, elegance. I’m a nominalist.

Kusaal compounds can actually contain components which are unbound. All suggestions gratefully received …
David Eddyshaw says

October 29, 2018 at 10:33 pm

I can see no earthly reason to prefer one over another other than descriptive clarity

Although I should also say that “descriptive clarity” entails not making your presentation gratuitously idiosyncratic; if you want to communicate your insights (if any) it’s helpful to couch them in terms of theories likely to be already familiar to your punters. Good theories conduce to communication. (I am preaching to myself at this point, having offended mightily in the past.)
bulbul says

October 30, 2018 at 3:50 am

Etienne,

And inasmuch as it was being trumpeted by the NEW YORK TIMES it is “mainstream linguistics”, in visibility at any rate.
The NYT does not get to determine what mainstream linguistics is, insofar as such a thing exists. Your statement concerning the theory-driven nature of modern mainstream linguistics therefore is and remains wrong.
January First-of-May says

October 30, 2018 at 7:27 am

Never doubt the wisdom of Kozma Prutkov.

Well, duh, but I didn’t expect him to have anything on linguistics.
David Eddyshaw says

October 30, 2018 at 8:16 am

Thinking about it a bit more, while I wish to retain my membership card for the “Universals? What Universals?” Club, I think this particular one (no incorporation of demonstratives with lexical heads) can be salvaged in pretty good faith at least as far as Western Oti-Volta is concerned by turning it round: if you take the Universal as a sort of axiom, then you can use it to tell you something about what compounding actually is in these languages, as a piece of evidence that it’s rather out of the ordinary. And actually this is certainly true enough; quite apart from the compounding of adjectives and dependent pronominals (it’s not just demonstratives) with heads, you can even have compounds with actual phrases as elements, like

[anzurifa ne salima la’]maan “[silver and gold item]-maker”

Of course you can do this in English too: “Silver- and goldsmith.” So you need to get into a whole typology of compounding, and how it works differently in different languages (and isn’t just one single undifferentiated process even within a single language.) I think a lot of work has been done on this; it overlaps with the whole polysynthesis thing, and the question of when and whether subunits of “words” can be accessible to the syntax.

So even though this universal isn’t universal, thinking about it is illuminating and helpful. It comes down to Brett’s point; these “universals” actually are interesting and significant in themselves; the problem comes with interpreting them as absolute laws instead of “attractors”, and above all when you take the further step of imagining that they are absolute laws which can be leveraged into giving you privileged insights into how the mind must work.
languagehat says

October 30, 2018 at 8:37 am

Well, duh, but I didn’t expect him to have anything on linguistics.

Никто не обнимет необъятного!
juha says

October 30, 2018 at 9:11 am

the question of when and whether subunits of “words” can be accessible to the syntax.

I’m reminded of 極めて美人 (kiwamete bijin, k.= extremely, exceedingly, b.=beautiful woman, made up of “beauty + person”), where kiwamete modifies bi-:

彼女が極めて美人だということは間違いない。
Kanojo ga kiwamete bijin da to iu koto wa machigai nai.
It’s beyond doubt that she is an extremely beautifulwoman.
David Marjanović says

October 30, 2018 at 9:17 am

How can David E. tell for Kusaal? Prosodic units and separate bound forms of the head word?

Prosody will get you nowhere in French, which doesn’t seem to have prosodic units between the syllable and the whole utterance. Stress for instance is simply prepausal.
David Marjanović says

October 30, 2018 at 9:19 am

In the original Chinese, 美 is an adjective.
juha says

October 30, 2018 at 10:00 am

In the original Chinese, 美 is an adjective.

As in “beautiful country.”
dainichi says

October 30, 2018 at 2:18 pm

> kiwamete modifies bi-

美人な can be an adjective in Japanese, and in the example, 極めて can be analyzed as modifying the whole of it, so I’m not convinced. “It’s beyond doubt that she’s extremely beautiful”.

I don’t think it works when 美人 is indisputably a noun, as in
*極めて美人を見かけた
“I saw an extremely beautifulwoman”
bulbul says

October 31, 2018 at 6:26 am

DavidM,

Prosody will get you nowhere in French, which doesn’t seem to have prosodic units between the syllable and the whole utterance.
Some people would disagree.

DavidE (damn, so many Davids, not a single Goliath),
Logorrhoea? Niche interest? That’s what we DO here. Welcome home, son.

As for universals, well, they tend to come in two types: first, there’s the hardcore Generativist UG Principles and Parameters type, which, you know, fuck that shit. Then there are Greenbergian universals which tend to be more distributional and statistical, hence their second designation as statistical universals. Their existence is pretty much accepted by everyone, even generativists (see e.g. Martin Haspelmath’s comment on “Greenberg’s empirical universalist programme”). Whether they should be called universals rather than something else is a different problem…
bulbul says

October 31, 2018 at 7:04 am

Etienne,

as an Arabic scholar who is a native speaker of a Slavic language your bias, when it comes to evaluating the complexity of Arabic and Russian, is pretty much a mirror image of his
I object to that characterization, not least because unlike McWhorter, I have not made any claims about the horrific complexity or simplicity of any language (unless pointing out someone’s fullofshitness counts). This is mainly because, and this ties to your next question, when it comes to language, complexity is a bullshit concept. People usually tend to focus on morphology or phonology, because those are the most obvious examples, but those are also things that matter the least. To give you an example, Finnish and Hungarian may have more cases than Russian or Slovak, but their formation and their use is also much more regular.

how difficult is Russian from the vantage point of a Slovak L1 speaker? The stress system must be a major hurdle: are there others?
In general, I’ve found Slavic languages a little more… difficult to acquire than those unrelated to my L1(s). The major reason for this is the similarity which makes understanding, say, Russian or Polish or Serbian much easier, but when trying to acquire active competence, all the similarities make the job harder.
But, as I keep saying when asked, it usually does not matter whether a language is similar to another one you already know, the most important things – the words, the way they combine into phrases and the way people say things – you always need to learn that from scratch.
SFReader says

October 31, 2018 at 7:36 am

I’ve seen native Nama speakers who managed to become fluent in Russian.

That’s about the furthest it gets from Russian in terms of linguistic proximity.

I don’t know why English speakers should find it impossibly difficult – it’s not.
David Marjanović says

October 31, 2018 at 8:09 am

Some people would disagree.

Thanks, I’ll read that ASAP.
Etienne says

October 31, 2018 at 10:07 am

Bulbul: On the matter of linguistic complexity, when it comes to your claim that

“Finnish and Hungarian may have more cases than Russian or Slovak, but their formation and their use is also much more regular.”

-Am I to assume, then, that you would accept that a language with a large number of cases with elaborate rules of usage and a high degree of allomorphy with regards to the case markers themselves is indeed more complex than a language lacking case marking altogether?

And to anticipate an objection: Yes, there indeed is more to a language than nominal inflection, but if I may echo something McWhorter wrote once, there is no logical reason to assume that all languages are equally complex and that complexity in one subsystem will be somehow “balanced out” by simplicity in another.

On Slavic languages being harder to learn because of their similarity to Slovak: I’m reminded of the repeated claim I have heard that it is rare indeed for Portuguese speakers ever to thoroughly master Spanish, or vice-versa, apparently for the same reason. Interestingly enough, from the vantage point of linguistic complexity, it seems to be unanimously accepted that Portuguese is significantly harder than Spanish (I’ve seen this claim repeatedly, and have never seen anyone claim the reverse): Portuguese has substantially more phonemes than Spanish (voiced fricatives, mid-low vowels, nasal vowels…), and the two languages, phonology aside, are so similar that there don’t seem to exist any linguistic subsystems which are, or indeed even could be, substantially simpler in Portuguese than in Spanish to such a degree that their presence might thus “balance out” the undeniably greater complexity of the Portuguese phoneme inventory.

SFReader: Okay, Nama speakers who became fluent in Russian. Why do I sense there is a very interesting background story to be told here?
John Cowan says

October 31, 2018 at 10:11 am

such as that the Old English w-letter was called wen

I suspect an avatar of the Idiot Copy Editor God looked in his dictionary, couldn’t find wynn, and substituted wen ‘sebaceous cyst’, which however unusual was at least in the dictionary.

[anzurifa ne salima la’]maan “[silver and gold item]-maker”

On Language Log, Geoff Pullum debated with himself on whether phrases like surface and submarine warfare, which appear to coordinate a word with a prefix (since the warfare in question was naval in either case), were actually grammatical. He starts out with an unequivocal no, but eventually admits that these sound better the more examples you see.
languagehat says

October 31, 2018 at 10:17 am

phrases like surface and submarine warfare, which appear to coordinate a word with a prefix

So the claim is that it’s short for “surface marine and submarine warfare”? That’s absurd on the face of it, and thus the debate is silly.
David Marjanović says

October 31, 2018 at 10:22 am

French prosody: unsurprisingly, my use of “utterance” was a foggy conflagration of several distinguishable things, but none of them quite seems to be a “word”. In Table 4 of the paper, the “rhythmic groups” come close, but to regard que vous soyez devenue as a single word requires some serious polysynthesis with incorporation.

(I’m also surprised by the absence of a level that groups the example into two, uh, groups, with the boundary between vedette and vous.)

SFReader: Okay, Nama speakers who became fluent in Russian. Why do I sense there is a very interesting background story to be told here?

Namibia casting a hopeful eye toward Socialism® and sending a few students to Moscow. See also: people from Mozambique and Mongolia becoming fluent in German because they studied in East Berlin right around the same time (1960s/70s).

On Language Log, Geoff Pullum debated with himself on whether phrases like surface and submarine warfare, which appear to coordinate a word with a prefix (since the warfare in question was naval in either case), were actually grammatical.

I hesitate to argue with either version of the author of the Cambridge Grammar of the English Language… but… from my German background at least, I can’t see how this example is a “phrase” and not a branched compound noun just like silver- and goldsmith. What differences are there other than the spelling?

Part of the issue may be that there doesn’t seem to be a way to tell if submarine is a noun-turned-prefix or an adjective here. But even with an adjective in the second branch this kind of thing seems to be grammatical in German.
David Marjanović says

October 31, 2018 at 10:28 am

So the claim is that it’s short for “surface marine and submarine warfare”?

Oh, I thought the claim is that it’s short for “surface warfare and submarine warfare”.

But then, branched compound adjectives are common in German, and I’d say I’ve seen them in English as well: how about “super- or substratal”?
languagehat says

October 31, 2018 at 10:48 am

Oh, I thought the claim is that it’s short for “surface warfare and submarine warfare”.

It is, but that obvious reading doesn’t account for “which appear to coordinate a word with a prefix.”
SFReader says

October 31, 2018 at 10:59 am

Okay, Nama speakers who became fluent in Russian. Why do I sense there is a very interesting background story to be told here?

Just a Namibian student of Patrice Lumumba University in Moscow I’ve met. He wasn’t as black as I thought the Africans were (yes, I know it was an awfully tactless thing to ask, but I was pretty young and drunk at the time) and said that he belonged to the Nama people.
Y says

October 31, 2018 at 11:50 am

Surface-to-air and air-to-air missiles.
Ear, nose, and throat doctor.
Aidan Kehoe says

October 31, 2018 at 12:43 pm

“But, as I keep saying when asked, it usually does not matter whether a language is similar to another one you already know, the most important things – the words, the way they combine into phrases and the way people say things – you always need to learn that from scratch.”

What? Going from English to French and learning that the word for ‘hierarchy’ is « hiérarchie » is in no way starting from scratch. And a random German word as a counterexample; »die Wonne« would be « la joie, le plaisir, le délice, la volupté » in French; I can link all of the French words to an English word with a related meaning, which is a massive help as a mnemonic, I can’t link the German word to any English word (the OED tells me the related English word is obsolete, with the most recent citation from a dialect dictionary published 1700). So learning the German word is intrinsically more work than learning any of the French ones.
David Eddyshaw says

October 31, 2018 at 4:13 pm

Ear, nose, and throat doctor.
vs
anzurifa ne salima la’maan “silver and gold item-maker”

The difference here is that “throat doctor” etc in English is not necessarily a “compound” (in fact most English speakers would deny that it is; German speakers, mutatis mutandis, would probably assert that it obviously is.) This is the sort of thing I had in mind when talking about a “typology” of compounding, and how even within a single language compounding is not some all-or-nothing phenomenon but variable in tightness of syntactic binding, phonological binding and other respects.

The Kusaal example illustrates this, in fact: the la’- component of la’maan is unequivocally bound (cf sg lauk “piece of goods”, pl la’ad) and this is absolutely regular for any noun used as a generic argument to a deverbal noun (here maan “maker.”) However, with nouns used as generic premodifiers there is a difference between count and mass nouns. With count nouns you have e.g. widzuur “a horsetail”, contrasting with wief zuur “a horse’s tail”, but with mass nouns expressing the material from which something is made you use an unbound sg/pl form as a premodifier: salima wief “a golden horse”, from salima (noun) “gold.” And in fact you can use even a whole noun phrase like anzurifa ne salima “silver and gold” that way. The peculiar thing is that this binds tighter than the relationship between the two components of la’maan, which it would seem perverse not to call a compound …

And it’s not that mass nouns expressing substances just don’t have combining forms: they do, not only (as always) as heads (salimkanga “this gold”) but as generic arguments to deverbal nouns: salimkuos “gold merchant” (vs *salima kuos “a merchant [made] of gold.”)

[Incidentally, it’s not possible to account for anzurifa ne salima la’maan by supposing it is an ellipted form of something like anzurifa la’maan ne salima la’maan “a maker of silver goods and a maker of gold goods” because ne cannot join two NPs with the same referent in apposition: anzurifa la’maan ne salima la’maan would have to be two separate craftsmen.]
David Eddyshaw says

October 31, 2018 at 4:25 pm

The go-to example for syntax getting at word fragments is the Eskimo languages (as I expect everybody knows without my telling them); Jerrold Sadock has written a lot about this, and (fine man) made a lot of it publicly available:

https://chicago.academia.edu/JerrySadock

(see “Noun Incorporation in Greenlandic”, for example)
bulbul says

October 31, 2018 at 4:39 pm

Nama speakers who became fluent in Russian. Why do I sense there is a very interesting background story to be told here?
There are also a lot of students from Subsaharan Africa in Slovakia and the Czech Republic, largely as a result of agreements made before 1989. I had a roommate first year in college who was a native speaker of Yoruba and spoke great Slovak, and a native speaker of Fulfulde from Mali who came here in the late 1980s and learned Slovak is even a minor celebrity in our parts.
ktschwarz says

October 31, 2018 at 4:52 pm

surface and submarine warfare: I couldn’t find that phrase in Language Log, or with “Pullum” anywhere else on the web. Do you have a link or more information?

“surface and submariners” is tossed out in The Great Eskimo Vocabulary Hoax, as a comparison to a book title Nuclear and Radiochemistry, but I don’t think that qualifies as debating with himself.
David Eddyshaw says

October 31, 2018 at 4:59 pm

I worked with a Russian-speaking Ghanaian doctor at one stage (he’d studied in Moscow, under fraternal arrangements going back to Nkrumah’s day.) His Russian wife, also a doctor, had come with him to Ghana in search of a better life.

There are quite a few Ghanaians who are alumni of German-speaking institutions, too.
David Marjanović says

October 31, 2018 at 5:13 pm

wief zuur “a horse’s tail” […] salima wief “a golden horse”

“A horse of gold”…?

And why do you say that binds tighter than the obvious-looking compounding?
David Eddyshaw says

October 31, 2018 at 5:30 pm

“A horse of gold” indeed; and in fact “gold” can be picked up by a pronoun (unlike a dependent combining form), so it’s generic, but it’s not non-referential:

salima la’ad ne o butiis “gold items and [gold] cups”

where o is a 3sg personal pronoun (of the “wrong” gender, but that’s another story …) My main informant spontaneously produced this as a correction for my salima la’ad ne butiis, which has to mean “[gold items] and cups”, not “gold [items and cups].”

The (syntactically, not phonologically) tighter binding is evident in e.g. salima la’maan, which is “a maker of golden stuff” not “a golden maker of stuff”: [salima la’]maan.

This isn’t true of cases like wief zuur “a horse’s tail”, so it’s not possible just to treat expressions like salima wief simply as syntactically parallel to the “possessive” type; you could, I guess, just specify that difference directly, though.
David Eddyshaw says

October 31, 2018 at 6:07 pm

Having said that (ran out of time to edit my previous comment) I’m not sure that I can demonstrate that there really is a difference; semantic issues are going to make it awkward to try to say something like “a braider of [a specific but indefinite] horse’s tail.” A “braider of horsetails” would be a three-component compound beginning with the combining form of “horsetail”, widzu-. (Don’t know the word for “braid”, sorry!)

I’ve never (so far) been able to come up with a neat theoretical way of tying this all up, and in the Tractatus Cusalicus I basically just list the construction types. It’s probably worth my thinking a bit more deeply about the fact that “material” premodifiers are referential; perhaps one could even claim that they are not actually generic, in the sense that if you’ve got a gold cup there, you also have some actual gold in front of you; whereas a gold merchant (salimkuos) would still be a gold merchant even if he didn’t actually have any about his person.

At this point in my meditations I usually get a migraine and have to lie down a bit, though.
David Marjanović says

October 31, 2018 at 7:12 pm

Fascinating. Sleep well!
J Pystynen says

October 31, 2018 at 9:22 pm

to regard que vous soyez devenue as a single word requires some serious polysynthesis with incorporation

I do fondly regard the ha-ha-only-serious theory, advanced a few times on the internets, that modern colloquial French is in fact polysynthetic. (Must be due to the Cree substrate in Quebec, of course.)
John Cowan says

October 31, 2018 at 11:16 pm

“surface and submariners” is tossed out in The Great Eskimo Vocabulary Hoax, as a comparison to a book title Nuclear and Radiochemistry, but I don’t think that qualifies as debating with himself

That was it, yes. The other cases he mentions and asterisks are Euclidean and hyperspace, descriptive and psycholinguistics, but then he says “Hey! These are starting to sound not quite so bad.” This is in a list of non-constituent and ungrammatical constituent titles: obviously it is not the first, so he must start out thinking it’s the second, but by the end he is beginning to change his mind.

As for me, I didn’t need to change my mind in the first place: these constructions sound fine to me.
ktschwarz says

November 1, 2018 at 2:01 am

I think it’s only fair to Pullum to note that he didn’t bring “warfare” into it, and that in context it was intended to be a bit silly. In fact, though he couldn’t have known in the 1980’s, all three of those examples are now out there on the web.

Also, Wiktionary is hilarious on submariner:

This word is generally pronounced like sub- + mariner (for example, in the U.K. Royal Navy and the U.S. Navy); however, since the prefix sub- was apparently deemed to imply inferiority (as in subpar or subhuman) rather than the actual meaning of “under,” this pronunciation may be considered offensive by non-submariners. The pronunciation submarine + -er, but with stress on third syllable, is preferred by Naval Brass. As evidence of submariners’ collective lack of concern for the opinion of non-submariners on any matter, many submariners refer to themselves by the much more negative terms of “sewer-pipe” sailor, or “bubble-head.” Submariners often refer to sailors that work on the surface of the ocean as “skimmers” or “targets.”
Brett says

November 1, 2018 at 2:55 am

The title Nuclear and Radiochemistry sounds especially natural to me. I think this is, in part at least, because it embodies a particular kind of hendiadys that comes up sometimes in the sciences. In a modern context, “nuclear chemistry” and “radiochemistry” mean essentially the same thing, and the terms are about equally common. (The Google Ngrams viewer shows “radiochemistry” taking off a little earlier, which makes sense, since radioactivity was discovered by Becquerel in 1896, while the nucleus was not discovered by Rutherford, Geiger, and Marsden until 1909; and the connection between the two was not understood until later yet. There is a brief tripling of the rate of appearance of “radiochemistry” around 1960, which does not have so obvious an explanation.)

So a title like Nuclear and Radiochemistry might seem redundant, but it seems like a very natural title, even so. One of the most important journals in astronomy is called Astronomy and Astrophysics, even though by the time the journal was consolidated out of a number of predecessor publications in 1969, astronomy was not seen as a discipline separate from experimental astrophysics. (Hubble, the world’s preeminent astronomer during his lifetime, was a prominent voice in favor of considering astronomy as a subfield of physics.)
David Eddyshaw says

November 1, 2018 at 6:36 am

Just had some further thoughts about salima butiis “gold cups” etc.

That source of all wisdom the Cambridge Grammar of the English Language uses “generic” in such a way that “referential generic” is pretty much a contradiction in terms. Previously, I thought that this couldn’t be quite right in Kusaal, not only because of cases like salima la’ad ne o butiis “items of gold and cups of it” but also because nouns used in generic statements like proverbs can be picked up by pronouns:

Bung ya’ bood ye o lubuf fu pu nyet o tubaa. “When a donkey wants to throw you off you can’t see his ears.”

Tumtum pu gat o zugdaana. “A servant does not surpass his master.”

However, CGEL’s diagnostic is more subtle than I thought (p400); it says that “the difference can be brought out by testing whether a coreferential personal pronoun can be added in a separate main clause”; I’d overlooked what is clearly a major point. The salima la’ad ne o butiis construction similarly wouldn’t prove that salima “gold” is referential (always assuming that the English categories can be legitimately applied to Kusaal.)

CGEL does briefly mention anaphora in cases where the antecedent is not referential (p1458) and says (somewhat unhelpfully IMHO) “the anaphoric relation here can best be described in terms of a bound variable.”

Be that as it may, I should probably concoct a snappy term for “not-referential-but-nevertheless-capable-of-being-the-antecedent-of-anaphora-within-the-same-clause.” However, I’m not sure how much mileage there is in assimilating mass nouns used to describe the material of which something is made to nouns used in generic statements; there doesn’t seem to be a great deal of natural semantic resemblance between the cases.
David Eddyshaw says

November 1, 2018 at 7:27 am

Correcting my error will have benefits for the description, though (apart from making it, like, less erroneous.) I should be able to describe NP dependents just using referentiality as a criterion instead of inconsistently using referentiality and genericity. It should make things simpler, and hopefully clearer.
David Marjanović says

November 1, 2018 at 7:28 am

I do fondly regard the ha-ha-only-serious theory, advanced a few times on the internets, that modern colloquial French is in fact polysynthetic.

I’m convinced of it, and I think it’s been brought up in primary literature, too; not just for French, but also for Catalan. The inability of most “personal pronouns”, or even such strings (whose “word” order is fixed like the slots in a polysynthesis template!) as je te ne le que, to occur alone or anywhere except in front of a verb, certainly argues that they’re prefixes. So, I have no problem with que vous soyez as a single morphological word.

(This may even be a more or less inevitable outcome of having obligatory personal pronouns and a largely fixed word order. My German dialect does similar things on the other end of the verb if the verb happens to precede unstressed personal pronouns – but change the word order, and suddenly you find that they still count as words for word-order purposes, e.g. cannot follow the verb in a verb-final clause, and are therefore just clitics.)

But to add the participle devenue as part of the same word does stretch things. First, even Basque doesn’t do that; it routinely combines polysynthetic auxiliary verbs with uninflected content words. Second, you can put adverbs and whole clauses between soyez and devenue.
Etienne says

November 1, 2018 at 9:36 am

David M.: “le”, “la” and “les” are actually possible post-verbally, after an imperative form (“Dis-le!”), whereas the other clitic pronouns are replaced by their strong, i.e. stressed, counterparts (Dis-moi, Dis-toi…), even in clitic combinations (“Il me le dit” versus “Dis-le-moi!”). As can be seen, the post-verbal string of pronouns does not share the same order as the pre-verbal string, so that one could claim that the post-verbal pronouns “le, la, les” are members of a distinct set, different from the pre-verbal clitics.

A piece of evidence in support of such an analysis comes from some basilectal varieties of Quebec French, spoken inter alia in Montreal: In these varieties post-verbal “le” is often realized as /le/ instead of /l(ə)/, so that “Regardez-le” and “Regardez-les” are homophonous: /ʀ(ə)gaʀdele/. Interestingly, this realization of “le” is quite unknown when it is pre-verbal, which does suggest that we are dealing with two separate pronouns.
David Eddyshaw says

November 1, 2018 at 6:14 pm

You know (more rambling on Kusaal, but I did warn you) CGEL’s treatment of genericity and non-referentiality doesn’t stand up too well. If you count generic interpretations of NPs within the scope of expressions denoting unlimited states as non-referential (which they do, pp406-407):

Lions are ferocious beasts.

there is nothing ungrammatical about continuing the zoology lecture with a separate main clause:

Lions are ferocious beasts. They have been celebrated since Roman times for their love of hip-hop, and for the most part speak Dutch. It is unwise to speak to them in Welsh.

There’s more than one way to be “generic”; this sort plainly is, in some sense “referential”; or at least “referential” needs to be very clearly distinguished as a concept from “can be the antecedent of an anaphoric pronoun”, and not just in the way CGEL states.
There may, after all, be an analogy with mass nouns expressing materials:

Gold is a yellow metal. Though prized by the foolish and greedy, it has no intrinsic value.

Abstracts like “love” and so forth are the same (and can be construed in the same way in Kusaal NPs.)
Just to confuse matters, Kusaal doesn’t use its definite article la for “things belonging to common background knowledge” like English does, so where English says “the sun”, Kusaal has just winnig. So it’s conceivable that salima “gold” in salima butiis “gold cups” is not only referential but specific after all. Hmmm …
David Marjanović says

November 1, 2018 at 6:34 pm

As can be seen, the post-verbal string of pronouns does not share the same order as the pre-verbal string, so that one could claim that the post-verbal pronouns “le, la, les” are members of a distinct set, different from the pre-verbal clitics.

The reason I didn’t think of those is precisely that your analysis seems obvious to me. 🙂 I had no idea of le as [le], though.

and for the most part speak Dutch.

Lions of Flanders after all.

so where English says “the sun”, Kusaal has just winnig.

Could that be considered a proper name?
David Eddyshaw says

November 1, 2018 at 6:59 pm

Proper names of people are distinctive in Kusaal, as always having a preceding particle A or N, and proper names of places have characteristic syntax; I’m not sure if I can think of a diagnostic for other proper names, given the way the article is used.

Apropos of that, poking around the intertubes, I see there is an often-cited thesis by GN Carlson which analyses English “bare plurals” used generically as proper names of kinds.

I suppose names of elements are proper names, come to think of it. Uranium. Gold. Earth. Air. Fire. Water.
Trond Engen says

November 1, 2018 at 7:32 pm

I don’t know how much sense anything should be expected to make in the end. Everything grammatical ever developed through human reinterpretation of random processes and is somewhere between a neat description and an equally neat but contradictory description.

It seems to me that the Kusaal collectives are abstracts or terms of properties more than names of physical objects or materials. The compound forms are concrete, but the line may be blurry, since they are essentially the same words used in lexicalised compounds: The difference between full and compound forms was originally pragmatic but was grammaticalized by compound shortening. The full form came to be perceived as lexically abstract because it would be used contrastively to the compound form.

In English a gold (pause) smith is a smith of gold, while a goldsmith (golsmith, gowsmith, …) is a smith working with gold. A house (pause) wife is a wife living in a house or something (or a wife playing house music), while a housewife (huswif, hussif, …) is a married woman caring for home and children.
David Eddyshaw says

November 1, 2018 at 7:41 pm

@Trond:

True, of course. Still, Kusaal is a bit different because of the way that compounding is so all-pervasive and productive in the NP, so an adequate grammar of the language can’t just relegate all that stuff to the lexicon or to etymology.

With this particular issue, I’ve long had the feeling that there’s some unifying insight just fractionally beyond my grasp … if so (though it may well be an illusion), I suspect what’s stopping me from getting there is not lack of grammatical creativity but the interference of some preconception(s) that I haven’t noticed yet.
AntC says

November 2, 2018 at 6:42 am

Everything grammatical ever developed through human reinterpretation of random processes and is somewhere between a neat description and an equally neat but contradictory description.

the feeling that there’s some unifying insight just fractionally beyond my grasp …

Since languages are constantly evolving, it’s at least possible that both the neat description and the equally neat contradictory description represent equally ‘correct’/’consistent’ accounts — one of where the language has just come from, the other of where it’s nearly got to. (Imagine trying to capture English phonology slap in the middle of The Great Vowel Shift.)

But the language never arrives at a stable equilibrium: as soon as one change is established, that’s in tension with another bit — which starts to change. I’m thinking of David M’s comment on the FacultyofLanguage blog about a calque from Latin into English but not in other Germanic languages. That’s now shunted some other stuff around.

Our monkey brains seem to have tremendous capacity to follow all sorts of ad-hoc rules, and yet not cope with embedding more than three levels deep. I can see no model of FL that correlates with that. So UG has no explanatory power.
David Eddyshaw says

November 2, 2018 at 8:09 am

While it is certainly true that “all grammars leak”, it doesn’t follow that there are no regular patterns waiting to be discovered (and indeed, if the patterns weren’t in some sense real, there would be no sense in talking about “leaks.”) I have a low ontological view of the status of grammatical rules, but that doesn’t mean I think they’re just illusions or handy mnemonics. I’ve several times had the experience of realising that aspects of Kusaal grammar which initially seemed arbitrary and unconnected were in fact consequences of some common underlying principle, and moreover that this insight could lead to a considerable simplification of my description. I don’t (contrary to what I more-or-less implied above) think that the choice between competing theories which seem to account for all the data is just a matter of aesthetics; the theory with fewer epicycles is going to be nearer the “truth”, even if the “truth” will never be fully accessible to us in this sad sublunar sphere. In this respect, linguistics is no different from any other science.

What bothers me about UG and its cousins is not so much that I think its objectives are doomed a priori on philosophical grounds but that I think it makes seriously premature and undersupported claims about the significance of often, let’s face it, fairly peripheral grammatical constraints and regularities in shedding light on the fundamental basis of human cognition. It’s very much the same problem as I have with the long-range comparativists: it would be wonderful if Dravidian could be proven to be related to Indoeuropean (say), and indeed it may very well be, but the evidence is not there and pretending otherwise is basically cheating, declaring you’ve proved a result without all the tedious legwork involved in actually doing so.

Languages certainly do change over time, of course, but not from (relative) order to chaos, but from one order to another, according to principles which themselves are not arbitrary. There was surely never a time during the Great Vowel Shift when it would have been impossible to describe the vowel phonemes of English consistently.
languagehat says

November 2, 2018 at 8:27 am

I’ve several times had the experience of realising that aspects of Kusaal grammar which initially seemed arbitrary and unconnected were in fact consequences of some common underlying principle

That is both one of the most satisfying things about linguistics and one of the best pieces of evidence that linguistics is a science. And in general I would like to heartily subscribe to everything you say about rules, language change, and everything else in that comment.
David Eddyshaw says

November 2, 2018 at 10:23 am

In re the Great Vowel Shift: that’s nothing! In most branches of Oti-Volta, the original low tone is now the high tone.

Unlike the case with the Great Vowel Shift, there is of course no documentary evidence as to how this happened, but pure logic dictates that there must have been an intermediate stage where the opposition of high and low tones turned into something else (level and oblique?) and then turned back again into a typical West African terracing system with H and L but the other way round from the starting state. Whatever happened, the languages must have preserved the contrast between the “H” and “L” tonemes (however actually realised) at every stage.

One of the counterintuitive things about at least terracing tone systems (of the sort that Africa loves) for a native speaker of a non-tonal language is that tones are both diachronically and synchronically a lot more stable than vowels and consonants. (It’s how all those funky ideas about autosegmental phonology got started.)

There seems to be a general principle at work in language change over time, in fact: the underlying system is often largely preserved even when the actual expression of the system changes. It’s in this sense that Germanic is actually phonologically conservative within Indoeuropean, for example, despite/because of Grimm’s Law. A classic example in syntax is P’s insights into the verb system of Ancient Egyptian: he deduced that the opposition of the Coptic “first” and “second” tenses reflected similar oppositions made by Middle Egyptian, even though the actual grammatical machinery used to mark the oppositions had changed completely.

So there’s even a sense in which grammatical rules can be more real (or at least more durable) than the language that instantiates them.
David Eddyshaw says

November 2, 2018 at 10:40 am

P = Polotsky. (I wrote “Pokorny” and ran out of time before I could correct it properly. That’s what comes of talking about Indoeuropean.)
bulbul says

November 2, 2018 at 11:19 am

Speaking of what I now realize is my favorite thing about all the generativist theorizing, i.e. hedging, this conversation. Adger is all “maybe”, “could”, “perhaps”, “whatever”.
bulbul says

November 2, 2018 at 11:25 am

Jesus H. Christ:

I can’t really speak for Maori, but if I take the work that Daniel Harbour and I did on Kiowa, I think we got a lot of understanding of the language by adopting a certain set of hypotheses about categories and their organization (there’s V, v, Appl, Asp, Neg, Modality and Evidentiality). They are probably wrong (isn’t everything?) but they allow us an advance in our understanding, both of the phenomena, and of the theory.

So their hypothesis is wrong, well, they are not sure, but they are sure it advances their understanding. Riiiight. Or, in the words of Paul Postal:

Again, suppose you are an advocate of some popular linguistic theory and are working on an exotic NL (one not used by European settlers of the thirteen American colonies) and you uncover a neat analysis of some sentences that is unfortunately inconsistent with some principle of the linguistic theory of which you are a vocal defender. This could, unpleasantly, force you to think about which to give up: (i) the theoretical principle; (ii) the analysis; or, boldly, (iii) logic. Obviously, (i) could annoy the many, often illiberal, defenders of the theory, (ii) would waste a lot of your time, and (iii), although not to be excluded a priori, is going to raise some eyebrows even in linguistics. Happily, there are alternatives. Instead of getting rid of any of (i)–(iii), you can simply say that A only violates the letter of the principle but not its spirit.
J.W. Brewer says

November 2, 2018 at 12:02 pm

So once again I feel vindicated in hindsight by having been so distracted by the various non-academic pleasures of undergraduate life to have paid virtually no attention at all in that Syntax I class in the fall of ’85, which was largely, but not entirely, dedicated to the rudiments of some then-current version of Chomskyanism that was in many respects subsequently abandoned or modified beyond recognition by the Chomskyists themselves.

I do think that’s where I first focused on the minimal quartet: run up a big bill, run up a big hill, run a big bill up, but *run a big hill up. Which is interesting and requires some explanation, although probably not the one being offered at the time.
David Eddyshaw says

November 2, 2018 at 12:18 pm

It’s an interesting conversation. I think much of the hedging is tact-related rather than obfuscation, though.

Adger seems prepared to accept that more or less all of the actual content of traditional UG apart from exceedingly abstract things like “merge” (and maybe even that) might indeed arise from contingent processes to do with learnability and processability and general human cognitive capacities (his point that we know more about language than other aspects of human intelligence, so that it’s difficult to argue from them to language rather than vice versa, is interesting.)

Haspelmath’s main objection to this assuredly minimalist sort of UG seems to be that although that is a reasonable theoretical position, in practice devotees of UG tend to assume that they have privileged access to how languages must work on first principles, and that this makes them very prone to imposing their preconceptions on the data. The evidence is pretty clear that he’s right, but I think Adger is also right that this sort of behaviour is not a necessary logical consequence of the theoretical position he holds.

So far as there is hedging going on, I think it’s in the vagueness of the connexion between all the principles-and-parameters and government-and-binding stuff which the initiates do seem usually to assume to be transcendentally true, and the minimalist abstract core, which is difficult to object to (which may be the point of it.)

I notice that Adger makes several appeals to learnability; as I understand it there have been some pretty convincing debunkings of most poverty-of-stimulus arguments, but I’m far from well up on that. And I suppose he may mean more “processability.”

Whether Adger’s comments about his Kiowa work are execration-worthy depends on the degree to which he would in fact be prepared to discard hypotheses which are contradicted by the data (you’ve got to start with hypotheses, and it’s reasonable to start with a framework which is familiar both to you and to others.) I must admit that the formulation “advance in our understanding, both of the phenomena, and of the theory” does not sound like a commitment to abandon a theory if it proves defective; however, if he really does mean by “the theory” no more than the essentially content-free minimalist core, he could effectively mean that. One would have to read the work in question to see. (Kiowa is interesting.)
David Eddyshaw says

November 2, 2018 at 12:48 pm

I believe NC himself prefers to call the latest iteration of his thing a Program, rather than a Theory, presumably in a nod to Karl Popper’s characterisation of Darwinism as a “metaphysical research program”, to sidestep the fact that it isn’t falsifiable in the relevant Popperian sense. Popper was famously displeased by the spin given to this by creationists, and at great pains to say that he by no means intended to imply that Darwinism was unscientific.

Anyhow, one can see the attraction of the rebranding.
AJP Crown says

November 2, 2018 at 3:46 pm

A house (pause) wife is someone who married a house.
David Marjanović says

November 2, 2018 at 8:03 pm

It’s very much the same problem as I have with the long-range comparativists: it would be wonderful if Dravidian could be proven to be related to Indoeuropean (say), and indeed it may very well be, but the evidence is not there and pretending otherwise is basically cheating, declaring you’ve proved a result without all the tedious legwork involved in actually doing so.

…Which long-range comparativists, given the enormous differences between them? Ruhlen has done just the very first step of the legwork, others have gone much farther and run into different problems. And why do you suddenly present proof as binary right after stating that “linguistics is no different from any other science”?

So their hypothesis is wrong, well, they are not sure, but they are sure it advances their understanding. Riiiight.

All that “probably wrong” stuff strikes me just as the usual scientific self-deprecation – wrong in the way that both the general theory of relativity and quantum physics have to be wrong somehow. The context seems to fit that: right after what you quote, Adger goes on with

So we used our analysis of Kiowa clause structure to argue against particular hypotheses in generative grammar (that there is roll-up movement), and for others (that the relationship between morphological complexity and syntactic height is looser than most current theory would predict).

Sounds like science to me, even if I have no idea what “roll-up movement” might mean and can only guess at “syntactic height”.

Karl Popper’s characterisation of Darwinism as a “metaphysical research program”, to sidestep the fact that it isn’t falsifiable in the relevant Popperian sense

…What’s supposed to be metaphysical about it…?

It’s falsifiable against parsimony (“a rabbit in the Precambrian”). A lot of Popperian falsification is really nothing else than that anyway.
David Marjanović says

November 2, 2018 at 8:34 pm

Funnily enough, The Loom of Language and French polysynthesis have been discussed in the same thread here before, starting here! Much merriment follows, all the way to ß. All glory to the Random Link.
languagehat says

November 2, 2018 at 9:00 pm

Thanks, that was a fun thread to revisit.
David Eddyshaw says

November 2, 2018 at 9:24 pm

why do you suddenly present proof as binary

For rhetorical effect, merely. I repent in dust and ashes.

…What’s supposed to be metaphysical about it…?

Ask Karl. Though I suspect that, like the great majority of philosophers, he would not have regarded the word “metaphysical” as synonymous with “nonsensical”; in this particular context, I imagine he meant it in the precisely literal sense of “meta-physics.” Or “meta-biology”, anyhow.

I must confess that my knowledge of KP is pretty much by popcultural osmosis. I think I’ve got the basics right, more or less, but know very little about the detail. My impression (for what very little it’s worth in the circumstances) is that his methodology represents an idealisation of how one might perhaps wish that science work rather than a helpful guide to how it really does work.

I did read The Logic of Scientific Discovery some decades ago. It doesn’t linger in my memory. But then I’m with Wittgenstein when it comes to pokers.
David Eddyshaw says

November 2, 2018 at 9:59 pm

why do you suddenly present proof as binary

… and, to be honest, probably because I’m just a natural-born splitter. I strongly suspect that the splitter/lumper divide has more to do with Myers/Briggs than actual data.

If it’s any consolation, I don’t believe in Niger-Congo, either. Even.
David Marjanović says

November 3, 2018 at 9:40 am

That isn’t binary either: Dimmendaal accepts a slightly reduced version of Niger-Congo, illustrated in the handy map on p. 8 of this paper.
David Eddyshaw says

November 3, 2018 at 11:38 am

Quite so; basically there is no good evidence that Mande belongs; and with Kordofanian and Atlantic there are strong typological similarities to the core groups but not a lot more.

Atlantic is so internally diverse that it’s not easy to demonstrate that its own subgroups are genetically related to each other, let alone farther afield.

I think it’s beyond reasonable doubt that “Volta-Congo” is a genetic group, though. Which is a pretty impressive family to be going on with. Tom Güldemann uses “Niger-Congo” to mean just that.
David Eddyshaw says

November 3, 2018 at 11:49 am

Dimmendaal is a counterexample to my psychological theory of lumper-splitterism, incidentally. He’s a lumper with Nilo-Saharan, about which there has been a lot of doubt in many quarters all along, and a splitter with Niger-Congo, which was generally accepted early on (although I get the impression that the consensus with regard to Niger-Congo is shifting away from Greenberg’s maximalist picture.)
David Marjanović says

November 3, 2018 at 3:36 pm

Atlantic is so internally diverse that it’s not easy to demonstrate that its own subgroups are genetically related to each other, let alone farther afield.

Interesting.
Y says

November 3, 2018 at 5:06 pm

Another lumper-splitter is Roger Blench, who as I recall is open to subsuming most of Niger-Congo into Nilo-Saharan, yet is very ready to declare certain languages isolates, in Africa and elsewhere. I am not in a position to have any ideas about any of his conclusions.
David Eddyshaw says

November 3, 2018 at 5:20 pm

It’s rarely appreciated by those who have no particular interest in the comparative linguistics of Africa just how few people have ever even attempted rigorous comparative work, let alone succeeded. This came as an actual shock to me as I got interested in Western Oti-Volta, which is a pretty close-knit group where the data are surely sufficient to do the thing properly.

The doyen of Gur comparative work was Gabriel Manessy, who achieved a great deal by sheer tenacity and industry in the face of strikingly inadequate primary sources. Nobody subsequently (in forty years) has attempted to revisit his work in any fundamental way; the time is certainly ripe for this, as the last few decades have seen very much more in the way of reliable and copious lexical and grammatical work on the relevant languages. Meantime his inevitably extremely provisional work gets enshrined in virtually every account of Gur in sources like Ethnologue. The subclassifications of Oti-Volta (to confine myself to what I really know about) in such sources are certainly erroneous, and Manessy’s very construct of Gur, which itself is not fully supported by the evidence, has become standard for want of anything better.

I could easily point to obvious errors in Manessy’s work both in general and in particular, but it would be fatuous and ungrateful: he was doing what he could, on the whole very well, with severely inadequate data. My point is that people without direct engagement with comparative work in Gur just don’t realise how shaky the foundations are for the sort of neat subclassifications you see in Ethnologue and the like and in published works.

I have no reason to suppose that Gur is an exception within Niger-Congo, apart from comparative Bantu, which has a lot of cruces of its own but in many places achieves a standard worthy of respect by any Indoeuropeanist or Algonquianist.
David Eddyshaw says

November 3, 2018 at 5:33 pm

@Y:

Blench strikes me as an überlumper, but that may very well tell you more about me than about him. He has a lot of data collection to his credit; and you can certainly believe him if even he says a language is an isolate.

I think Blench collected some of the first data on Bangime; there’s an actual Mouton Grammar Library grammar of that in the works. The speakers reckon they’re Dogon and speak a Dogon language; nobody else does.
Bathrobe says

November 3, 2018 at 7:24 pm

Sorry to interrupt a very interesting discussion, but a paper has been uploaded to Academia by Stefan Müller (Humboldt Universität Berlin Institut für deutsche Sprache und Linguistik) entitled Evaluating theories: Counting nodes and the question of constituency. Might be interesting for those who are fascinated by tree diagrams.
David Eddyshaw says

November 3, 2018 at 8:06 pm

The most disturbing thing about this paper is that it cites actual published work purporting to show that strings that cannot be topicalised (in English) cannot be constituents. The author’s rubbishing of this transcendentally stupid claim is entirely justified: but the horror is that grownup linguists (and referees, presumably) might be so unfamiliar with real linguistics that it ever made it into print. I can only hope he’s misrepresenting what the papers actually claim. That may be so, as the cited paper seems in reality to use several tests for constituency.

The article says in passing

Actually the issue of determining what a word is is not trivial at all

Amen to that, anyway.
Stu Clayton says

November 3, 2018 at 9:19 pm

Actually the issue of determining what a word is is not trivial at all

That’s because it’s more a matter of defining than of determining. Defining and redefining in the hope of making the notion pragmatically useful across languages. Sometimes it seems to be a particle, at other times a wave. Maybe it’s time to shift the ol’ paradigm.

For a while now I’ve been using the expression “expression” where before I used the expression “word”, and am satisfied with the results.
David Eddyshaw says

November 3, 2018 at 9:48 pm

No: it’s not just a question of definition. There are real issues involved. The problem is that there are several perfectly reasonable criteria for wordhood, but they don’t give the same answers, not only cross-linguistically, but within a single language. RMW Dixon, the apostle of Basic Linguistic Theory, sensibly distinguishes between phonological words and grammatical words, which is a good start.

French, which people have been playing with above, is a good example from a not particularly exotic language where various criteria for wordhood mismatch quite spectacularly.

/o/ = two words …
Stu Clayton says

November 3, 2018 at 10:22 pm

I didn’t say it’s “just” a question of definition. I said it’s more a matter of defining and redefining – on the fly, I should add. Of course there are real issues involved, otherwise the whole exercise would be pointless.

/o/ = two words …

In one misleading sense that’s true. The traditional alternative is to say “there are two French words, each of which is pronounced /o/”. That’s one problem solved.

But there are two other misconceptions still lurking here. One is that a sentence 1) can be partitioned into non-overlapping “words”, 2) each of which has a meaning by itself no matter what the context.

It is a simple, incontrovertible fact that you don’t know what a French speaker means if he just says “/o/” and nothing else. That’s because it’s the speaker who means, not the word. What he means takes shape as he speaks, against the background of what has gone before and comes after, and of what the listener takes it to mean.

It is significant that misunderstanding is a phenomenon rarely addressed systematically in this connection.

It takes two to lingo.
David Eddyshaw says

November 3, 2018 at 10:24 pm

/o/ = “to the”

Two grammatical words. So is au one word or two in French?
John Cowan says

November 3, 2018 at 10:28 pm

Blench strikes me as an überlumper

I tried to get him to believe in Penutian in a private email, pointing to Marie-Lucie’s work, but he declared himself a conservative in that respect.
Stu Clayton says

November 3, 2018 at 10:33 pm

Eau. Ô. Au. Each of these is an expression. So are “to”, “two”, “to the” and “to thee”.
David Eddyshaw says

November 3, 2018 at 10:36 pm

@JC:

Another strike against my psychological theory of lumpery. More of this and I might need to abandon it, but I’ve invested too much of my prestige and self-worth in the Lumpery Program to do that as yet. Save the Phenomena! Bring more epicycles!
Stu Clayton says

November 3, 2018 at 10:44 pm

Saving the Appearances: A Study in Idolatry, Barfield.
David Eddyshaw says

November 3, 2018 at 11:01 pm

I should perhaps explain that the Lumpery Program only specifies that human beings must express Lumper or Splitter tendencies; the Program leaves open to further research whether they can be expressed serially, simultaneously, or in a null sense in any individual case, depending on the settings of the Lumpery Parameter(s). There has been much misunderstanding of these points, some of it (I am sorry to say) deliberate.
David Marjanović says

November 4, 2018 at 7:28 am

Eau. Ô. Au.

I think the point was that au is à + le and can therefore be interpreted as a sequence of two grammatical words just like à la.

(And then the whole thing applies to aux again.)
Trond Engen says

November 4, 2018 at 8:05 am

And du, de la, des. One word or two, I’m all for writing àla and dela just for the sake of symmetry. Either way, there’s still the question of whether these are words at all, or particles or locative affixes, or all of the above.

When I called the language Ofrãse rather than Frãse, it was obviously to make the name look more exotic, but I also wanted a prefigated form to make the point. I now think Ãfrãse would have been better, but I preferred Ofrãse to highlight the merged prefix. Anyway, “exotic” languages regularly have multiple name forms in the literature depending on the analysis and understanding of the author or the tradition the author belongs to.
David Eddyshaw says

November 4, 2018 at 10:02 am

This seems to be a fair account of the state of play regarding language classification in Africa:

https://www.academia.edu/22854958/Africas_Linguistic_Diversity
languagehat says

November 4, 2018 at 10:54 am

That’s Bonny Sands, “Africa’s Linguistic Diversity,” Language and Linguistics Compass 3/2 (2009): 559–580. The abstract:

African language classification in the latter half of the 20th century has been dominated by Joseph Greenberg’s work classifying African languages into four linguistic genetic groupings: Afroasiatic, Niger-Kordofanian, Nilo-Saharan, and Khoisan. Current research indicates that there are a minimum of 20 unrelated African language families or isolates. Africa’s linguistic diversity will remain poorly documented unless significant efforts are made to document some of the continent’s most endangered, underdescribed languages and language families.
David Eddyshaw says

November 4, 2018 at 1:41 pm

This is Martin Haspelmath on the problems of deciding what a “word” is, particularly cross-linguistically:

https://www.eva.mpg.de/fileadmin/content_files/staff/haspelmt/pdf/WordSegmentation.pdf

His conclusions are perhaps not a million miles away from Stu’s position.
David Marjanović says

November 4, 2018 at 7:31 pm

Sands’s paper is also notable for two other things: first, this is the first time I’ve encountered an Oxford comma before et al. by anyone other than an American lawyer; second, the reference list appears not to have been proofread, giving us this delightful typo: Insights into Nilo-Saharan language, history and vulture.
David Eddyshaw says

November 4, 2018 at 7:46 pm

I am only familiar with Niger-Congo and Afroasiatic vultures myself. I may have seen the odd Nilo-Saharan vulture when I stayed in Niamey one time, but I don’t really remember. In any case, they would have been Songhay vultures, so it’s not certain they were Nilo-Saharan.
Bathrobe says

November 4, 2018 at 8:13 pm

According to the Leipzig Glossing Rules, aux chevaux should be represented as follows in interlinear gloss:

aux chevaux
to.ART.PL horse.PL
‘to the horses’
David Eddyshaw says

November 4, 2018 at 8:18 pm

Unless aux is a clitic.
Bathrobe says

November 4, 2018 at 9:04 pm

Haspelmath’s paper is an interesting summary of a long-running issue.

The orthographic aspect is fairly crucial. In fact, even the practice of separating text written in the Roman alphabet into words only started in Ireland something over a millennium ago.

Languages like Japanese, which is of an ‘agglutinating’ nature and does not put orthographic spaces between words, present fundamental difficulties for linguistic practice.

Japanese school grammar is based on the grammar of Hashimoto Shinkichi, a grammarian of the first half of the 20th century. Hashimoto regarded lexical words decomposable into two or more morphemes, e.g. aka-hige ‘red-beard’, as single linguistic units — words. On the other hand, he treated grammatical forms like the past tense marker (-ta), which we would normally regard as an inflectional ending or at most as a suffix, as separate ‘words’.

Out of accentual considerations, however, Hashimoto did recognise that such grammatical forms should be combined into larger units that he called bunsetsu. The bunsetsu generally consists of a lexical word followed by a number of affixes and clitics (or ‘words’ in his terminology).

Modern Japanese natural language processing has now ditched the bunsetsu on vague, notional grounds, with the result that ALL of these affixes and clitics are now treated as separate words in their analysis. So a form like ketta ‘kicked’ (consisting of the verb ker- + past tense -ta) is treated as two separate words for the purpose of natural language processing. It’s like treating ‘kicked’ as two separate words: ‘kick’ and ‘-ed’.

This makes Japanese almost impossible to treat cross-linguistically in the same way as typologically similar languages like Mongolian or Turkish, which have adopted modern orthographies that combine such affixes and clitics into larger word forms.

Japanese native-speaker intuitions are a poor guide. Left to themselves, naive Japanese native speakers writing Japanese in the Roman alphabet are inconsistent in segmenting such affixes and clitics. People seem to have an intuition that such forms can be attached to lexical words, and they will write them that way at times, but their usage is often wildly inconsistent, even within the practice of a single person.
J.W. Brewer says

November 4, 2018 at 11:51 pm

How do we arrange for David Eddyshaw to be installed as Regius Professor of Lumperology at one of the more ancient universities in Britain? I might be willing to cross the Atlantic to hear the inaugural lecture and attend the no-doubt lavish accompanying social events.
languagehat says

November 5, 2018 at 8:50 am

I second the motion!
David Marjanović says

November 5, 2018 at 10:02 am

Hear, hear!
AJP Crown says

November 5, 2018 at 10:02 am

I object on principle to regius professorships. Just thought I’d mention it. However, I recently discovered there are still regius professors being appointed at TCD even though I’d been led to believe that the Republic of Ireland was, you know, monarchically challenged nowadays.
languagehat says

November 5, 2018 at 10:15 am

Well, regius can have the metaphorical meaning of ‘magnificent,’ so let’s go with that.
Stu Clayton says

November 5, 2018 at 11:10 am

From egregius, meaning eminent.
AJP Crown says

November 5, 2018 at 11:14 am

Ok. Eddyshaw the Magnificent also has a ring to it.
AJP Crown says

November 5, 2018 at 11:20 am

Eddyshaw the Egregi(o)us might be open to misunderstanding.
John Cowan says

November 5, 2018 at 12:05 pm

After the Romans abolished their kings, they still kept the Rex Sacrorum to perform the rituals that formerly only the kings could do. This office had more prestige but far less power than Pontifex Maximus.
Stu Clayton says

November 5, 2018 at 12:12 pm

Not in lettered circles.
Lazar says

November 5, 2018 at 1:41 pm

The Anglosphere also seems to have long ago forgotten that a county is the domain ruled by a count (or earl).
David Eddyshaw says

November 5, 2018 at 2:22 pm

I am humbled by and grateful for these kind proposals, but I am content to remain known by the modest Eddyshaw hereditary title of Autokrator.
John Cowan says

November 5, 2018 at 2:46 pm

Avtokrator nowadays, surely?
J.W. Brewer says

November 5, 2018 at 2:48 pm

Perhaps autocracy is just extreme lumperism applied to the question of how political power should be optimally allocated?
David Eddyshaw says

November 5, 2018 at 3:05 pm

Avtokrator

While We Ourself naturally prefer the Old Ways, it pleases Us to permit the Demos to offer respect in accordance with their modern customs.
languagehat says

November 5, 2018 at 5:33 pm

It’s nice to have an autocrat who permits, even if he does not encourage, alterations in custom!
Trond Engen says

November 5, 2018 at 6:22 pm

Lumperism in its current minimalist form claims that whatever you have when there’s nothing more to split is a lump, and the existence of this lump is deeply interesting — and more so the smaller it is.
David Eddyshaw says

November 5, 2018 at 6:57 pm

@Trond:

Enough of this democritic propaganda.
J.W. Brewer says

November 5, 2018 at 7:08 pm

The lumper/splitter dichotomy transcended, according to the aged Simeon in the Temple (as transcribed by Auden):

Because in Him the Word is united to the Flesh without loss of perfection, Reason is redeemed from incestuous fixation on her own Logic, for the One and the Many are simultaneously revealed as real. So that we may no longer, with the Barbarians, deny the Unity, asserting that there are as many gods as there are creatures, nor, with the philosophers, deny the Multiplicity, asserting that God is One who has no need of friends and is indifferent to a World of Time and Quantity and Horror which He did not create, nor, with Israel, may we limit the co-inherence of the One and the Many to a special case, asserting that God is only concerned with and of concern to that People whom out of all that He created He has chosen for His own.

For the Truth is indeed One, without which is no salvation, but the possibilities of real knowledge are as many as are the creatures in the very real and most exciting universe that God creates with and for His love, and it is not Nature which is one public illusion, but we who have each our many private illusions about Nature.
bulbul says

November 6, 2018 at 2:37 pm

Fun continues:

Indeed, there is a considerable difference between establishing a generalization (for example, that the DOM follows from “the universal tendency for prominent objects to get special marking”, in Haspelmath’s words) and specifying the computational mechanisms underlying these facts, which is what Kalin tries to do. Obviously, these are not incompatible tasks; rather, they are complementary. Haspelmath assumes that the DOM is already explained, and that it is sufficiently understood, but such a conclusion is inadmissible from the point of view of those whose aim is to reveal the internal computational mechanisms that derive the grammatical forms that we observe. Allow me a comparison: the discovery that a physical feature of a living being (a wing, for example) developed through evolution from another (a leg) does not guarantee the understanding of what genetic and biochemical mechanisms made the process possible and determined the ontogenetic development of wings in a given organism.

DavidM, care to comment on the analogy?
Bathrobe says

November 6, 2018 at 6:54 pm

The bigger question is whether these so-called ‘internal computational mechanisms’ are actually equivalent to an ‘understanding of what genetic and biochemical mechanisms made the process [of the evolution of a wing from a leg] possible and determined the ontogenetic development of wings in a given organism’.
Bathrobe says

November 6, 2018 at 7:23 pm

Haspelmath’s reply at that post points to his 2004 paper “Does linguistic explanation presuppose linguistic description?”.

It appears to be a feature of modern linguistics that an immense amount of energy, mental and otherwise, must be expended struggling with/against/for the chimaera of Chomskyan formalism and Universal Grammar. Not everyone feels it is worth the effort.

We will only be put out of our misery when Chomskyanism is either proved to be incontrovertibly correct or decisively wrong. In the meantime, many are just going off doing their own thing rather than wasting their time in the fray.
David Eddyshaw says

November 6, 2018 at 7:57 pm

I don’t think Chomskyanism (especially in its Minimalist guise) as such is actually susceptible of incontrovertible proof or disproof in any normal scientific sense*; that does not in itself necessarily entail that it is valueless. It might be valuable, for example, (qua “research program”) in giving birth to many non-trivial individual theories which were either valid and fruitful in themselves, or fruitful in provoking definitive rebuttals which themselves helped us to understand Language better.

The degree to which this has actually happened is (as they say) an empirical question.

*I may be being unfair; if any passing Chomskyans can say what sort of (humanly possible) language data would constitute an unequivocal disproof of the validity of the total system I promise to eat my words. Though I suppose the crux of the matter would be “humanly possible”; it’s not obvious to me how to make this a viable criterion without ending up in circularity. (The sort of thing I wouldn’t accept would be “languages without constituency”, because I don’t believe that it’s even meaningful to call a system without constituency a “language”, whether spoken by humans or Martians. If UG means no more than that, it’s vacuous.)
David Eddyshaw says

November 6, 2018 at 8:24 pm

If, however (as Adger’s remarks implied) the Minimalist Program does in fact entail non-trivial consequences for how any individual language must be analysed, then what I would like to know is what sort of language data would show that such an analysis was mistaken. (An analysis that could be adapted without fundamental change to more or less any realistic data would not be a feature but a bug.)

The “list of achievements” in fact includes at least one which is false. If the “achievements” are indeed consequences of the theory, the theory has already been refuted and is a zombie. However, I would be perfectly happy to accept an admission in such cases that the supposed “achievement” was a mistake, and a promise to be more careful about making such claims in future.
Bathrobe says

November 6, 2018 at 8:32 pm

Constituency is a reality, no doubt.

Constituency at Wikipedia points to an article called Tests for constituents: What they really reveal about the nature of syntactic structure by Timothy Osborne. Osborne comes down on the side of dependency grammar.

I’ve always been struck by how Anglocentric discussions of constituency are. The problem is, when they try to extend their ‘tests’ to foreign languages, the whole exercise becomes pure torture.
Stu Clayton says

November 6, 2018 at 8:53 pm

giving birth to many non-trivial individual theories which were either valid and fruitful in themselves…

One example is the (1956) Chomsky containment “hierarchy” of languages generated by production rules. This became the basis for creating programming languages in a systematic way.

If there is a further example of a *fruitful* consequence of Chomsky’s mathematical stuff, I’ve never heard of it. Of course computational linguistics keeps a few people in jobs, but it’s a free country we live in.
David Marjanović says

November 6, 2018 at 9:03 pm

DavidM, care to comment on the analogy?

It looks fine, except that we know plenty of concrete things about development genetics but next to nothing about “the internal computational mechanisms that derive the grammatical forms that we observe”, so – as Bathrobe said – it’s not possible to tell if any such “mechanisms” the author might have in mind are similar to development genetics.
David Eddyshaw says

November 6, 2018 at 9:20 pm

It occurs to me that good and deserving paradigms like Structuralism are not refutable; they’re not theories, but methodologies.

If the Minimalist Program is meant to be something ontologically similar, and to have very little propositional content in itself, with no logically entailed corollaries like X-bar theory or whatever, then the question we need to be asking is a bit different.

It becomes analogous to testing a new drug. It’s not enough to show that a drug works. You have to show it works because of the specific properties of that drug. To do this, it must be tested against an existing drug and/or placebo.

So the questions to ask in the event of a valid discovery by a proponent of Minimalism are things like “in what respect is this insight specifically attributable to your orientation towards the Minimalist Program? Could it be expressed without significant loss in terms of a different paradigm? Would you have investigated this issue if you had not been inspired to do so by the Program?”
Brett says

November 6, 2018 at 9:45 pm

@bulbul: I have an anecdote that is relevant to that analogy. On the first day of the Animal Behavior class he taught, Alan Hein emphasized that any question of Why? in studying behaviors can be answered on numerous levels. If you want to know why a male redwing blackbird sings, it could be because of enlarged gonads leading to hormonal changes in his body; or it could be said that he sings to attract a mate; or it could be because a particular pattern of evolutionary selection events led to his lineage developing that particular song. These are all reasonable answers to the question, and if you really want to understand the behavior, you probably should consider the question from all these viewpoints.

Hein “illustrated” these points with images of birds taken from A Field Guide to Little-Known and Seldom-Seen Birds of North America. Unfortunately, none of the really funny birds (like the military warbler) are featured in the PDF sample at the publisher’s Web site.
bulbul says

November 7, 2018 at 4:45 am

DavidE,
good and deserving paradigms like Structuralism
I knew I liked you.

Stu,
Of course computational linguistics keeps a few people in jobs, but it’s a free country we live in.
Today’s computational linguistics is decidedly non-Chomskyan, i.e. almost entirely statistics-based.

DavidM,
it’s not possible to tell if any such “mechanisms” the author might have in mind are similar to development genetics
So then the analogy falls apart, doesn’t it? I don’t know enough (or, indeed, anything) about biology, but analogies like this strike me as horseshit of the stinkiest sort. Whether they compare their work to particle physics or to molecular biology, the next question that should be asked is “then what are your equivalents of quarks/molecules”?

Bathrobe,
depends on what you mean by ‘constituency’. Does it include the notion of a verbal phrase?
Speaking of:
I’ve always been struck by how Anglocentric discussions of constituency are.
Here’s a funny thing: we give generativists a lot of shit for being Anglo-centric and rightly so. But they’re not alone in this: prominent members of the Prague Linguistic Circle like Vachek and Mathesius also did most of their work on English and based a lot of their theories on it. The major difference, of course, is that they did not look only at English and so weren’t blinded by it, plus Mathesius did not build a cult of personality around himself the way Chomsky did. Consequently, the Prague School, as represented by Firbas and Sgall (who for a moment there got the Generativist bug, but recovered quickly), when faced with the facts of Czech, never really got sold on the concept of constituency.

Brett,
if you don’t understand the difference between the question “why does a male redwing blackbird sing” and “why doesn’t Czech have a fixed slot for an object when English does”, then I don’t really know what to say to you. Also, see above re hormonal changes (molecular biology). Alsoalso, “understand the behavior”? Careful there.
David Eddyshaw says

November 7, 2018 at 6:11 am

All I meant by “constituency” really (shouldn’t have used the term, as it has technical meanings of its own), was the presence of some sort of hierarchy in grammar, which seems to be what the UG people are on about with their “merge.” It seems clear that there are languages without verb phrases and even some without noun phrases, but I think you can still make good-faith arguments for some hierarchy in those, even if it’s pretty minimal (so to speak) and/or sublimated into morphology. All I would claim is that if a system doesn’t have any sort of structure apart from the lexicon it isn’t a language, so if Minimalism posits no more than that, it’s empty (though even that might not be a fatal criticism if it’s supposed to function as some sort of methodological heuristic rather than a theory. If it does so function.)
David Marjanović says

November 7, 2018 at 6:26 am

So then the analogy falls apart, doesn’t it? I don’t know enough (or, indeed, anything) about biology, but analogies like this strike me as horseshit of the stinkiest sort. Whether they compare their work to particle physics or to molecular biology, the next question that should be asked is “then what are your equivalents of quarks/molecules”?

Their answer is “we’re working on discovering them”. Of course nobody has any clue about which tree they’re barking up.

if you don’t understand the difference between the question “why does a male redwing blackbird sing” and “why doesn’t Czech have a fixed slot for an object when English does”, then I don’t really know what to say to you.

Which difference do you mean?

It seems clear that there are languages without verb phrases and even some without noun phrases, but I think you can still make good-faith arguments for some hierarchy in those, even if it’s pretty minimal (so to speak) and/or sublimated into morphology.

So, in Nunggubuyu it’s in the morphology?

even some without noun phrases

A few Salishan languages, IIRC, really seem to have no nouns distinct from verbs at all; even proper names can do all the things verbs do.

(Not to be confused with other languages in the vicinity, where nouns can do almost all the things verbs do.)
Y says

November 7, 2018 at 6:31 am

never really got sold on the concept of constituency
bulbul, what situation are you thinking of where the concept of constituency doesn’t work well in Czech?
David Eddyshaw says

November 7, 2018 at 8:01 am

So, in Nunggubuyu it’s in the morphology?

I have tried and failed miserably to absorb Jeffrey Heath’s grammar, but I believe that’s a reasonable statement.

It certainly seems to be so with Bininj Gunwok, of which there is a very accessible grammar by Nicholas Evans. It was that language I was thinking of in saying that languages may not invariably have NPs, what with all the head-marking and polysynthesis and all. Yimas seems to have NPs, but only of a very minimal sort with no recursive possibilities.

The Salishan noun/verb thing is a separate issue (Tagalog is another potential case.) Salishan languages certainly do have NPs. Or VPs. Whatever.
J Pystynen says

November 7, 2018 at 9:44 am

If the Minimalist Program is meant to be something ontologically similar,

Ontology is IMO the crucial problem of generativist theories. Haspelmath also talks about this. Are things like deep structure or movement actually real in any sense? If yes, in what exact sense? If no, why should we care about them? If to some extent no, but treated as yes, does this not set up a major GIGO problem for all analyses under this paradigm?

I don’t know gruel from porridge about syntax in particular, but I do know that generative phonology has a bad habit of reifying latent rubble left by historical changes as still synchronic but possibly “deeply ranked” transformations. My hunch is that this is exactly the case with generative syntax as well. It is already a bad sign if someone will talk your ears off about ontogeny but pays zero attention to phylogeny, a worse one yet is trying to reduce all whys to ontogeny, and an obvious case of going off the deep end when problems in this approach are resolved as happening due to the effects of phlogiston (“we conclude that a tailbone must be provided by Universal Anatomy”).
bulbul says

November 7, 2018 at 3:06 pm

Y,

bulbul, what situation are you thinking of where the concept of constituency doesn’t work well in Czech?
It’s not that I think the concept as such doesn’t work well, just some types of it don’t, especially those that involve the existence of a broadly defined VP (verb + object). When they come across, say, the OSV order, they scratch their head and mumble something about movements and dislocations.
David Marjanović says

November 7, 2018 at 8:51 pm

It is already a bad sign if someone will talk your ears off about ontogeny but pays zero attention to phylogeny, a worse one yet is trying to reduce all whys to ontogeny, and an obvious case of going off the deep end when problems in this approach are resolved as happening due to the effects of phlogiston (“we conclude that a tailbone must be provided by Universal Anatomy”).

This is actually a great argument for why long-range language phylogeny should be of high interest to linguists who don’t give a vertical gene transfer about whether Sumerian is closer to Basque, English or Santali. Are there features that most or all known natural languages share, which they share simply by common inheritance? At this point, we have no idea.
John Cowan says

November 8, 2018 at 6:09 pm

I may be being unfair; if any passing Chomskyans can say what sort of (humanly possible) language data would constitute an unequivocal disproof of the validity of the total system I promise to eat my words.

As I have said, the disproof must come from ethology or zoology; if we can find animals who communicate with recursive structures, Minimalism is definitively disproved, for it asserts that recursion is a purely human faculty. Of course no scientific theory can be unequivocally proved.

It becomes analogous to testing a new drug. It’s not enough to show that a drug works. You have to show it works because of the specific properties of that drug. To do this, it must be tested against an existing drug and/or placebo.

Actually, in the U.S. at least this is not so: a drug merely has to be safe and effective. The newer “non-drowsy” allergy drugs, on which several fortunes have been made, are actually less effective than the older anti-histamine drugs.

any question of Why? in studying behaviors can be answered on numerous levels

“[Mechanical] Cash registers don’t really compute totals, they just grind their gears. Actually, they don’t really grind their gears either, they just obey the laws of physics.” –unknown

pure logic dictates that there must have been an intermediate stage where the opposition of high and low tones turned into something else (level and oblique?) and then turned back again into a typical West African terracing system

Not necessarily. In one Austronesian language the semantic meanings of words have been switched: kampung means not ‘village’ but ‘forest’ (sorry, I don’t remember its name). This can’t be a natural sound change, it has to be a language game that got perpetuated because kids thought it was just the way things were.
David Marjanović says

November 8, 2018 at 6:48 pm

Phonological flip-flops happen, though. Tatar has one in its vowel system – and Kazakh is halfway through it. It’s actually two little swirls.
David Eddyshaw says

November 8, 2018 at 6:48 pm

@JC:

I don’t believe human cognitive capacity is up to simply switching a pair of entire tonemes (or any other pair of phonemes) with complete consistency across the entire language without any intermediate steps and keeping it up so consistently that the next generation simply adopts it wholesale.

The H/L flip, implausible though it looks, has happened elsewhere in Niger-Congo; I believe there are Bantu groups which have done this, but I can’t remember where I read that now.

I have oversimplified the situation in Oti-Volta a bit, too, which may very well bear on the supposed mystery. Many languages have three-tone systems, and even the two-tone systems have been conjectured to have a three-way contrast between H, L and toneless (wrongly, in my view, in the case of Western Oti-Volta.) So there’s quite a lot of potential scope for alternative analyses of how it all happened, at any rate.

Norbert now runs that blog

That’s the exact plot of Tinker, Tailor, Soldier, Spy!
Bathrobe says

November 8, 2018 at 7:00 pm

One paragraph in the Haspelmath article struck a chord:

But why should [generative categories] be used for simple description? It seems to me that there is an embarrassing misunderstanding here: Teaching generative syntax has become a routine phenomenon in many corners of linguistics, and more and more people who are simply interested in languages … got the impression that the notation was somehow established knowledge of the discipline of linguistics, maybe like the notation of chemical formulas has become an established part of chemistry. But of course, the proposals of current mainstream generative grammar are just one out of a very large number of possibilities (cf. McCawley’s (1982) book title “Thirty million theories of grammar”, which has lost nothing of its relevance).

The dreariness of generative linguistics is that it often tries to project categories that it has developed for English onto other languages. The Wikipedia article on Determiner phrase (DP) notes the following:

For West Greenlandic, which is a language without overt determiner elements, it is argued that this absence does not mean the absence of a DP layer. The determiner head is the locus of possessor agreement features because a functional head higher than possessor is needed for agreement. It is assumed that there is a PossP located above NumP, which is the structural layer whose specifier hosts the possessor.

….A head is also needed for valuation of definiteness. The determiner head in West Greenlandic also hosts an uninterpretable definiteness feature (uDEF[ ]), which is present on the highest projection of the noun phrase.

And for Serbo-Croatian:

The Serbo-Croatian language does not have articles, despite articles being extensively argued as occupying the head position of a DP across many languages. However it is argued that this language, despite not having articles, projects a DP on top of NPs in its argument positions.

It is very difficult to get excited about a theory whose practitioners will go to such lengths to posit analyses that bear no visible relationship to the languages they are analysing.

Attempts to export English constituent analyses are equally dreary. The primal Subject-Predicate split seems to be taken from traditional grammar and, I suspect, harks back to the ancient Greeks. Dependency theory dispenses with it but the NP-VP analysis is still regarded as gospel truth by the generativists.
Trond Engen says

November 8, 2018 at 7:00 pm

Scandinavian again. The multi-fracture faultline between East and West Scandinavian through Norway entails an isotone between Western high-pitch and Eastern low-pitch dialects. The latter must at some point have switched to marking word stress by going down.
David Eddyshaw says

November 8, 2018 at 7:05 pm

Yet more evidence for the Scandi-Congo hypothesis!

It’s clear that pairs of phonemes can switch. What I specifically disbelieve is that it can happen without intermediate steps.
David Eddyshaw says

November 8, 2018 at 7:10 pm

The dreariness of generative linguistics is that it often tries to project categories that it has developed for English onto other languages

This is what comes of abandoning the good old tradition of describing all languages in terms of Latin grammar. And in Latin. Young whippersnapper generativists!
Trond Engen says

November 8, 2018 at 7:15 pm

David E.: the Scando-Niger-Congo hypothesis!

I’m agnostic to that. There’s nothing in the data that can’t be explained equally well by a strong substrate, i.e. recent language replacement.
David Eddyshaw says

November 8, 2018 at 7:27 pm

Evidently this has a bearing on the matter:

https://en.m.wikipedia.org/wiki/Halfdan_the_Black
January First-of-May says

November 8, 2018 at 10:52 pm

It’s clear that pairs of phonemes can switch

As somewhat infamously featured in the confusion of alumni and alumnae (as discussed on Language Hat a while back, forgot where exactly) – the respective Latin endings had basically switched in the traditional English pronunciation.

(I’m sure there are better examples, but offhand I don’t recall any.)
David Marjanović says

November 9, 2018 at 9:09 am

Fun with flip-flops in the consonant system.
Rodger C says

November 9, 2018 at 9:49 am

The H/L flip, implausible though it looks, has happened elsewhere in Niger-Congo; I believe there are Bantu groups which have done this, but I can’t remember where I read that now.

I recall being told about an instance in Athapaskan.
David Marjanović says

November 9, 2018 at 10:13 am

That one’s probably fake; the situation is explained as Proto-Athapaskan being toneless and having lots of glottal stops (like Eyak) that became a high tone in most but a low tone in some languages. (But I wonder if it was already a rising tone, and level vs. rising became two level tones in two different ways.)

Meanwhile, “Concluding remarks” on French polysynthesis:

♣ S[poken] F[rench] has developed a full-fledged system of bound pronominal affixes on the verb, filling at least three argument positions: Subject, Primary/Direct object, and Secondary/Indirect object; these elements are grammaticalized enough both formally and functionally to be regarded as affixes and not clitics.

♣ SF extensively uses topic and antitopic constructions, putting full NPs outside the core of the clause; these NPs are cross-referenced by the bound pronouns, which occupy the argument positions in the core of the clause.

♣ However, there are instances when SF uses ‘classical’ SVO sentence structures with argument positions filled by full NPs; these constructions are used when the respective arguments are low in discourse salience and semantic/pragmatic prominence.

♣ Thus, SF may be regarded as a language where genuinely polysynthetic morphosyntax coincides with ‘standard’ ‘configurational’ morphosyntax, these two types of clause structure having different pragmatic functions and motivations.

♣ The situation found in SF is typologically non-unique and is observed in other languages as well; there is strong evidence for regarding SF as exhibiting a rather typical situation of a language undergoing diachronic change from ‘configurational’ to ‘polysynthetic’ morphosyntax.
David Eddyshaw says

November 9, 2018 at 10:45 am

I think Athapaskan is different, because the whole tonogenesis thing postdates the breakup of the group, so it’s not surprising it might go differently in different daughter languages. I think there are even Athapaskan languages where tone hasn’t developed.
Rodger C says

November 9, 2018 at 12:23 pm

S[poken] F[rench] has developed a full-fledged system of bound pronominal affixes on the verb … ; these elements are grammaticalized enough both formally and functionally to be regarded as affixes and not clitics.

IIRC closely analogous affixes in Coptic are written solid with the verb.
J Pystynen says

November 9, 2018 at 8:40 pm

Are there features that most or all known natural languages share, which they share simply by common inheritance?

Give it some decades, and any currently-rare feature could end up in this situation; e.g. non-paralinguistic bilabial clicks. I’m sure many earlier rarities have similarly gone extinct a while ago and today appear to be universals.

Though, for examples like these, I’m also sure some evolutionary linguist has already proposed that the clickiness of Khoisan languages would be actually genetic. I don’t think we know of any securely established instances of genetic population differences surfacing in language … but this cannot be a priori ruled out either; especially not if one assumes that humans possess more or less extensive Universal Grammar(s).
Trond Engen says

November 9, 2018 at 9:12 pm

David M.: That one’s probably fake; the situation is explained as Proto-Athapaskan being toneless and having lots of glottal stops (like Eyak) that became a high tone in most but a low tone in some languages.

Possible truth be possibly told, this may also have been the case in Scandinavian.

(But I wonder if it was already a rising tone, and level vs. rising became two level tones in two different ways.)

Not the “two level tones” part of this, though.
David Eddyshaw says

November 9, 2018 at 9:41 pm

But I wonder if it was already a rising tone, and level vs. rising became two level tones in two different ways

I’ve wondered about that for Oti-Volta. The Oti-Volta subgroups that invert the tones are not obviously closer to each other than those that don’t (I think you can make an argument that they do in fact form a primary division of the group, but given the primitive state of comparative work it’s fatally easy to start cherry-picking the evidence to make it fit.) If the Oti-Volta protolanguage actually had a level/oblique distinction rather than high/low it would help to explain how the tonal evidence for subgrouping doesn’t obviously match the rest by making the tonal developments more natural.

Against that is the fact that terracing systems with predominantly level tones are very much the prevailing type in the actual extant languages; but then, you don’t see many reflexes of the laryngeals in the extant Indoeuropean languages either, and if it weren’t for India, you wouldn’t see any voiced aspirates: there’s no rule that a protolanguage has to much resemble any of its descendants. And there’s the further analogy that unusual systems are likely to prove unstable, which would actually account for why they have given rise to several variant outcomes in the course of being “normalised.”

You’d still have to say that the non-inverting Oti-Volta languages have reverted to the original type of tone system after having passed through a period with a different type; but stranger things have happened.
John Cowan says

November 9, 2018 at 11:09 pm

My comment at Haspelmath’s post, with his reply. “It furthers one to see the great man.”
Bathrobe says

November 10, 2018 at 1:20 am

Um, I didn’t know what CP and IP were and therefore had no notion what you were talking about — until you gave examples.

Once I’d seen examples, “A CP-V2 language for the paper’s purposes is one that is V2 in main clauses only, whereas an IP-V2 language is V2 in all clauses” started to make sense.

But do we need such tortured terminology? Sure, “CP-V2” and “IP-V2” give you a monicker to distinguish the two types of verb positioning outside main clauses. Just as “phrasal verb” is a useful monicker for a particular type of construction in English, one that doesn’t actually make much sense when analysed closely. But monickers that presuppose such blinkered analyses of language are, in general, more a blight than a boon.

I find terms like “Vorfeld”, “Mittelfeld”, and “Nachfeld” (proposed before WW2) far more terminologically transparent than “CP-V2” and “IP-V2”, because they are accessible and don’t presuppose much familiarity with esoteric linguistics beyond “finite verb”.

And if you want to describe the distribution of adverbs in French and English by distinguishing between “verb to affix raising” and “verb to affix lowering”, well, good luck to you.

But perhaps I’m just a “Grand Old Fart” who can’t keep his head above the rising tide of Generative Grammar. The young’uns have no trouble at all, having been schooled from the start to see it as “standard”.
Bathrobe says

November 10, 2018 at 1:36 am

Incidentally, Dominik Lukeš’s post Why Chomsky Doesn’t Count as a Gifted Linguist is also worth reading.
David Marjanović says

November 10, 2018 at 7:55 am

Um, I didn’t know what CP and IP were and therefore had no notion what you were talking about — until you gave examples.

Same here.

Fun fact: Mittelfeld has become a technical term in football.
John Cowan says

November 10, 2018 at 6:11 pm

Once I’d seen examples, “A CP-V2 language for the paper’s purposes is one that is V2 in main clauses only, whereas an IP-V2 language is V2 in all clauses” started to make sense.

I don’t know what’s confusing about that, unless you don’t know what “V2” means, and once you realize that these are (local) definitions of the terms. Still, there is no doubt that I often do assume that I write or speak more clearly than I really do.
Bathrobe says

November 10, 2018 at 8:09 pm

No, I wasn’t familiar with V2 either, which made it harder. But you were clear enough. It was my unfamiliarity with the phenomenon and the stacked-up terminology that baffled me, especially the CP/IP bit.

As they say, a picture is worth a thousand words, and it was your examples that brought it into focus.
Bathrobe says

November 10, 2018 at 11:43 pm

I would like to add here that I find much linguistics literature (especially of a cross-linguistic variety) disconcerting because linguists have a habit of just quoting a single sentence (perhaps with interlinear gloss) in support of the theoretical point they are making. They will continue on as if that single sentence were all you needed to understand what is going on and how it fits into the overall scheme of the grammar. It’s like looking through a long black tube and being unable to see what is to the left or right, above or below. I personally feel a strong need to see quite a few examples so that I can get some kind of feel for the grammar of the language and how it expresses things.

I also find it difficult when linguists use technical descriptions that are hard to picture in one’s mind’s eye. Language is far easier to grasp as visually discernible patterns than as abstract statements about NPs, clauses, etc., or movements from lower nodes that are blocked by such-and-such a boundary. A lot of grammar is easier to comprehend in visual form than in the form of abstract descriptions.

Or perhaps I am just not temperamentally suited to abstractions that need to be expressed through the murky medium of language. 🙂
SFReader says

November 11, 2018 at 12:38 am

I think the problems you raise occur when linguists offer examples from languages which they don’t actually know – they just mined them from literature.

So, they can’t really expand on the semantics of the point they are trying to make, because they don’t know the language and are forced to rely on the example they faithfully copied.

When it’s the language they do know, in my experience, they usually keep throwing example sentences until the reader gets bored.
Brett says

November 11, 2018 at 3:10 pm

Scanning this thread, I was initially confused how the discussion had managed to turn to the “TCP/IP bit.”
Y says

November 12, 2018 at 9:15 pm

I think there are even Athapaskan languages where tone hasn’t developed.

Those would be the languages of the Pacific branch.
Sapir was convinced that Proto-Athabascan had tone, and figured that the Pacific ones had tone, but that he couldn’t hear it. He took on Li Fang-Kuei as a student and sent him to study Mattole, a language of northern California, assuming that as a native Chinese speaker he could hear tones better. Li didn’t hear tones, either, because they weren’t there (but he did produce a grammar of Mattole).
Bathrobe says

November 12, 2018 at 9:53 pm

Scanning this thread, I was initially confused how the discussion had managed to turn to the “TCP/IP bit.”

Sorry Brett, I should have made clear that I was referring to JC’s comment at the Haspelmath post.
John Cowan says

November 21, 2019 at 3:25 pm

Bulgarian alone has freed itself from this incubus.

Wherefore I was discussing Bulgarian with a Bulgarian on Quora who believes that its lack of noun case proves it is not Slavic. I offered him English, whose lack of noun case and massive borrowings from Latin and French do not prove it is not Germanic. Haven’t heard back … yet.
David Eddyshaw says

November 21, 2019 at 3:53 pm

I missed Bathrobe’s link to Lukeš last year; it is indeed worth reading (and I agree with the praise of MAK Halliday in particular.)
John Cowan says

December 8, 2019 at 1:23 pm

(The translation [of The Loom of Language] contains other substantive changes, like replacing the chapter about German with one about English.)

Good gravy. Who wrote that? Hofstadter’s reference to translating a history of France in French into German in such a way that it became a history of Germany (presumably starting from Charlemagne / Karl der Große and diverging) was supposed to be a joke. On the other hand, perhaps Bodmer did write it, presumably at the request of his German publishers.
David Marjanović says

December 8, 2019 at 4:17 pm

I think the translators wrote it.

The original chapter is specifically about German for English-speaking learners, so, in the 1950s, it made sense to replace it by a chapter about English for German-speaking learners. Nowadays it might just be dropped.
John Cowan says

March 10, 2020 at 4:27 pm

Haspelmath has just published (on Academia only, at least for now) the latest (February 2020) in his series of papers on 正名. This one is called “General linguistics must be based on universals (or nonconventional aspects of language)”. By “nonconventional” he means not “unusual” but “not based on the conventional part of language”, examples being slips of the tongue and learnability.

(Note that the order in which I mention things is not Haspelmath’s.)

Haspelmath gives us a bunch of Aristotelian dualities. At the top level is theoretical vs. applied linguistics; he is concerned only with the former. Then he revives the term structural linguistics (h/t Pesetsky) to distinguish that part of the subject from sociolinguistics, psycholinguistics, historical linguistics, et hoc genus omne.

He rejects the term empirical linguistics, saying that there are no non-empirical sciences, and those who do empirical work in linguistics also do theoretical work. I wonder if whoever coined that term was thinking more of empirique, which has a slightly different flavor from empirical. Even in English empiric means ‘quack’, though empirical treatment, meaning medical treatment without a current theoretical explanation, is perfectly respectable and indeed unavoidable.

Then the meat of the matter: general linguistics vs. (language-)particular linguistics (he thinks this may be a neologism in English, but in German the term Einzelsprachlinguistik is accepted). There is a historical section, in which he talks about the unintended collapse of general structural linguistics in Bloomfield’s day and its revival (yoked to a particular program, see below) in Chomsky’s. The “general linguistics paradox” surfaces: we want to know about Human Language (as he capitalizes it) in general, but all we can observe is particular languages in their diversity.

He says there are two basic solutions to the paradox (besides the non-conventional methods mentioned earlier), based on the same three steps of description, comparison, and explanation which all sciences make use of. The one he favors is the Boasian/Greenbergian program, in which the object to be described is the social rule system of a language, the comparison is of empirical universals, and the explanations are functional/adaptive.

In the other one, the “natural kinds” or Mendeleyevian/Bakerian program, the object to be described is a mental grammar, comparisons are of universals imposed by biocognitive constraints, and explanations are the “postulation of an innate grammar blueprint consisting of universal building blocks” (hence the connection with natural kinds and Mendeleyev). The practitioners of this often call it linguistic theory, another term he rejects as misleading and basically influenced by the increasing prestige of theory (the term) in linguistics since 1955.

The second program, he says, was mostly abandoned by Chomsky in 2005. Most papers in the generative tradition that are about particular languages now have a more or less clear social-rule description that any linguist can understand, followed by a rewriting of it into the arcane vocabulary of whatever flavor of universal building blocks is currently in fashion (or in a few cases not in fashion), without regard to how they are supposed to cohere. His example is a paper on Russian Nominalizations (capitalized by Haspelmath as a particular-language term) that contains a section of “analysis” written in Chomskyan jargon but not mentioning any language except Russian, with no real justification for its presence except tradition.

The paper ends optimistically:

Thus, whatever one’s hunches about the best path leading to deeper understanding of Human Language: All general linguists need a clearer methodology for language comparison. Despite many obvious similarities between languages, they appear to have different categories and features, and we need something extra to make the study of particular languages fruitful for general linguistics.

For linguists working in the generative tradition, this means figuring out which features and categories [and constraints] (and architectures) are innate and can be expected to occur in any language. For linguists working in the Boasian/Greenbergian tradition, this means being careful about their characterization of comparative concepts as uniform yardsticks for comparison.

If all goes well, the two approaches should eventually converge, i.e. evidence for innateness should converge with the empirical universals found through the non-aprioristic approach.

I’m not sure whether to end with “And so say we all” or “Isn’t it pretty to think so?”, so I’ll leave the Hattics to pick their own ending.
John Cowan says

June 1, 2022 at 3:10 pm

Whether Adger’s comments about his Kiowa work are execration-worthy depends on the degree to which he would in fact be prepared to discard hypotheses which are contradicted by the data.

Well, what can you expect from someone whose surname has acquired the meaning ‘to make a bonehead move with consequences that could have been foreseen with even slight mental effort’? (To be fair, this term alludes not to David Adger but to his possible relative Ellison Adger Williams III, all of whose names are surnames and who is described in the Jargon File as “an infamous tenured [i.e. “ten-yeared”] graduate student”.)

Lojban nails down this business of compounding, because whether something is a free or a bound form is right there in the vowels and consonants, with no suprasegmental subtleties. Mamta ‘mother’ is free, and mam ‘id.’ is bound, and that’s that. Every nounverb in the language has at least two bound forms, and some have as many as five. In final position the bound form may be identical to the free form, but can be distinguished from it because whatever precedes it is one or more unambiguously bound forms: an example is mamymamta ‘mother’s mother’, where -y- is inserted to prevent gemination, which is phonologically forbidden. This mamta is bound because *mamy cannot be a free form due to its, er, form.

So far so fairly ordinary. What makes it break the universal is that there are bound forms for both ti, ta, tu ‘this, that, yon’ and vi, va, vu ‘here, there, yonder’, so you can create a compound like viznau ‘yonder-man’ < vi nanmu ‘yonder man’. Formally this is exactly like tornau ‘short-man’ < tordu nanmu ‘short man’. Similarly, there are compounding forms for mi ‘1SG’ and do ‘2SG’, giving useful compounds like mibgu’e ‘my country’ < mi gugde and doigu’e ‘your country’ < do gugde. The compounds are not drop-in replacements for their non-compounded equivalents: la amerikas. mibgu’e ‘America is my country’ is grammatical, but *la amerikas. mi gugde is not; it has to be la amerikas. du lo mi gugde ‘America is-identical-with a (my country)’ or the like.
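The joining rule described above can be sketched in a few lines of Python. This is only a minimal illustration of the one rule stated in the comment (insert -y- where two identical letters would otherwise meet); it is not a full account of Lojban compound formation, and the function name `join_bound_forms` is made up for the example.

```python
def join_bound_forms(forms):
    """Concatenate forms, inserting 'y' wherever the last letter of what
    has been built so far equals the first letter of the next form.

    Assumption: this models only the anti-gemination rule mentioned in
    the comment, nothing else about Lojban morphology.
    """
    result = ""
    for form in forms:
        if result and form and result[-1] == form[0]:
            result += "y"  # break up the would-be geminate
        result += form
    return result


if __name__ == "__main__":
    # Examples taken from the comment itself.
    print(join_bound_forms(["mam", "mamta"]))
    print(join_bound_forms(["tor", "nau"]))
    print(join_bound_forms(["mib", "gu'e"]))
```

Only the first example needs the inserted -y-, since the other pairs do not bring two identical letters together.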
John Cowan says

April 12, 2023 at 4:17 pm

In one Austronesian language the semantic meanings of words have been switched: kampung means not ‘village’ but ‘forest’ (sorry, I don’t remember its name).

Iban, for the record. It’s a basal Malayic language. More examples here.

viznau ‘yonder-man’ < vi nanmu ‘yonder man’

That was a blunder: either the examples need to be vuznau and vu nanmu, or the glosses need to be ‘here-man’ and ‘nearby man’.
John Cowan says

November 9, 2023 at 6:35 am

The geographer is much too important to go loafing about.

I don’t know what the French original says, but loafing about is the last phrase I would apply to the activities of explorers. For the generative view of the activities of non-generativists, however, it appears to be spot-on.

I don’t know how the GG people fit it into their system, though Norvin may well do.

That nails it down: pro-predicate do on the Celtic Fringe!

We know of no rich agreement languages with tense lowering in the morphology.

While this is waffling, it’s still far better than “There are no …”.

Arabic, again, isn’t easy

Dorian’s sister-by-choice Vaneshka told me the same thing yesterday. What’s curious about this judgment is that (a) her native languages are English and Spanish; (b) her mother is a native speaker of Syrian Arabic; (c) she specifically compared it to Mandarin, which she says is (unlike Arabic) at least pronounceable.

(I lost track of what I wanted to link this to, but it’s from Frye’s “Polemical Introduction” aka “The Function of Criticism At the Present Time”.)

[…] the fallacy of what in history is called determinism, where a scholar with a special interest in geography or economics expresses that interest by the rhetorical device of putting his favorite study into a causal relationship with whatever interests him less. Such a method gives one the illusion of explaining one’s subject while studying it, thus wasting no time. It would be easy to compile a long list of such determinisms in criticism, all of them, whether Marxist, Thomist, liberal-humanist, neo-Classical, Freudian, Jungian, or existentialist, substituting a critical attitude for criticism, all proposing, not to find a conceptual framework for criticism within literature, but to attach criticism to one of a miscellany of frameworks outside it. The axioms and postulates of criticism, however, have to grow out of the art it deals with. The first thing the literary critic has to do is to read literature, to make an inductive survey of his own field and let his critical principles shape themselves solely out of his knowledge of that field. Critical principles cannot be taken over ready-made from theology, philosophy, politics, science, or any combination of these.

the first time I’ve encountered an Oxford comma before et al. by anyone other than an American lawyer;

Strike the last word. Anything written or at least published by Americans has them.
languagehat says

November 9, 2023 at 10:06 am

Yup. Anyone who uses the serial comma at all will use it before et al.
Trond Engen says

November 9, 2023 at 1:20 pm

Hey! I have enough headache trying to apply the serial comma already!
David Marjanović says

November 9, 2023 at 2:56 pm

Some scientific journals require the Oxford comma; I’ve never seen it before et al. in any scientific work. Apparently lawyers use it.

However, et al. is practically never used with more than one antecedent in scientific writing. So the issue actually never comes up. Apparently, however, lawyers use the comma before et al. even with one antecedent.
John Cowan says

November 9, 2023 at 3:39 pm

That’s because et al. is thought of as parenthetical. But quite apart from names, Americans write “a tall, florid, and overbearing man named Jaeckel” and “he bit into the apple, chewed up the bite, and spat out the seeds”, both of which involve serial commas in ordinary prose.
David Marjanović says

November 9, 2023 at 5:27 pm

That’s because et al. is thought of as parenthetical.

Then that’s a convention of American lawyers (or something like that) and has nothing to do with the serial comma.

Americans write “a tall, florid, and overbearing man named Jaeckel” and “he bit into the apple, chewed up the bite, and spat out the seeds”, both of which involve serial commas in ordinary prose.

Yes; these are serial commas, and I didn’t dispute that.
David Marjanović says

November 9, 2023 at 6:53 pm

…turns out I came across an example of a scientific paper with a comma before et al. five years ago in this thread. I promptly forgot it entirely because it has remained the only example.
Brett says

November 9, 2023 at 7:55 pm

Ironically, this morning I was proofing a long accepted manuscript for publication in a European-based journal. My coauthor wanted to add an additional citation, to a review article which had a whole bunch of authors. Thinking of this discussion, I was briefly frozen up over whether to put a comma before “et al.”; none of my previous publications in this journal had used the abbreviation. I looked at some other articles they had published, and it seemed they were not actually one hundred percent consistent. However, in my small sample, the comma was usually left out, so I omitted it.
David Marjanović says

November 10, 2023 at 2:59 pm

none of my previous publications in this journal had used the abbreviation

Are publications with three or more authors that rare in your field? Or does this journal spell three authors out and use et al. only from four onwards?

(It could also be like the Zoological Journal of the Linnean Society, which spells three authors out the first time and then reduces the last two to et al.…)
Brett says

November 11, 2023 at 9:30 am

@David Marjanović: I think their style is that a paper has to have at least six authors to get the “et al.” abbreviation. Moreover, I know one prominent group of journals that wants author lists of up to ten spelled out.
David Marjanović says

November 11, 2023 at 3:55 pm

Blows my mind, and would make many papers in my field longer by half.
Keith Ivey says

November 11, 2023 at 4:33 pm

I’m assuming these are journals that use numbers for the references, so Brett is only talking about how many names are used in the reference list at the end, not in parenthetical references in the text. But a paper can still be mentioned nonparenthetically in the text, and it’s hard to imagine “Cleese, Palin, Jones, Gilliam, Idle, Chapman, Lennon, McCartney, Harrison, and Starr⁴ found that….”
David Marjanović says

November 11, 2023 at 5:36 pm

Oh! Yes, in the references list there are a few journals in my field that want seventh-or-higher authors reduced to et al., too.
Brett says

November 11, 2023 at 10:31 pm

@Keith Ivey: Oh, I totally missed that David Marjanović was talking about inline citations.
Hans says

November 12, 2023 at 9:06 am

“Cleese, Palin, Jones, Gilliam, Idle, Chapman, Lennon, McCartney, Harrison, and Starr⁴ found that….”
… if you find yourself in times of trouble, always look on the bright side of life?
David Marjanović says

November 12, 2023 at 9:50 am

🙂
ktschwarz says

November 16, 2023 at 5:11 pm

John Cowan: “Anything written or at least published by Americans has them.” [serial commas]

If you believe that, evidently you never read any newspapers — or do you not count them as “published”? Non-serial-comma sentences are commonplace there. The first section of the Washington Post that came to hand on Sunday was the book section, with a lead article (on a biography of Willa Cather) whose opening sentence is: “In the early summer of 2022, I flew into Lincoln, Neb., picked up my rental car and drove into a Willa Cather novel.” On the same day, the front-page story on Nord Stream mentions that “… Western energy companies, including from Germany, France and the Netherlands, are partners and invested billions in the project.”

The same goes for the New York Times. The New York Times Manual of Style and Usage (2015) has plenty of rules on commas with titles, “Junior”, years, etc., but no mention of serial commas. The manual itself is full of sentences such as:

Headlines, charts and listings may freely use the abbreviations Bros., Co. and Corp. in the names of companies.
Ordinarily The Times credits the creator or the supplier, or both, for every published photo, map and illustration.
Lord. This British title applies to barons, viscounts, earls and marquesses.

Pullum has actually claimed that “nearly all American publishers” *don’t* require serial commas. No idea what that was based on.
languagehat says

November 16, 2023 at 5:16 pm

Any claim about what “all American publishers” do is false; the style guides are nearly as many as the stars in the sky.
Noetica says

November 16, 2023 at 5:31 pm

Hmm …

Anyway, I always want a serial comma wherever the question could arise. A virtually exceptionless default.
John Cowan says

November 16, 2023 at 5:31 pm

It’s true that as a special case U.S. newspapers don’t use Oxford commas. It is also true, not as a special case, that no U.S. publisher’s style guide requires mango to be spelled “phlgwghppt”.
