Efficient Languages.

December 7, 2018 by languagehat 103 Comments

I think we all know John McWhorter is not to be relied upon when he ventures away from his bailiwick of creole languages, which he is frequently called on to do since he has become the go-to linguistics popularizer, but he does have a pleasant prose style and it’s always fun to argue about his overgeneralizations and sometimes wacky obiter dicta (like the one about the Awful Russian Language). Anyway, herewith from the Atlantic (from 2016, but I appear to have missed it back then) The World’s Most Efficient Languages (“How much do you really need to say to put a sentence together?”):

Just as fish presumably don’t know they’re wet, many English speakers don’t know that the way their language works is just one of endless ways it could have come out. It’s easy to think that what one’s native language puts words to, and how, reflects the fundamentals of reality.

But languages are strikingly different in the level of detail they require a speaker to provide in order to put a sentence together. In English, for example, here’s a simple sentence that comes to my mind for rather specific reasons related to having small children: “The father said ‘Come here!’” This statement specifies that there is a father, that he conducted the action of speaking in the past, and that he indicated the child should approach him at the location “here.” What else would a language need to do?

Well, for a German speaker, more. In “Der Vater sagte ‘Komm her!’”, although it just seems like a variation on the English sentence, more is happening. “Der,” the word for “the,” is a choice among other possibilities: It’s the one used for masculine nouns only. If the sentence were about a mother, it would have to use the feminine die, or if about a girl, the neuter das (for reasons unnecessary to broach here!). The word for “said,” sagte, is marked with a suffix for the third-person singular; if it were “you said,” then it would be sagtest—in English, those forms don’t vary in the past tense. Then, her for “here” means “to here”: In German one must become what feels to an English speaker rather Shakespearean and say “hither” when that’s what is meant. “Here” in the sense of just sitting “here” is a different word, hier.

This German sentence, then, requires you to pay more attention to the genders of people and things, to whether it’s me, you, her, him, us, y’all, or them driving the action. It also requires specifying not just where someone is but whether that person is moving closer or farther away. German is, overall, busier than English, and yet Germans feel their way of putting things is as normal as English speakers feel their way is.

He goes on to consider Mandarin Chinese, Persian, Finnish, and the Maybrat language of New Guinea before winning my heart with a whole paragraph about one of my favorite languages:

If there were a prize for the busiest language, then a language like Kabardian, also known as Circassian and spoken in the Caucasus, would win. In the simple sentence “The men saw me,” the word for “saw” is sǝq’ayǝƛaaɣwǝaɣhaś (pronounced roughly “suck-a-LAGH-a-HESH”). This seems like a majestic monster of a word, and yet despite its air of “supercalifragilisticexpialidocious,” the word for “saw” is every bit as ordinary for Kabardian-speakers as English-speakers’ “saw” is for them. It’s just that Kabardian-speakers have to pack so much more into their version. In sǝq’ayǝƛaaɣwǝaɣhaś, other than the part meaning “see,” there is a bit that reiterates that it’s me who was seen, even though the sentence would include a separate word for “me” elsewhere. Then there are other bits that show that the seeing was most significant to “me” rather than to the men or anyone else; that the seeing was done by more than one person (despite the sentence spelling out elsewhere that it was plural “men” who did the seeing); that this event did not happen in the present; that on top of this, the event happened specifically in the past rather than the future; and finally a bit indicating that the speaker really means what he’s saying.

Go to the link for more languages and his explanation of what it all means; I’ll leave you with my 2007 post Greetings from Kabardia! (which still gives me a chuckle). Thanks, Jack!

Comments

Y says

December 8, 2018 at 12:07 am

sǝq’ayǝƛaaɣwǝaɣhaś (pronounced roughly “suck-a-LAGH-a-HESH”).

Excuse me. I am going to find a fork and stab something with it.
SFReader says

December 8, 2018 at 12:23 am

How about zyshchyfkhuezg’ezezhyfykhenuk’ym?

This Kabardian phrase means “I really can’t make myself meet you again there”

Have no idea how to pronounce it.
AntC says

December 8, 2018 at 12:43 am

when he ventures away from his bailiwick

I suppose I’m not one to sermonise, having no language-related bailiwick at all, but elementary information theory (on which I can claim some expertise) tells you that each signal must be distinguishable from any other possible signal that might be broadcast in that context (linguistic and situational). So taking one utterance in isolation is an entirely bogus exercise.

Languages with more inflection than English typically allow freer word order, where choice of word order carries significance. Then if English wants to express that significance, it needs ‘inefficient’ verbosity. E.g. with his sample sentence, supposing “the children” were already given in the discourse; what was said?:

They were told to “come here” by the father.

Ugh! We’ve had to supplant “said” by a different lexeme “told”, because “were said to” means something completely different; and ‘”come here” was said by the father’ is not idiomatic (why not?). We’ve had to introduce an auxiliary verb because English doesn’t do inflection for the passive. We’ve had to introduce a preposition “by” because English doesn’t do case-inflection.

Can somebody contribute the German equivalent? Is it “busier” than the English passive?

All languages exhibit redundancy if you consider sentences in isolation. What you gain on the swings in one construction, you lose on the roundabouts in another.

And there’s a counterpart for that in artificial (programming) languages: famously terse languages like TECO or Perl will try to make sense of almost any string of symbols (aka ‘line noise’) and do something/anything at your command — even if it means obliterating a decade’s work. Famously verbose languages like COBOL or SQL are so pernickety, it needs mind-numbing precision to get anything done. Somewhere in the middle, programming languages aim to balance terseness against productivity against unambiguity.
Y says

December 8, 2018 at 12:50 am

Huichol tsimekamikakatenixetsihanuyutitsɨɨkɨ́riyekukuyatsitɨiriexiaximekaitsietɨkaku ‘even if they were about to make their dogs fight on the way up’—that probably sounds nice, too.
Stu Clayton says

December 8, 2018 at 12:59 am

This German sentence, then, requires you to pay more attention to the genders of people and things, to whether it’s me, you, her, him, us, y’all, or them driving the action. It also requires specifying not just where someone is but whether that person is moving closer or farther away. German is, overall, busier than English …

Ignorant drivel. The little match girl peering through the glass at the Babeling guests. Fantasies of a monoglot.
Stu Clayton says

December 8, 2018 at 1:05 am

AntC: Can somebody contribute the German equivalent? Is it “busier” than the English passive?

The equivalent of what, exactly ?
AntC says

December 8, 2018 at 2:14 am

@Stu, I meant of the passive English sentence I gave, with “told”. GT sez

Ihnen wurde gesagt, sie sollen vom Vater „hierher kommen“.

At least there’s the same lexeme “sagt”. I see a fair amount of ‘busy’ inflection going on. But is GT getting it right? Translating it back to English, ‘vom’ becomes ‘from’. ‘”come here” from the father’ (apart from being non-idiomatic) means something else altogether.

Redundancy/’busyness’ is effective for resolving/detecting ambiguity, ‘innit?
SFReader says

December 8, 2018 at 2:30 am

is GT getting it right?

No. Maybe something like “Ihnen gesagt wurde vom Vater, dass sie “hierher kommen” sollten” would work.
Stu Clayton says

December 8, 2018 at 4:36 am

SFReader: Yes, almost. “Ihnen wurde vom Vater gesagt, sie sollen herkommen” or “… daß sie herkommen sollen”. Sollten in this situation would be idiosyncratic, because it is used to express “general obligation” in a polite way. Most of the ten commandments start with Du sollst …. They tell you what to do and not to do in no uncertain terms. They were not written by Emily Post.

The difference is not clear-cut, and it’s as much about contextual speech habits as about semantics. But sollten here would sound weird. The father does not say to the child: “you should come over here”. Here’s how a real exchange would go:

F: Komm her, Hans.
C: Nein, ich hasse dich!
F: Komm doch her, Hans. Ich habe keine Zeit für so was.
C: Nein, ich hasse dich für immer!
F: Hans, du sollst herkommen! [not solltest]

The last sentence is well rendered by “Hans, come over here right now!”.
David Eddyshaw says

December 8, 2018 at 7:34 am

I think I may need to start a John McWhorter Defence League … at least he’s not a … Chomskyite.

A lot of the trouble with his article, though, is that it conflates two rather different things, and that his German example is a bad specimen of what I presume his real point is.

Compulsory agreement, grammatical gender etc add pretty much zilch to actual meaning. They’re grammatical filigree (or barnacles, if you are of the Loom of Language school.) If someone declared all German nouns not referring to people to be neuter overnight, the communicative function of German would not be impaired a bit.

However, languages do vary in how much actual information you have to state explicitly. Languages with full-dress evidentials often require that every statement is marked for evidentiality, for example.

In this familiar passage Edward Sapir himself conflates the two, but he does give examples of both kinds of complexity:

In German and in French we are compelled to assign “stone” to a gender category—perhaps the Freudians can tell us why this object is masculine in the one language, feminine in the other; in Chippewa we cannot express ourselves without bringing in the apparently irrelevant fact that a stone is an inanimate object. If we find gender beside the point, the Russians may wonder why we consider it necessary to specify in every case whether a stone, or any other object for that matter, is conceived in a definite or an indefinite manner, why the difference between “the stone” and “a stone” matters. “Stone falls” is good enough for Lenin, as it was good enough for Cicero. And if we find barbarous the neglect of the distinction as to definiteness, the Kwakiutl Indian of British Columbia may sympathize with us but wonder why we do not go a step further and indicate in some way whether the stone is visible or invisible to the speaker at the moment of speaking and whether it is nearest to the speaker, the person addressed, or some third party. “That would no doubt sound fine in Kwakiutl, but we are too busy!” And yet we insist on expressing the singularity of the falling object, where the Kwakiutl Indian, differing from the Chippewa, can generalize and make a statement which would apply equally well to one or several stones. Moreover, he need not specify the time of the fall. The Chinese get on with a minimum of explicit formal statement and content themselves with a frugal “stone fall.”
David Marjanović says

December 8, 2018 at 7:47 am

“Ihnen wurde vom Vater gesagt, sie sollen herkommen” or “… daß sie herkommen sollen”. Sollten in this situation would be idiosyncratic, because it is used to express “general obligation” in a polite way.

…Unless the whole thing is in the past and they’re not still supposed to come here. In that case, sollten is the only option…

…unless you avoid the problem of homophony between passé simple and subjunctive by using the passé composé instead. But that would be very odd in written narration.
David Marjanović says

December 8, 2018 at 8:04 am

Ihnen wurde gesagt, sie sollen vom Vater „hierher kommen“.

If we drop the scare quotes, that would indeed mean “they were told to come here from the father”, i.e. they’re with him already, and someone else told them to come to wherever the narrator is.
Stu Clayton says

December 8, 2018 at 8:18 am

Unless the whole thing is in the past …

I decided to leave that bit out for now, since it would only throw more wood on the fires of those who think in terms of “being compelled to assign things to categories”, knowing no other way to think, lacking competence in the language.

It’s the old concert confrontation: the orchestra players up front, the punters sitting back in the dark, armed with the score and a six-pack of opinion.
David Eddyshaw says

December 8, 2018 at 10:27 am

I believe Edward Sapir was in fact quite competent in German.

I see that McWhorter has a BA in French. It may not be correct to categorise him as a fantasising monoglot.
languagehat says

December 8, 2018 at 10:54 am

It may not be correct to categorise him as a fantasising monoglot.

Of course not, but it’s fun. If he doesn’t want to be slandered, he can start writing more sensibly.
Stu Clayton says

December 8, 2018 at 10:56 am

Then the two men are bilinguals with dubious analytic and explanatory competence. Or they are playing to the monoglot crowd.

# In German and in French we are compelled to assign “stone” to a gender category #

Only a non-speaker would see “compulsion” here. There is no choice, and thus no possibility of compulsion. The average German doesn’t know what “the gender of a word” is, and doesn’t need to.

It takes some circumlocution to get a question about that answered. Das Wort Stroh – sagt man der, die oder das Stroh? And then the person being asked has to think for a second – because he is not used to such a question, and is trying to figure out what you’re getting at. The question is completely weird for him. It’s like getting someone to look at their blind spot.
Stu Clayton says

December 8, 2018 at 11:22 am

Either you say “Das Stroh” and nobody notices, or you say “Der Stroh” and get corrected – unless you’re a touchy foreigner, or everyone else is too polite to correct you. Maybe you don’t care, and brush it off as a barnacle.
John Cowan says

December 8, 2018 at 12:11 pm

die oder das Stroh?

I should think that this question would work better as die Stroh oder das Stroh?, which puts the article into a more familiar frame (before the noun) in both cases. Then it is easy to hear that die Stroh is just wrong.
Stu Clayton says

December 8, 2018 at 1:18 pm

Not a bad idea. But two of three are wrong. Why do you leave out der ? The longer the word, the wordier the question gets.

Sure, the wrong combinations are heard as wrong, but the person being asked doesn’t immediately understand what you’re getting at. He doesn’t think in terms of multiple choice from complete ignorance. That mindset is an obstacle épistémologique for the monoglot questioner.

This question template was the best I came up with in my first 10 years. It worked. Others may find systems they like better in practice. But if they do not use them to acquire competence, it’s all mooty tooty.
elessorn says

December 8, 2018 at 1:19 pm

I find myself in complete caustic agreement with Stu here, but there might be this exculpatory out: that he interacts with foreign languages more as material to be effectively taught (or taught about) than as tools to be effectively used, and so this perspective may be for him a natural one.

When it comes to *teaching* a foreign language–or to be more precise, leading a student to more accurate L2 production–a strong focus on the target language’s unique “distinction habits” is quite useful. At least in the learning of languages, I do find that the most incessantly important questions are all variations, not of “what does X mean?” but rather of “how are X and Y different?” (or sometimes, to tease out significant areas of non-overlap, “when are X and Y different?”). (Not that these questions can usually be answered by simply asking so straightforwardly…)

It’s not, in other words, a language feature he’s wrong to find crucial, though this is very much wrong-headed about how language feels to its competent users.

And at least he’s not a Chomskyite. No one would accuse him of not caring about foreign languages.
Stu Clayton says

December 8, 2018 at 1:42 pm

# At least in the learning of languages, I do find that the most incessantly important questions are all variations, not of “what does X mean?” but rather of “how are X and Y different?” (or sometimes, to tease out significant areas of non-overlap, “when are X and Y different?”). #

Very good point ! ANYTHING to get away from this notion of meaning focused on single words. As if each word were a door in an Advent calendar – you open it and find a chocolate signifié.

# a strong focus on the target language’s unique “distinction habits” is quite useful #

As that nice man F. Saussure was fond of pointing out.
Stu Clayton says

December 8, 2018 at 2:06 pm

# „nullité du sème en soi“ #

It is music to my ears. Terry Riley, In C.
Christian Weisgerber says

December 8, 2018 at 3:35 pm

They were told to “come here” by the father.

An idiomatic way to phrase this in German would be:
Sie wurden vom Vater herbeigerufen.

That may not be helpful for comparison purposes.

And yes, obviously different languages grammaticalize different things. Many years ago there was a contributor from a Chinese background on sci.lang who railed against the supposed “neutralness” of Esperanto because it is full of weird Indo-Europeanisms like verb tenses and an obligatory singular/plural distinction.

English speakers may struggle with grammatical gender in German; German speakers meanwhile—despite overall very similar looking verb systems—have a hard time with the fine-grained tense-aspect distinctions in English: the distinction between the simple and continuous form is just baffling, and the present perfect looks the same but is used quite differently and in very specific ways that don’t correspond to a single tense in German.
Stu Clayton says

December 8, 2018 at 4:00 pm

That may not be helpful for comparison purposes.

Im Gegenteil. It helps to show that sollen in the example is not needed. It’s a feature of how what the father said is reported, and not necessarily a word used by him. Sie wurden vom Vater herbeigerufen says it all, without invoking the ten commandments.
Stu Clayton says

December 8, 2018 at 4:22 pm

German speakers meanwhile—despite overall very similar looking verb systems—have a hard time with the fine-grained tense-aspect distinctions in English: the distinction between the simple and continuous form is just baffling, and the present perfect looks the same but is used quite differently and in very specific ways that don’t correspond to a single tense in German.

I avoid thinking much in such terms, because I get totally confused and uncertain. I produce correct and idiomatic German and English, that’s what counts for me. Getting there was hell on crutches – until I realized I needed to cast those grammarian crutches aside. Nun gickse ich überall mit (Doktor Faustus).
Etienne says

December 8, 2018 at 4:47 pm

I think Elessorn has hit the nail on the head (“he interacts with foreign languages more as material to be effectively taught (or taught about) than as tools to be effectively used, and so this perspective may be for him a natural one”). I suspect McWhorter would be more aware of the complexities of English had he spent part of his career teaching L2 English in a non-anglophone country.

Indeed, having myself taught English to francophones and French to anglophones, my own sense/impression (it’s nothing more than that!) is that of the two, English is in fact the more difficult language.

Also: Stu and Elessorn may both find the work of the Polish semanticist Anna Wierzbicka to their taste: one of her core concerns relates to offering precise definitions of terms (in any language!) using a ‘hard core’ of what she calls semantic primes or semantic primitives (basically, a set of semantic notions that are genuinely universal: as she or a close associate once wrote somewhere, one can think of these as the atoms of human lexicography, with language-specific words being comparable to complex molecules. And it is precisely because different languages have different semantic molecules in their respective lexicons that there is no one-to-one match between most words in different languages).
Yuval says

December 8, 2018 at 5:48 pm

If someone declared all German nouns not referring to people to be neuter overnight, the communicative function of German would not be impaired a bit.

That’s quite a violent throwing away of all pronouns and coreference functions in a language, methinks.
Stu Clayton says

December 8, 2018 at 5:48 pm

Thanks, Etienne. But the very idea of semantic primitives and molecular lexemes makes me tired. At my age, one is often assailed by the feeling that one has seen it all before. I wish her well, though. Now I think I’ll go back to my spaceship for some shut-eye.
David Eddyshaw says

December 8, 2018 at 6:55 pm

a violent throwing away of all pronouns

No pronouns are harmed in this scenario.
Moa says

December 8, 2018 at 7:41 pm

Actually, there would be no “zhèlǐ” in the Mandarin Chinese sentence, since lái (come) already connotes “towards the speaker”. You could add it, I suppose, but it’s superfluous.

I’m not that sure that throwing away the masculine and the feminine is that great a solution. We did this in Swedish, and it cut down on grammatical genders from three to two; now Swedish has two genders that are kind of neutral genders. I can, however, support the idea that the pronouns won’t be harmed. Swedish has he, she, they (sing), it and it: han, hon, hen, den and det. Luckily for us (?)
Y says

December 8, 2018 at 8:21 pm

zyshchyfkhuezg’ezezhyfykhenuk’ym
SFReader, are you transliterating from Cyrillic? Because the same Cyrillic characters represent different sounds in Kabardian and Russian. <щ> is Kabardian /ɕ/, <у> is /w/, etc.
David Marjanović says

December 8, 2018 at 9:17 pm

Indeed, having myself taught English to francophones and French to anglophones, my own sense/impression (it’s nothing more than that!) is that of the two, English is in fact the more difficult language.

That makes sense to me in that:
– while French has the imperfect and the subjunctive*, English has that full-blown aspect system;
– as far as the French sound system is from the global (and even the European) average, the English one is farther away.

* Of course that one is nothing like a tense or an aspect, but it’s another verb form with a usage that is far from obvious if you don’t already know a (western?) Romance language.

Actually, there would be no “zhèlǐ” in the Mandarin Chinese sentence, since lái (come) already connotes “towards the speaker”. You could add it, I suppose, but it’s superfluous.

It’s actually very much like Cpt. Jean-Louc Picâde consistently saying “Come!” instead of “Come in!”.
SFReader says

December 8, 2018 at 11:01 pm

SFReader, are you transliterating from Cyrillic?

OK, transliterated into IPA. Hope it’s easier to pronounce for you

zəɕəfxwazʁazaʒəfəxanwqəm
Bill W. says

December 9, 2018 at 1:29 am

“If there were a prize for the busiest language, then a language like Kabardian, also known as Circassian and spoken in the Caucasus, would win. In the simple sentence “The men saw me,” the word for “saw” is sǝq’ayǝƛaaɣwǝaɣhaś (pronounced roughly “suck-a-LAGH-a-HESH”). ”

If I’m not mistaken (and I usually am), Navaho and other Athabaskan languages have very complex verb structures that require a lot of information to be packed into a single verb form. Also, I believe the Yeniseian Ket language has a similar verb structure, and it was primarily the similarity between the Navaho/Athabaskan and Ket verb templates that led Edward Vajda to suggest there might be a relationship between the two geographically remote language families.

What has happened to his hypothesis? I haven’t seen anything about it recently, but when he first presented it, a number of linguists thought it was at least plausible (though not, of course, Lyle Campbell).
SFReader says

December 9, 2018 at 1:56 am

Went to the Wiki article on Dene-Yeniseian languages and discovered this wonderful example of academic snark:

More recently, a small minority of non-specialists have claimed to be able to link together various language families and language isolates with prefixing verb structures, including (in addition to Yeniseian and Na-Dené) the Northwest Caucasian (Abkhaz-Adygh) and Northeast Caucasian (Nakh-Dagestanian) language families and the Sumerian and Burushaski language isolates—grouped into a widely discredited long-range “Dené-Caucasian family”. So far, this theory has attracted almost no scholarly attention, although it is regularly published in “Mother Tongue” and other pseudoscience-friendly publications.

I wonder who wrote that
Keith Ivey says

December 9, 2018 at 2:16 am

Apparently someone who didn’t want to be connected with it. The edit was made last month by someone who wasn’t logged in.
John Cowan says

December 9, 2018 at 4:03 am

The author was not logged in, and has only made three other comments, all of them denouncing Dene-Caucasian as “a total crock”.
SFReader says

December 9, 2018 at 4:39 am

And according to the geolocation tool provided by Wikipedia he lives somewhere on Joe Collins Rd, Monroe, North Carolina.
Stu Clayton says

December 9, 2018 at 5:03 am

On that day he did. IP addresses are frequently reassigned, you know. Now, according to iplocation dot net, he (or she!) has moved to Jacksonville and Charlotte. Perhaps the culprit snarks from a mobile home. Trailer trash !
Elessorn says

December 9, 2018 at 9:06 am

I find myself suspicious of (the intended import of) claims like “whole sentences in a single word!”

Of course it’s true in a literal sense that more synthetic languages will end up with segments of speech that both (1) are indivisible (i.e. “single words”) and (2) code enough information into those inflections that word-sentences occur at non-trivial frequencies.

But that a language so hyper-inflected is actually doing something amazing thereby strikes me as…wrong. Just because most of the semantic load-bearing is done by elements that are also independent words, doesn’t make

ðeniwunnovkeppchraind’cawyu

simple, and it shouldn’t seem simpler if written “then he wouldn’t have kept trying to call you.”

There’s probably some objective reality to some languages being easier to learn than others for non-natives, though learner’s mother-tongue starting-points are a confounding factor. But surely this is no measure of a language’s inherent cognitive complexity, any more than Chinese is “simpler” than English because verbal utterances there are even more analytic.

I bet the (initial) mental-processing burden of Kabardian does outstrip that of, say, Indonesian, but this seems to me mostly a mere memory game. A correct sǝq’ayǝƛaaɣwǝaɣhaś does require accurate, simultaneous, choreographed and successful access to more units of memory than the chicken-eating ayam makan McWhorter cites in the article, but this type of recall burden should be trivial for natives.

(I also tend to doubt that a frequency study would actually show most Kabardian sentences to typically be so complicated. Wikipedia lists an impressive list of verbal prefixes:

causative, comitative, reciprocal, reflexive, destination, directional, involuntative, direction of motion, against, benefactive, bypass, through, across, after

But I wonder how pragmatically often multiple prefixes would be naturally motivated to explicit expression simultaneously. My gut feeling is…not too often.)

Either way, I agree with Etienne that “McWhorter would be more aware of the complexities of English had he spent part of his career teaching L2 English in a non-anglophone country.” Let him try to teach something terrifyingly unremarkable like “I wouldn’t’ve had to stand out here waiting for you if you’d managed to remember to call, you know?” and I bet he’ll be singing a different tune.
David Marjanović says

December 9, 2018 at 10:31 am

Regular reminder of this paper on comparative Dené-Caucasian grammar. It’s not just the existence of long prefix chains (also found, to mention an example of obvious geographic relevance, in Ainu), it’s the existence of prefixes with specific forms and specific meanings in specific places in the template.
SFReader says

December 9, 2018 at 10:57 am

In practice, this means that Kabardian has an astonishingly large number of words which can only be translated into English as almost full sentences.

E.g., the Kabardian dictionary lists separate words with meanings such as “keep someone sitting next to oneself”, “fix something on vertical surface sideways”, “come in running one after another”, “ask someone to be the best man at a wedding”, “turn out to be too low for something” and so on and on.

Bear in mind that these are dictionary forms, so they can very easily be modified even further and would still remain one word, e.g., “suddenly drop the pretense of being a sympathetic person once again” (a hypothetical example – I don’t know if it’s a real word found in real conversations, but it can be formed grammatically from the dictionary word meaning “pretend to be a sympathetic person”).
Y says

December 9, 2018 at 3:19 pm

Regular reminder of this paper on comparative Dené-Caucasian grammar.

Every place I look at in this paper shows unsupported arguments and unintended counter-arguments. It would be too laborious to try to pick anything convincing out of this.
Julian says

December 9, 2018 at 8:35 pm

I’m sceptical. So some languages have complicated noun inflections or complicated agglutinative words. English has a complicated system of verb tense. What do a bunch of isolated factoids like these prove, really?

You would expect the information density of languages to evolve towards the fittest compromise between speakers’ desire to minimise effort and listeners’ need to understand. If it took a minute to say ‘Watch out!’ you might be gored by the angry mammoth before I had finished saying it. If ‘Watch out!’ was conveyed by a single phoneme (which would necessarily have many other meanings as well) you might be gored by the angry mammoth while you were trying to guess what I meant. There will be a sweet spot between unnecessarily verbose and confusingly brief.

Wouldn’t you expect all human groups to be subject to similar forces in this regard? So it’s no surprise that all languages have similar information density. Correct me if I’m wrong, but if you translate a book using the same alphabet and font size, the result might be 10 per cent longer or shorter than the original, but it’s not going to be 50 per cent longer or shorter. And part of the difference may just reflect features of the writing system, such as word division. We should be comparing the number of phonemes, not the length on paper.
John Cowan says

December 9, 2018 at 9:06 pm

As a simpler example, people who speak path-centric languages (e.g. the Romance languages) can only shake their heads when confronted with English motion sentences like He ran down into the basement. There is, not surprisingly, no verb in Spanish or French for moving downward and inward at the same time: such a thing would be a lusus naturae. English started out as a typically manner-centric Germanic language, but then picked up a whole bunch of path verbs like enter, exit, ascend, descend, arrive, depart from French and Latin.

I think “manner-centric” and “path-centric” are a lot clearer than the usual terms “satellite-framed” and “verb-framed”, which beg the question “What is it that’s framed, the manner or the path?” There are languages that are subject- or object-centric too; I remember a Native American example (but not which language) where you say “He animaled around on a horse.” I forget who I picked up the alternative terminology from.
Giacomo Ponzetto says

December 9, 2018 at 11:08 pm

John Cowan, I genuinely don’t know which Romance languages are path-centric, but I feel sure that corse giù in cantina is perfectly idiomatic in Italian, though I appreciate you might also say scese di corsa in cantina. Then again, you might also say si precipitò in cantina, where I subjectively perceive the verb as primarily manner-centric but the dictionary definition makes it equally path-centric: “getting down or off with great speed.”
Bill W. says

December 10, 2018 at 10:56 am

“Regular reminder of this paper on comparative Dené-Caucasian grammar.” I wasn’t asking about the Dené-Caucasian hypothesis, which is to put it mildly far-fetched, but rather about the Dené-Yeniseian hypothesis, which seemed to pique some interest among serious linguists a number of years ago. And yes, I’m aware that what is most interesting and perhaps persuasive about the possible Dené-Yeniseian connection is the matching of specific slots in the complex verb templates between the two families.
Jim says

December 10, 2018 at 12:43 pm

David,

“It’s actually very much like Cpt. Jean-Louc Picâde consistently saying “Come!” instead of “Come in!”.’

It works in Mandarin because “lai2” means to come to the speaker’s location and can only mean that, so it doesn’t need further specification, where “come” can mean going to the point of interest, which can be anywhere and any time.

However you generally (always?) have to supply the “ni3” (you) i.e “Ni3 lai2” because a bare verb is presumed to have a third person topic or subject.
Bathrobe says

December 13, 2018 at 7:40 am

Moa is correct. There is no need for zhèlǐ (or zhèr) in the Chinese. Where does McWhorter get his examples?

I’ve seen Riau ayam makan around. It appears to be a stock example of how shorn of linguistic structures languages can get. In the source I saw, Riau was being cited for its similarity to Classical Chinese.
David Marjanović says

December 13, 2018 at 6:30 pm

However you generally (always?) have to supply the “ni3” (you) i.e “Ni3 lai2” because a bare verb is presumed to have a third person topic or subject.

Wouldn’t you rather use a particle like ba (after the verb) in this case?
dainichi says

December 13, 2018 at 10:03 pm

Danish “her” and “herhen” more or less match German “hier” and “her” respectively, but for some reason Danish allows both “kom her” and “kom herhen”, possibly because the movement is inherent in the verb. That’s not the case for other verbs like “løb her / løb herhen” (run in this place / run to this place). (Danish has an obsolete “hid” too, cognate to “hither”.)

I call BS on the idea that “all languages are equally complex”. If they are, it’s because of the law of large numbers, as in “with all those sources of complexities, a language would be damn lucky not to have any”, not because they’re inherently necessary, or really efficiencies in disguise or some such. As somebody who works in IT, I see a lot of similarity between languages and IT ecosystems. A new and better system is introduced, and you start migrating users/workflows over, but some stuff stays on the legacy system forever, and you end up with a mess.

My 4-year-old son is my best teacher when it comes to irregularities in language. All the “mistakes” he makes are examples of more logical constructions. I love languages in spite of their imperfections, but imperfections they still are.
David Eddyshaw says

December 14, 2018 at 6:59 am

I call BS on the idea that “all languages are equally complex”

Amen to that.

I’m pretty sure the idea was initially a (praiseworthy) reaction against the stupid idea that low-tech societies have particularly simple languages, and then turned into a dogma, floating free of any actual facts.

Mind you, it’s true enough that all languages are much more complex than any lay person realises, to the extent that there is no fully comprehensive grammar of any language outside the fantasies of Chomskyans. So “law of large numbers” in another sense, too.
David Marjanović says

December 15, 2018 at 2:38 pm

Every place I look at in this paper shows unsupported arguments and unintended counter-arguments. It would be too laborious to try to pick anything convincing out of this.

What about the class prefixes, or the d-prefix in the 3rd position on verbs?

Something else strikes me in Bengtson’s paper. Before he gets to morphology, he packs page after page with illustrations of regular sound correspondences between Basque and other Dené-Caucasian branches. He wants “horn” to begin with an identifiable reflex of a lateral affricate, because it begins with a lateral affricate in Caucasian and with /t/ in isolation but -/lt/ behind prefixes in Burushaski. Problem: the word is adar in Basque. So Bengtson goes creative: he proposes it’s dissimilated from *ardar, and then analyzes the initial vowel as a fossilized prefix (proposed to be cognate with the West Caucasian 3sg possessive a-) which was kept in this word to prevent /r/ or a consonant cluster from surfacing word-initially.

Enter Joseba Lakarra, a mainstream Vascologist who treats Basque as a total isolate. Unlike all earlier work including Bengtson’s, he insists that the roots of all content words must strictly have been CVC at some Pre-Basque stage (e.g. in this paper). Again, adar doesn’t fit, so Lakarra proposes that it’s reduplicated (*da-dar); no attempt is made to explain either the reduplication (a device “unknown in historical Basque”, says the paper on pp. 182–183) or the dissimilation (which has not happened in zezen “bull”, *ze-zen according to the same list on p. 183).

Another word supposedly derived from an ancient reduplication is odol “blood” according to the same list. Bengtson likewise concludes that *dol is the root, proposes straightforward cognates for it elsewhere, and wonders if the o-, also found in some other body part terms and in “uncle” when compared with “aunt”, is cognate with the East Caucasian masculine class prefix *w-. (“Aunt” begins with i-, which Bengtson tentatively compares to the East Caucasian feminine class prefix *j-.)

I wasn’t asking about the Dené-Caucasian hypothesis, which is to put it mildly far-fetched, but rather about the Dené-Yeniseian hypothesis, which seemed to pique some interest among serious linguists a number of years ago.

I know. The trick is that you can’t have one without the other.

Dené-Yeniseian can be interpreted as two claims:

1) Na-Dené and Yeniseian are discoverably related at all;
2) Na-Dené and Yeniseian are each other’s closest relatives.

The first claim doesn’t contradict the Dené-Caucasian hypothesis. The second contradicts it in that the existing versions of Dené-Caucasian propose instead that Yeniseian is closer to Burushaski and Na-Dené is closer to Sino-Tibetan. The second claim could only be tested by comparing it to such alternatives.

The fun part is that Vajda has only ever made the first claim, not the second, has repeatedly stated that he isn’t making the second claim, and has repeatedly stated that he’s instead quite sympathetic to Dené-Caucasian…

Short version: conference presentation by G. Starostin about Dené-Yeniseian.
Long version: first, a long, detailed paper by G. Starostin showing that the morphology, the sound correspondences and the cognates proposed by Vajda aren’t all as solid or exclusive as many people seem to think; then (included), Vajda’s response, about which I refuse to provide spoilers. 🙂
languagehat says

December 15, 2018 at 3:01 pm

This makes me like Vajda: “Anyone who has slogged through my Siberian link article is probably heroic, and those who have taken the considerable time and effort to criticize it are truly admirable.”
David Eddyshaw says

December 15, 2018 at 3:35 pm

I am probably heroic, but not truly admirable.

(Vajda has indeed been exemplary in this matter in both clarity about his claims and fair-mindedness about their status.)

Also, Ket is way cool. Just is, OK?
David Marjanović says

December 15, 2018 at 4:19 pm

It totally is.
Y says

December 16, 2018 at 10:06 pm

What about the class prefixes
Bengtson doesn’t give evidence in the paper that these are indeed class prefixes. It looks like he’s positing fossilized class prefixes in each language family, and so is able to pick only those words in each semantic field which fit the prefix. That’s hard to falsify. Even so, the match between the different families is not very strong.

the d-prefix in the 3rd position on verbs?
I don’t see it, from the discussion on pp. 103–104. There are various preverbal morphemes that look like d- in various language families, but it’s hard to see anything they might have in common.

Note also that he sees Haida as belonging to this phylum, specifically within Na-Dené, whereas Vajda sees no connection between Haida and the rest of Na-Dené or Yeniseian. This suggests that Bengtson is prone to see things which aren’t there.
David Marjanović says

December 17, 2018 at 8:41 am

Bengtson doesn’t give evidence in the paper that these are indeed class prefixes.

That’s what they are in half the East Caucasian languages (apparently uncontroversially reconstructed to their last common ancestor), so as a working hypothesis that’s not bad… and I’ve mentioned the Basque “uncle” & “aunt” words.

I don’t see it, from the discussion on pp. 103–104.

See also pp. 96, 98. But from the discussion, it’s still striking that there’s a similar morph, lexicalized in some families but vaguely transitivizing in others, that occupies such similar positions in languages that can’t have been in contact for thousands of years.

Note also that he sees Haida as belonging to this phylum, specifically within Na-Dené, whereas Vajda sees no connection between Haida and the rest of Na-Dené or Yeniseian.

That would matter if Bengtson used a Proto-ND reconstruction that was based (in part) on Haida. He doesn’t. Nobody does, because no PND reconstruction exists at all – people are still working on Proto-Athabaskan… If you treat Haida as a completely independent branch of Dené-Caucasian, nothing in the structure of Bengtson’s arguments changes. Just change “Na-Dené” to “Haida” in a few places.

BTW, how closely has Vajda looked at Haida?

This suggests that Bengtson is prone to see things which aren’t there.

Or that he follows Pinnow’s large body of work on Haida. I couldn’t tell, I haven’t seen it. 😐

Anyway. The mammoth in the room is Sino-Tibetan. Quite contrary to previous perceptions including Bengtson’s, Guillaume Jacques has been collecting evidence that parts of the polysynthetic morphology of the Rgyalrong languages in Sichuan and the Kiranti languages in Nepal are cognate, giving hope that one day it’ll be possible to compare PST morphology to that of other languages on a large scale.
languagehat says

December 17, 2018 at 9:55 am

Guillaume Jacques and Rgyalrongic languages previously at LH: 2015, 2016 (only one comment!).
J.W. Brewer says

December 17, 2018 at 3:49 pm

I’m not sure we have a sufficiently well-defined and uncontroversial measure of “complexity” for the “all languages are equally complex” shibboleth to even be evaluated as an empirical claim. But that’s maybe one step back from the problem that “efficiency” in a language is even harder to measure, because presumably what’s going on is a bunch of trade-offs between a language being unnecessarily (and thus inefficiently) complex in one direction versus being excessively (and thus inefficiently) vague/ambiguous in the other direction (and there are probably multiple axes along which such tradeoffs and balances are constantly being made and adjusted), and you probably can’t quantify any of the desiderata involved in the tradeoffs well enough to have some formula tell you exactly which set of tradeoffs is the optimal solution. What may be true is that there’s sort of a fuzzy zone of tolerability such that languages that wander outside of that zone, in any direction, will be under significant pressure to evolve in whatever direction will get them back into the zone or risk having their speakers shift over to some more tolerable alternative language that may be at hand. I would not *necessarily* assume, however, that the boundaries of that good-enough zone are themselves fixed or universal because what sort of inefficiencies are or aren’t sufficiently tolerable-in-practice may vary substantially with social/cultural circumstances. And obviously languages that are within that zone still evolve over time because they can’t help themselves, but they may be evolving in a random-walk kind of way that is neither improving their overall efficiency nor degrading it to the point of triggering an outside-the-zone crisis.
Y says

December 17, 2018 at 6:09 pm

In Basque, the d- is one of several “ancient markers of verbal categories” (TAM markers), maybe, along with z-, l-, b-, and zero. Proto-Athabaskan has four “classifiers”, zero, *dǝ, *ɬ, *ɬǝ. Haida has ta- “transitive”. Burushaski has something like -d-…-s- whose meaning is not explained and which I’m not going to look up. Sino-Tibetan has *d- “described as ‘directive'” (what’s that?), but Nung, one of hundreds of ST languages, has dǝ- “causative”.

What do any of them have to do with each other, other than being verbal prefixes with a coronal stop in them?

Reading technical papers requires concentration and work. Good papers reward you for every hour spent chewing a page. This kind of paper does not. When everywhere you look all you find is sloppiness and errors, it’s discouraging.
David Marjanović says

December 19, 2018 at 8:47 am

What do any of them have to do with each other, other than being verbal prefixes with a coronal stop in them?

Their position in the template, and the very fact that so many of them are fossilized.

Burushaski has something like -d-…-s- whose meaning is not explained and which I’m not going to look up.

«The [Burushaski] D-prefix or preverb is a lexicalized, often discontinuous, part of the stem’s lexical entry. It occupies position –3 in the Brsk verb template.»^138

Evidently there is no meaning left, so that the prefixes at positions –2 and –1 could be called infixes except for the fact that most verbs don’t have this /d/.

everywhere you look all you find is sloppiness and errors

So far you’ve mentioned one potential case of either: the inconsequential inclusion of Haida in Na-Dené. What others can you find?

In the meantime I’ll look up the verb templates of two Rgyalrong languages, Japhug and Zbu. Maybe I’ll find a Kiranti one, too. I agree that Bengtson was fishing around rather randomly in the Sino-Tibetan pool – that’s all anyone could really do ten years ago.
dainichi says

December 19, 2018 at 7:58 pm

> I’m pretty sure the idea was initially a (praiseworthy) reaction against the stupid idea that low-tech societies have particularly simple languages

Right. With “simple” presumably meaning “lacking expressive power”, not “requiring little cognitive overhead” (if that’s the right expression). But if you can get full expressive power with little cognitive overhead, what’s not to like about that?
David Eddyshaw says

December 19, 2018 at 8:28 pm

But if you can get full expressive power with little cognitive overhead, what’s not to like about that?

Nothing at all; but it isn’t even true that low-tech societies characteristically have languages involving little cognitive overhead; if there’s a correlation of any kind it seems to go the other way, if anything.

This, if true (which it may well not be), would at least be explicable: high-tech societies tend to have big languages, which might well have had their rough edges knocked off serially by being very often learnt as L2s in the course of gobbling up other languages in the course of their expansion.
J.W. Brewer says

December 19, 2018 at 8:39 pm

David E’s point makes me remember that the one conventional-wisdom exception to the quasi-religious folk wisdom of the academic-linguistics tribe that all languages are equally complex is actually that pidgins and similar contact languages, in their initial pristine form where they are nobody’s L1, are typically unusually structurally simple – but with regression to the egalitarian mean once creolization sets in in subsequent generations. But I wonder if such “simple” pidgins in fact have low cognitive overhead, because it seems like getting expressive power for anything outside a very limited number of stock topics in such a pidgin is a time-consuming pain in the ass (as you build elaborate compound structures with your limited lexeme inventory in order to draw the distinctions you are trying to draw etc), and indeed that inefficient and cognitively-burdensome pain-in-the-assness is the most plausible driver of the loss of that pristine “simplicity” during creolization.
David Eddyshaw says

December 19, 2018 at 9:07 pm

Indeed, it seems very likely that there is a trade-off in complexity at least in that context; transparent elaborate phrases get telescoped to more opaque but less sesquipedalian expressions so that you don’t end up spending all day talking about how to get your next meal or overthrow the colonial oppressor but have time to actually do it before bedtime.
David Eddyshaw says

December 19, 2018 at 9:21 pm

Lingala, which is not exactly a creole (I suppose) but is certainly fairly creole-ish, has laudably abandoned all that boring Bantu noun class agreement stuff, incorporated object pronouns etc etc, but on the other hand has elaborated a whole set of tense/aspect/whatever distinctions in the verb phrase which make English look as aspectually impoverished as German.
David Marjanović says

December 20, 2018 at 7:10 am

I read up on Zbu and went, of course, mad from the Lovecraftian revelation.

So, uh, in Zbu, in other Rgyalrong languages (at least Japhug, Stau and Tshobdun), and in Tangut, there are TAM categories that require a set of paired prefixes that occurs in a certain position in the Rgyalrong verb template. Which one is chosen from a pair depends on TAM. Which pair is chosen from the set of pairs depends:

Verbs of motion, “put”, “look” and the like pick them by meaning, because these “orientational prefix” pairs each have one. (Rgyalrongic: “up”, “down”, “upstream”, “downstream”, and two directions more or less arbitrarily labeled “east” and “west”. “Down” and “downstream” have merged in Zbu ultimately because the original “east” followed by “3sg” became homophonous with the infinitive prefix. Tangut had a somewhat different list: “up”, “down”, “closer”, “away”, “centrifugal”, “centripetal”, and… “neutral”.)

Other verbs always pick the same one: for example, “eat” takes “up” at least in Zbu and Japhug, “stick” takes “east” (perhaps that’s related to positive qualities like “long” taking “east” and negative ones like “short” taking “west”… but I’m speculating here, and I did mention I’ve gone mad!), “be right” takes “down”, “inspire” takes “upstream”. These “lexical orientations” of different meanings are often better conserved in Rgyalrongic than the verb roots themselves.

This is sometimes used as a derivational mechanism: there’s a verb root that means “pour” with “down”, “fall (of rain/snow)” with “west” (not with “down”), and “swear” with “up”.

And on top of that, some TAM categories always take the same one of these markers and do not allow actual or lexical orientations to be expressed. The cessation of a state is expressed by “west”, for example.
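
As a toy model only, the selection logic described so far might be sketched like this. It covers just the choice of orientation (the pair), not which member of the pair TAM then picks. The verb glosses and orientations are the ones quoted above; the names `choose_orientation`, `LEXICAL_ORIENTATION` and so on are invented for the sketch, and "go" is an assumed stand-in for "verbs of motion".

```python
from typing import Optional

# Lexical orientations quoted above (Zbu and Japhug examples).
LEXICAL_ORIENTATION = {
    "eat": "up",
    "stick": "east",
    "be right": "down",
    "inspire": "upstream",
}

# TAM categories that always take one fixed marker.
FIXED_BY_TAM = {"cessation of state": "west"}

# Verbs that pick their prefix by actual direction. "put" and "look" are
# from the comment; "go" is an assumption standing in for motion verbs.
DIRECTIONAL_VERBS = {"go", "put", "look"}

# One verb root, three meanings, depending on the orientation used.
POUR_ROOT_MEANINGS = {
    "down": "pour",
    "west": "fall (of rain/snow)",
    "up": "swear",
}


def choose_orientation(
    verb: str, tam: Optional[str] = None, direction: Optional[str] = None
) -> str:
    """Pick the orientation whose prefix pair the verb form will use."""
    # 1. Some TAM categories override everything else.
    if tam in FIXED_BY_TAM:
        return FIXED_BY_TAM[tam]
    # 2. Motion-like verbs express the actual direction.
    if verb in DIRECTIONAL_VERBS:
        if direction is None:
            raise ValueError("directional verbs need an actual direction")
        return direction
    # 3. Everything else has a fixed, lexically stored orientation.
    return LEXICAL_ORIENTATION[verb]


print(choose_orientation("eat"))
print(choose_orientation("put", direction="upstream"))
print(choose_orientation("eat", tam="cessation of state"))
```

The ordering of the three checks is the sketch's own simplification; the real grammars are surely messier than a lookup table.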

“Up” begins with /t/ all over Rgyalrong. /d/ exists, but would not be expected, because the whole series of plain voiced plosives is rare (though /d/ least so) and of recent vintage; that said, there’s a common series of prenasalized voiced plosives, which is what earlier plain voiced plosives seem to have become, IIRC.

These prefixes are not in position –3 or even –4, though. They’re in –5. Position –3 is occupied by person markers, one of which, funnily enough, is a 2nd-person marker that also begins with /t/.

In Tangut, “up” did not begin with /t/ or /d/, but with /j/. Both “centrifugal” and “centripetal” began with /d/, though. I don’t think Tangut was polysynthetic enough to have a clear template, but I don’t know.

So, on the plus side, we have here a set of prefixes that combine TAM marking with other stuff and are fossilized on many verbs despite not being in position –1 and despite the fact that the occupants of –1 through –4 are not also fossilized. On the minus side, the position in the template and the phonological shapes are merely suggestive at this point, and will remain so as long as a lot more Sino-Tibetan reconstruction hasn’t taken place.
David Marjanović says

December 20, 2018 at 12:14 pm

The Kiranti language Khaling is different: there is only one prefix position in the verb template, as opposed to seven suffix positions. The fourth is where the past marker /t/ goes.

Warning if you click through: ablaut on vowels, root-final consonants and tones – all for the same root – may be hazardous to your mental health by the time you reach table 5 less than 1/8 of the way through the paper. Fortunately, the verb template is table 3.
John Cowan says

December 20, 2018 at 1:53 pm

Warning if you click through

Small wonder that the linguistic ancestors of the Chinese turned their back on all that craziness and made their language “uninflected, positional, and flat”. The language of Center in Glory Road, for which that description was coined, is a pretty good representative of the type. Here’s just a bit, literally rendered: “Stop! Danger you! Other old bald Rufo (?) top compculturist. Wisdom egg-sperm-egg. Five-minutes. Liar and/or fool. Wisdom? Catastrophe!”
David Marjanović says

December 20, 2018 at 2:58 pm

Haaa! I found the Japhug template! It’s not in the thesis on its phonology & morphology, but table 1 in this paper. Guess what, the orientational prefixes are in position –10, out of 14 prefix and 3 suffix positions.
languagehat says

December 20, 2018 at 3:54 pm

Poor David M — I knew him before he descended into gibbering madness and started raving about tables in the Necronomicon…
David Eddyshaw says

December 20, 2018 at 3:56 pm

It all comes from too much polysynthesis.
languagehat says

December 20, 2018 at 4:13 pm

I don’t know what you’re trying to say, but all I hear is “Ph’nglui mglw’nafh Cthulhu R’lyeh wgah’nagl fhtagn.”
January First-of-May says

December 20, 2018 at 5:22 pm

Now I wonder whether Cthuvian is polysynthetic… I do vaguely recall having seen a grammar.
Brett says

December 20, 2018 at 5:27 pm

Cthuvian ipsum generator
dainichi says

December 20, 2018 at 10:19 pm

> high-tech societies tend to have big languages, which might well have had their rough edges knocked off serially by being very often learnt as L2s in the course of gobbling up other languages in the course of their expansion.

I think this could also go the other way. High-tech societies tend to import a lot of stuff, giving them an often unnecessarily big lexicon (or maybe I should say morpheme inventory). For example, low-tech societies with small languages would probably be less likely to have many synonymous morphemes (like folk, people/popul-, demo-). This is even the case for syntax sometimes, with postpositional adjectives like “general” and “emeritus” (although these are limited in distribution).
David Eddyshaw says

December 21, 2018 at 6:16 am

I agree it would be pretty easy to concoct a just-so story retrospectively to “explain” whatever the reality might happen to be, and I’m not myself committed to the idea that there is any correlation positive or negative that needs to be explained. Still, for what it’s worth, I was thinking not of lexicon but of morphosyntactic arbitrariness.

Come to think of it, excluding lexicon from one’s measure of complexity is not a theoretically neutral decision …

I can think of cases where a big language has complicated its syntax by absorption of other language elements, though the instances that occur to me offhand involve not gobbled-up substrate languages but borrowing from prestige languages, like your examples, and like Greek constructions in Latin, Arabic constructions in Persian and Chinese locutions in Japanese. I think you could make a case that none of these examples has fundamentally affected the structure of the borrowing language, but the argument risks getting a bit circular, I guess.

Languages can most definitely complicate their phoneme inventory by borrowing; to pick an example from my favourite language, in Kusaal [h] is at some neat abstract level just one of the allophones of /s/, but the loanword hali “up to, as far as, even, a lot” is deeply embedded in the language as the normal way of expressing all kinds of things, and it is never realised with [s]. (Though again, that’s a borrowing from a prestige language.)

I gather, too, that efforts to describe the vowel systems of Northwest Caucasian languages as consisting of just one or two phonemes are undermined by the speakers’ perfect readiness to incorporate Turkish and Persian and Arabic loanwords into their languages.
David Eddyshaw says

December 21, 2018 at 6:22 am

The Japhug Template must surely be mentioned in the HPL oeuvre somewhere?
SFReader says

December 21, 2018 at 6:34 am

all I hear is “Ph’nglui mglw’nafh Cthulhu R’lyeh wgah’nagl fhtagn.”

Somewhere in those links one can encounter phrases like “Gdong-brgyad Japhug Rgyalrong” which is apparently in English, believe it or not (as in phrase “dialect of Gdong-brgyad Japhug Rgyalrong”).
languagehat says

December 21, 2018 at 9:56 am

Не случайно, как говорится.
Trond Engen says

December 21, 2018 at 9:06 pm

In between the washing and baking and giftwrapping I’ve been trying to read the Rgyalrong papers. I want to ask — maybe influenced by the French glossing — if these agglutinative systems could be young (without defining “young”). The common template would be common inherited syntax, discrepancies in the template and non-cognacy of pre- and suffixes would be due to dialectal variation in syntax and lexicon, and the unclear internal affiliations would be due to the in-situ genesis of each language after the larger speech community dissolved. The Arpitan of Sino-Tibetan or something.
David Eddyshaw says

December 22, 2018 at 12:13 am

There’s a parallel argument with the agglutinative verb morphology of Bantu; there’s a strong tendency for the Bantu tail to wag the dog when it comes to Niger-Congo comparative work, and a corresponding propensity to project Bantu agglutination of preverbal pronominal and other elements back to Niger-Congo. Tom Güldemann (arguing against Larry Hyman) somewhere (can’t locate the paper just now) points out that a difficulty with this is that the Proto-Bantu system is just too regular and transparent for it to be plausible that it can be pushed back that far; if it was really that ancient the morphemes ought to have got thoroughly mushed together and mixed up by now.

(I’m biased by my West African perspective: I prayed Fela Kuti in aid of my argument a while back. All together now: No agreement today. No agreement tomorrow … A no go giri …)
John Cowan says

December 22, 2018 at 2:53 am

“Mutton yesterday, mutton today, and blimey if it don’t look like mutton tomorrer. […] Never a blinking bit of manflesh have we had for long enough.”

inspired of course by

“Jam yesterday, and jam tomorrow, but never jam today.”
David Marjanović says

December 22, 2018 at 6:14 am

I want to ask — maybe influenced by the French glossing — if these agglutinative systems could be young (without defining “young”).

In part, yes. In part, no.

if it was really that ancient the morphemes ought to have got thoroughly mushed together and mixed up by now

This is why the Moscow School expects morphology to be useless for most language families that are noticeably older than IE, and concentrates on basic vocabulary instead. I do think they went a little too far with that pessimism.
Bathrobe says

February 17, 2020 at 8:39 pm

I came across this article again recently. And it seems worse than I originally thought.

‘The father’ is a kind of set expression that anyone mostly familiar with European languages would come up with. But it’s actually a fairly artificial sentence, a sentence in limbo without context. What does it mean? ‘The father’ as opposed to ‘the mother’? Whose father? The father of the child? Do we really talk about ‘the father’ without context? In Chinese you would have to be clearer. 父亲 is actually from Japanese. It’s a very formal word for ‘father’. In olden days they used 爹. Nowadays they use 爸 or 爸爸. And it’s used for a particular person’s father — yours (‘Your Dad’s calling you’) or ours (‘Dad is calling you’). Isn’t it likely that someone in Chinese would say 孩子的爸爸? (It’d be best to check with a native speaker.)

“The father said ‘Come here'” is a rather poor sentence for demonstrating the ways that some languages incorporate more information than others.
David Eddyshaw says

February 17, 2020 at 9:47 pm

Good point. In fact, there are a great many languages in which you can’t say “the father” like that at all, because the word for “father” is obligatorily possessed.

Kusaal is one, in fact: I just searched for saam “father” in the 2016 Bible: every example has a possessor (all the apparent exceptions are not sàam “father” but sáam “strangers.”)
languagehat says

February 17, 2020 at 10:46 pm

Excellent point. Why do people feel the need to generalize about languages?
Bathrobe says

February 17, 2020 at 11:45 pm

AntC made this point right at the start:

taking one utterance in isolation is an entirely bogus exercise.
Stu Clayton says

February 17, 2020 at 11:48 pm

Why do people feel the need to generalize about people generalizing ? They could just count backwards from 100 instead.
David Eddyshaw says

February 18, 2020 at 5:16 am

All generalisations are false.
Stu Clayton says

February 18, 2020 at 6:23 am

I would say that 95% of generalisations are false. That’s a reasonable position.
John Cowan says

February 18, 2020 at 9:44 am

Avoid generalizations: they invariably lead to bad reasoning.
January First-of-May says

February 18, 2020 at 10:10 am

Hopefully relevant XKCD.
David Eddyshaw says

February 18, 2020 at 12:12 pm

For posterity, I must record that I was wrong about there being no examples of possessorless fathers in the Kusaal Bible: for example, Proverbs 3:12 has

nwɛnɛ saamme tɛɛgid bikanɛ ka o ya’am kpɛn’ tʋbir si’em la

resemble:FOCUS father:SG:NOMINALISER pull:IPFV [child-that.SG:NOMINALISER and his gall enter] ear:SG how the

“as a father disciplines the child he loves”

Still, there are a lot of languages in which family-relationship words can’t occur without possessors.
David Eddyshaw says

February 18, 2020 at 12:44 pm

I don’t think this is the Güldemann paper I was trying to remember before, but it addresses much the same issues regarding the trap of blithely projecting Proto-Bantu structures back into Proto-Niger-Congo:

https://www.researchgate.net/publication/300471822_Proto-Bantu_and_Proto-Niger-Congo_Macro-areal_Typology_and_Linguistic_Reconstruction
John Cowan says

February 18, 2020 at 2:30 pm

There is a language (whose name I can’t remember, but I am guessing it is South Pacific) in which body parts are alienable: when attached, they take an obligatory possessor, but when detached, not. So my hand is ‘my-hand’ as long as I have it, but if it is amputated it can be just ‘a-hand’.

My favorite example of (in)alienable possession: in Tahitian, the Book of Jeremiah is inalienably possessed because he wrote it, but the Book of Joshua is only alienably possessed, because it’s about him but he didn’t write it. Of course the notion that a book is a different book altogether if its subject is replaced does not apply to oral narratives, where a story about something Twain said may easily become a story about something Shaw said.
David Eddyshaw says

February 18, 2020 at 3:18 pm

Algonquian languages deal with the problem of unattached obligatorily possessed things by having a special “neutral” possessor prefix: thus in Arapaho: nó’oo3 “my leg”, hó’oo3 “your leg”, hí’oo3 “his/her leg”, wó’oo3 “someone’s leg.”
Stu Clayton says

February 18, 2020 at 4:34 pm

What does “unattached” mean here ? Are your examples about amputated legs ? How would you say “stolen wallet” ? Can a wallet be less obligatorily possessed than a leg ?

I will hazard the generalization that it all makes sense once you know what it all means.
David Eddyshaw says

February 18, 2020 at 5:07 pm

@Stu:

You got me. I was being deliberately vague. Though in this case, I think the Arapaho wó’oo3 does duty both for disembodied legs and legs attached to nobody in particular.

“Inalienable” possession and “obligatory” possession don’t have to be the same thing in a language, though, as you quite properly imply.
Stu Clayton says

February 18, 2020 at 5:25 pm

as you quite properly imply

@David: now, how did I manage to do that ? I hate it when I seem to say more than I know. Great expectations of further wisdom are raised that then leave me scrambling to save face.
David Eddyshaw says

February 18, 2020 at 5:37 pm

Belep (Austronesian, from New Caledonia) has two possessive constructions: dependent, where the possessor just follows the possessed noun, and independent, where the “genitive” particle, which actually belongs to the following possessor, is enclitic on the preceding possessed noun. There are four classes of noun:

1. Inalienably possessed, only occur in the dependent construction, have no free forms: father, charity, egg, skirt, gesture, vine and 200 more. If the possessor is unspecified they take a third-person pronoun possessor.

2. Inalienably possessed, only occur in the dependent construction, but have free forms: blood, story, sadness, louse, house, boat … the possessed form is often very much modified compared with the free form.

3. Can occur in either construction depending on whether the speaker is currently thinking of them as inalienable: dirt, day, home, language, barrier …

4. Boring nouns. Alienable; only usable in the independent construction.

Simplicity itself.