The Origin and Evolution of Word Order.

OP Tipping at Wordorigins.org posted Murray Gell-Mann and Merritt Ruhlen’s 2011 PNAS article The origin and evolution of word order, whose abstract reads:

Recent work in comparative linguistics suggests that all, or almost all, attested human languages may derive from a single earlier language. If that is so, then this language—like nearly all extant languages—most likely had a basic ordering of the subject (S), verb (V), and object (O) in a declarative sentence of the type “the man (S) killed (V) the bear (O).” When one compares the distribution of the existing structural types with the putative phylogenetic tree of human languages, four conclusions may be drawn. (i) The word order in the ancestral language was SOV. (ii) Except for cases of diffusion, the direction of syntactic change, when it occurs, has been for the most part SOV > SVO and, beyond that, SVO > VSO/VOS with a subsequent reversion to SVO occurring occasionally. Reversion to SOV occurs only through diffusion. (iii) Diffusion, although important, is not the dominant process in the evolution of word order. (iv) The two extremely rare word orders (OVS and OSV) derive directly from SOV.

OP says “my question is, is the basic premise correct? Are there no known cases of SOV evolving from other languages?” I responded:

I’m morally certain it’s bullshit, like all statements about “the first human language,” but I’ll post it at LH and see what people have to say.

So: what say you?

Comments

  1. J.W. Brewer says

    Given the double hedging of “for the most part” and “except in cases of diffusion,” I’m not sure that a single well-documented counterexample is worth all that much, especially given the likely blurriness of “how can you be sure it was/wasn’t diffusion?” in many situations.

  2. David Eddyshaw says

    The very first line of the abstract is untrue; after that, it doesn’t improve.

    Any “linguistics” paper that cites “Amerind” as if it were real is garbage.

    There’s no actual evidence for their propositions about word order change other than SOV to SVO: it’s pure conjecture. There is no reason to suppose that Insular Celtic passed through a SVO stage on the way to VSO.

    The stuff about “Niger-Khordofanian” is bollocks on several levels. If Mande is part of a “Niger-Khordofanian” at all (for which there is no good evidence whatsoever) there is no reason to suppose that it is closer to “Niger-Congo” than those parts of Khordofanian which are conjectured to be related to “Niger-Congo”, which do at least have noun classes, unlike Mande. Even if it were “coordinate with all the rest of Niger-Congo”, the fact that Mande is SOV* does not trump the fact that “Niger-Congo” mostly isn’t, unless you have already assumed that SVO hardly ever becomes SOV, the very thing these charlatans are pretending to demonstrate. As the Mande languages certainly are all related, a change to the current (typologically very unusual) word order only had to occur once, in the Mande protolanguage. On the other hand, there are quite a number of SOV Volta-Congo languages, not closely related to each other, like Baatonum vs Senoufo. (Talking of Senoufo, it seems pretty clear that they owe their unusual word order to contact with … Mande.)

    However, criticism of this sort of paper is irrelevant. If the authors were concerned with evidence and logic they wouldn’t have written it in the first place. They have crafted their own little hermetically sealed intellectual bubble, entirely impervious to scientific method.

    Pity about Gell-Mann.

    * It isn’t “normal” SOV in reality: adverbs and even indirect objects always follow the verb.

  3. Thanks, that’s pretty much what I expected.

  4. The question that raises for me is the converse: are there any known cases of SOV > SVO for which diffusion can be ruled out as an explanation? I guess Latin > Romance is usually taken as endogenous, but I don’t see how you could rule out, say, influence from Celtic. That’s without even getting in to the problems of identifying “basic word order” in free word order languages, where information status is typically more relevant to word order than case, or the way auxiliaries/TAM markers mess up this typology.

    Edit: The basic problem is that ruling out diffusion is generally quite difficult even where we have written records, and utterly impossible for most proto-languages.

  5. J.W. Brewer says

    There are two different issues here about evidence. One is whether the authors have handled available evidence thoughtfully and dispassionately. The other is the amount and/or quality of evidence available. What percentage of extant human languages as of let’s say 2,000 years ago (or what percentage of 2,000-years-before-present ancestors of the current inventory of languages, to be more precise in terms of enabling a comparison of what has changed and in what direction) are sufficiently well-documented that we have well-justified confidence that we *know* what their default/most-common word order was? Take a language without current issue: you can find someone on the internet asserting that Etruscan shifted from SOV to SVO (and occasionally OVS when a deictic pronoun was present). We have as I understand it a corpus of several thousand short inscriptions in Etruscan but nothing like substantial surviving chunks of running prose or extended dialogue. But it seems plausible enough to me that short inscriptions are a sufficiently distinct genre that you can’t just assume they are representative of syntax over the whole range of situations the language was actually spoken in. No doubt there are plenty of conjecturally reconstructed proto-X languages where the reconstructors have a conjecture about the default word order despite the fact that there’s no manuscript or epigraphic evidence at all, but how confident should an outsider be as to the reliability of that sort of conjecture, especially given the possible feedback loop of someone reconstructing proto-A thinking “absent contrary evidence we assume it worked the way we assume things usually work” and a second someone reconstructing proto-B then treating that as if it were affirmative evidence bolstering the plausibility of the background assumption about how things usually work?

    Reconstructors of Proto-Algonquin (or -quian, if you like) are apparently not much prone to thinking about word order, at least according to the first seemingly-competent discussion I googled up, and there must be other Proto-X’s in the same boat.

  6. J.W. Brewer says

    Loosely related: a lot of the popularizing treatments of comparative/historical linguistics you could find in public libraries when I were a lad (reflecting a simplification of a scholarly consensus of X decades previously) offered a vision of a Great Cycle of Life in which agglutinative languages tended to evolve into inflected (“fusional,” some of the kids apparently now say) ones, which in turn tended to evolve into isolating languages, which in turn tended to evolve into agglutinative languages so that the cycle could repeat. I don’t think there was any actually attested example of a full round trip in the history of a particular language or language family, and such a round trip would have taken many millennia, but it wasn’t that implausible as just-so stories go. But with that sort of cycle you can retroject it all the way back to Proto-World easier than you can conclude at what point things stood in the cycle when Proto-World was spoken. Maybe there’s an even longer-time-depth cycle for word order and we haven’t spent enough time in obscure enough parts of the world to catch a glimpse of VOS cycling back to SOV?

    Note that if you assume Proto-World was not literally the “first” extant human language but was itself the result of tens of thousands of years of prior change and is only fortuitously the ancestor of all currently extant languages due to some prehistoric evolutionary bottleneck causing the other languages contemporaneous with it to leave no living descendants, Proto-World could have been at any arbitrary and even statistically improbable point in such a cycle when it began to diverge into daughter languages that in turn have currently-living issue.

  7. David Eddyshaw says

    are there any known cases of SOV > SVO for which diffusion can be ruled out as an explanation

    Similarly, their statement that the change SVO -> SOV only ever occurs by diffusion is a virtually unfalsifiable get-out-of-jail-free card.

    To disprove it, you would need to find a population whose language had changed from SVO to SOV in the absence of any contact with a SOV language. This is unlikely to happen for two reasons:

    (a) only a tiny proportion of the world’s languages have been documented for long enough to see basic word order changes (speculation about word-order in protolanguages leads simply to circular arguments); and

    (b) getting on for half the world’s languages are SOV, so demanding that your experimental population has never been in contact with one is a pretty limiting requirement.

    The idea that inheritance and diffusion are the only two possible sources of word order is also clearly false. For example, I think you could make a good argument that SOV languages are liable to become SVO if they lose case marking, for reasons of disambiguation; this would particularly apply when (as often, and in Latin in particular), the requirement that the verb comes last is more rigid than other ordering criteria.

  8. John Emerson says

    Didn’t language develop in one fell swoop? And what’s before the beginning?

  9. David Eddyshaw says

    BSL is apparently OSV.

    https://en.m.wikipedia.org/wiki/British_Sign_Language

    Presumably this was the result of diffusion from Nadëb.

  10. J.W. Brewer says

    But David E.’s “For example” seems consistent with the hypothesis – i.e. it offers a plausible functional account of *why* SOV-to-SVO transition would be a common phenomenon absent “diffusion.” If there’s not an equal-and-opposite plausible functional account of why SVO-to-SOV transition could and does occur absent “diffusion,” then Bob’s your uncle. (Or should it be “Your uncle’s Bob”?)

  11. J.W. Brewer says

    I find in a corner of the internet apparently inhabited by conlang enthusiasts the interesting question “Is it possible to have an ergative/absolutive, topic-comment language?” None of the three responses gave an actual example of a natural language fitting the bill, but I feel like I should offer a modest prize in memory of Professor Gell-Mann for the first compelling statistical demonstration that Proto-World itself was, in fact, such a language.

  12. David Eddyshaw says

    @JWB:

    Fair point. To come clean, I suspect that there actually is a tendency for SOV languages to become SVO; however, given that SOV languages remain extremely common, and leaving aside Ruhlenoid fantasies, there must be some processes replenishing the supply.

    Admittedly, the mechanism I suggested only really works if the language was (like Latin) not rigidly SOV, though. Mande languages, which have maintained SOV for millennia by the look of it, are also famously rigid in word order; moreover, they have tense/aspect particles between the subject and the object, so ambiguity is never really going to be an issue anyway.

    Speculating wildly, I would guess that the peculiar Mande word order (where only a direct object precedes the verb, and it moreover intervenes between the verb and its tense-aspect-mood-polarity markers) went back historically to finite structures arising from copula/light-verb followed by an object + verbal noun compound. (Mandinka actually still has noun incorporation.)

    A bit like (literary) Welsh

    Yr wyf yn dy garu
    assertive.particle I.am predicative.particle your love
    “I love you.”

    It would be entirely illegitimate to take this as evidence for a basic OV word order in Welsh …

    Mandarin seems to be becoming more SOV-like; this seems to get attributed to Mongolian and/or Manchu influence, but in view of the proportion of Chinese speakers to speakers of either of those languages, this has always seemed something of a stretch to me. In reality, it seems more like an ongoing language-internal tendency to bring objects in front of the verb using constructions which remind me of serial verbs.

  13. Gell-Mann was known for being increasingly kooky (and disconnected from the physics community) for a long time before his death.

  14. English has OV word order (more precisely verb-final) in nominalized compounds like fruit picker. Is anything inherent keeping it from back-forming constructions like “I fruit-picked” and generalizing them to a regular SOV word order?

  15. P. S. Ruhlen passed away last January.

  16. @David Eddyshaw, specifically about the first line: I do not see a problem with this direction of search. I do not think that the words “single earlier” should trigger alarms. When you have a distribution and rules of directionality, you can find something interesting, even when the rules are statistical (a>b is more likely than b>a) rather than absolute (b>a is impossible). Basically, it is how we learn the history of the universe.

    It also makes sense to consider what conclusions you can derive from your data within various toy models. E.g. “assume that all languages come from a single PL and the rules never changed. Then the current distribution can only be obtained from such and such initial state”. I do not see a serious problem here, apart maybe from the point that it is a good idea to say that your toy model is a toy model, that your proto-language is virtual, and that we should try other toy models too.

    It is the same as studying the big bang. I do not think that Lamarckism or the phlogiston theory or impetus are “bad”: a theory does not have to be “right” to be worthy, asking questions, developing methods and learning about obstacles and receiving criticism is good already.

    —-
    But yes, there is a problem with the rules:) And it is what people here are discussing.

  17. J.W. Brewer says

    @drasvi – but to the extent there are “rules” in historical changes in syntax in terms of statistical tendencies where a>b is substantially more common than b>a, that tendency is presumably driven at least in part by social/cultural/environmental factors regarding how a language gets used. Assuming those factors were more or less the same 30,000 years ago as they seem to me today seems a much more rickety assumption than e.g. assuming that the physics affecting how planets orbit around stars are the same today as they were 30 million (or more) years ago. Put another way, if you don’t have a pretty good theory of the mechanism that explains *why* the statistical tendency you observe is the case, it becomes a lot harder to know how plausible it is to assume that the same mechanism would have operated to produce essentially the same results in prehistoric times.

    If the likely gap between a toy model and the otherwise-inaccessible historical phenomena it is trying to approximate is too great, due to limitations of data and/or imprecision of methodology, then by all means have fun with your toy model if that’s your idea of a hobby but don’t try to fool people into thinking you’re doing actual scientific research as opposed to speculative fiction.

  18. drasvi: The sentence “Recent work in comparative linguistics suggests that all, or almost all, attested human languages may derive from a single earlier language” refers to the Proto-World papers then in vogue. A small group of Proto-Worlders wrote those papers and approved of them. Everyone else thought they were bad and useless beyond redemption. In their respective times, Phlogiston and Lamarckism were worthier of consideration than 1990s Proto-World studies.

  19. by all means have fun with your toy model if that’s your idea of a hobby but don’t try to fool people into thinking you’re doing actual scientific research as opposed to speculative fiction.

    Quoted for emphasis. This is my problem with a great deal of speculative language study — its proponents claim not to be just having fun (which I approve of!) but adding to the stock of human knowledge, which is pernicious nonsense.

  20. A question:

    If one believes that the Out Of Africa hypothesis is right (all modern humans stem from a single group of Homo sapiens who emigrated from Africa 2,000 generations ago and spread throughout the world over thousands of years), does one not have to believe too that all human languages derive in one way or another from the form of oral communication (be it a set of grunts or something more sophisticated) of that single group?

    That is, if all human beings descend in all their lines from that single group, all those lines are the conduits through which all spoken languages must have evolved from the form of oral communication of that group.

    I have no opinion about the matter.

  21. Martin, sure, it’s possible that all existing languages have a common ancestor spoken 100,000 years ago or more; at least I don’t know of a reason why it shouldn’t be so. But there’s no way to demonstrate it using the evidence of present-day languages.

    It’s perfectly possible that William Shakespeare sneezed on the morning of his 12th birthday, but there’s no evidence to show that that really happened or not.

  22. jack morava says

    A quick reconnaissance via my Tardis reminds me that the oldest Denisovans spoke[?] a whistled language which they created by imitating birdsong, and which is ancestral to the tonal languages of Southeast Asia. Neither they (nor the Orangs) seem to have had any notion of time or tenses.

  23. For historical changes, knowledge of the state before and the state after is the best evidence, but the next best thing is phylogenetic analysis. If we know that 3 languages with word orders A, A, and B have affinity (established based not on the word order) [A [A B]] then it is reasonable to suggest that the change was in the direction A>B. This assumes that the word order changes are slower or on the scale of language separation, ignores contacts and other obvious disclaimers. I was surprised to see no phylogenies offered for the support in the PNAS paper.
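    The [A [A B]] reasoning can be made concrete with a minimal parsimony sketch (Fitch's small-parsimony rule for one trait). The tree shape and the labels A and B are taken from the comment; the function name and the nested-tuple tree encoding are assumptions made for illustration, and the sketch ignores contact and rate differences just as the comment's disclaimers do.

```python
# Fitch small-parsimony sketch for a single trait (basic word order).
# The tree shape and the labels "A"/"B" come from the comment;
# the function name and tree encoding are hypothetical.

def fitch(tree):
    """Return (candidate_states, min_changes) for a nested-tuple tree.

    A leaf is a string such as "A"; an internal node is a 2-tuple of subtrees.
    """
    if isinstance(tree, str):
        return {tree}, 0
    left, right = tree
    left_states, left_cost = fitch(left)
    right_states, right_cost = fitch(right)
    shared = left_states & right_states
    if shared:
        # Children can agree: no extra change is needed at this node.
        return shared, left_cost + right_cost
    # Children disagree: keep both options and count one change.
    return left_states | right_states, left_cost + right_cost + 1


tree = ("A", ("A", "B"))  # the [A [A B]] grouping
root_states, changes = fitch(tree)
print(sorted(root_states), changes)
```

    Under these assumptions the only most-parsimonious root state is A, with a single A>B change. For a tree shaped [A [B B]] the same rule leaves the root ambiguous between A and B, which is one reason the shape of the phylogeny matters as much as the counts.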

  24. all modern humans stem from a single group of Homo sapiens who emigrated from Africa 2,000 generations ago

    Obviously, not those who remained in Africa. And why should it be a single group? If, for example, the most recent expansion out of Africa happened during a window of a couple thousand years it would include a lot of room for language evolution in the process of expansion.

  25. OP Tipping says

    Thanks all for your answers. The paper does have a lot of red flags: Dene-Caucasian and Nostrato-Amerind seem almost parodic but are treated gravely by these authors.
    I was curious about the basic premise nonetheless since it appeared slightly counterintuitive from a layman’s network analysis. If there’s a station that trains can only leave but never arrive at, you expect that station to be quickly depleted of trains.
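    The station intuition can be sketched as a toy two-state Markov model. Every number here is invented for illustration (the 5% per-step switching rate, the 200 steps, the all-SOV starting point), and the function names are hypothetical; nothing is estimated from real language data.

```python
# Toy Markov model of the "station" argument. All rates are invented
# for illustration; they are not estimates from any data.

def step(shares, rates):
    """Advance one time step.

    rates[(src, dst)] is the fraction of src languages that switch
    to dst during one step.
    """
    new = dict(shares)
    for (src, dst), rate in rates.items():
        moved = shares[src] * rate
        new[src] -= moved
        new[dst] += moved
    return new


def simulate(shares, rates, steps):
    """Apply `step` repeatedly and return the final shares."""
    for _ in range(steps):
        shares = step(shares, rates)
    return shares


start = {"SOV": 1.0, "SVO": 0.0}
one_way = {("SOV", "SVO"): 0.05}                        # trains only leave
two_way = {("SOV", "SVO"): 0.05, ("SVO", "SOV"): 0.05}  # trains also return

print(simulate(start, one_way, 200))
print(simulate(start, two_way, 200))
```

    With departures only, the SOV share decays toward zero; with an equal return rate it settles near one half. The sketch says nothing about which real mechanism (diffusion or otherwise) supplies the return flow, only that some return flow is needed if SOV is to stay common.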

  26. J.W. Brewer says

    Estimates of how many tens of thousands of years ago language emerged in humans vary wildly, and are not (even assuming the range is instructive versus it all just being a bunch of meaningless guesses) consistently either before or after the current best-estimated timing of the split-off of the ancestors of the anatomically-modern Rest-of-the-World humans from the ancestors of the anatomically-modern Stayed-In-Africa humans. So there are at least four possibilities.

    A. Monogenesis before the split.
    B. Monogenesis after the split with subsequent cultural spread of so useful an innovation to the other already-genetically-and-geographically-separated groups. (Plus there may have been some earlier splits within Africa that you would have to reconsider A v B for.)
    C. Polygenesis with multiple Ur-Sprachen having currently-attested descendants.
    D. Polygenesis with only one of the several Ur-Sprachen having currently-attested descendants.

    I don’t see any non-speculative way to pick favorites among the four without a lot more data than we currently have. One unresolved chicken-and-egg issue is I guess whether all the necessary biological/cognitive prerequisites for language-as-we-know it had already developed and then language sprang into being fortuitously to take advantage of them and proved so useful that everyone with the right biology jumped on the language bandwagon or whether language developed before all then-living humans were well-adapted for it and the talkers had so much more reproductive success than the non-talkers that the latter were removed from the gene pool. Although even the latter scenario could have occurred independently in multiple population groups widely separated by geography.

  27. J.W. Brewer says

    I forgot a fifth possibility: monogenesis and universal knowledge of Proto-World followed by Wrathful Dispersion (a development alluded to in somewhat mythopoetic terms in the 11th chapter of Genesis). To cut and paste a comment I made on Language Log some years ago:

    “From a purely scientific standpoint, I should think wrathful dispersionism ought to be appealing because it provides an elegant workaround to the seemingly insoluble monogenesis v. polygenesis problem. I expect that most people find monogenesis more intuitively appealing, but again we hit the problem (if we are wary of crackpottish-sounding proposals) of being unable to reconstruct proto-World or winnow the number of separate reconstructed ancestral languages below, I don’t know, a few dozen even if you’re a lumper rather than splitter. A sudden discontinuity, in which the results of monogenesis are transformed into something functionally equivalent to what would appear to be the results of polygenesis (because the pre-discontinuity Ursprache cannot be reverse-engineered from the post-Babel data), harmonizes the intuition with the data.”

  28. If one believes that the Out Of Africa hypothesis is right (all modern humans stem from a single group of Homo sapiens who emigrated from Africa 2,000 generations ago …

    From archaeological evidence, it’s increasingly looking likely there were pre-Homo Sapiens species (such as Neanderthals) in many places; and that they were not very different in social organisation from H.S., and would have had language, and did interbreed. So proto-Homo Sapiens diffused amongst proto-Neanderthal.

    ‘All bollocks’ is the only sensible conclusion I can see.

  29. @LH, whole mathematics is like this. Mathematicians have fun and add nothing to knowledge:)
    Nothing that is not already contained in what they know.

    @J.W. Brewer, I just meant that to start speaking about PW you don’t even need to want to reconstruct it.

    In this case the Shakespeare argument works for the authors, not against. They wrote a work (good or bad) about observed direction of changes. They have to address this:

    “Tai’s and Faarlund’s hypothesis that SOV arises in a language only due to contact with other SOV languages is interesting, but clearly overstated…. If new SOV languages arose only from contact with older SOV languages, then where did the prior SOV languages come from; and if they too are assumed to be due to contact with SOV languages, then how did the very first SOV language come about?”

  30. There’s a theory that Celtic languages acquired VSO order from contact with a substrate language of the Afro-Asiatic type (e.g. Arabic, Hebrew, but more likely something related to Amazigh/Berber). There are other structural similarities as well.

    But how likely is it that a dominant language would pick up just structure from a substrate language and not any vocabulary? And there’s a lot of structure that didn’t get picked up too.

    I suppose the idea is that the pre-IE population of western Europe spoke an Afro-Asiatic language. It seems difficult to confirm.

  31. There are other structural similarities as well.
    Construct state

  32. J.W. Brewer says

    The subhed (not necessarily by the author herself) to this did-Neanderthals-have-language article is “It’s surprisingly hard to prove one way or the other.” To which the obvious rejoinder is, Who is it who finds this surprising? What are the unexamined presuppositions behind the assumption apparently held by some that because we can frame the question we ought to be able to answer it?

    https://www.discovermagazine.com/planet-earth/could-neanderthals-speak-the-ongoing-debate-over-neanderthal-language

  33. Yep. History doesn’t owe us anything.

  34. jack morava says

    @dravsi, You are free to have your opinion, but it is perhaps uninformed.

    I am reminded that I’ve read the claim that Socrates (according to Plato, toward the end of the Republic) said (something like) `Mathematicians are people who dream that they are awake’. I have looked at various translations and there does seem to be some discussion of whether or not mathematics is useful for anything (war is apparently an uncontested place where it’s said to be useful) but, given the anthropological/cultural distance, it’s hard for me to tell if the claim is plausible. I reckon there are Hatters who are better informed than me. As they say in Private Eye, I think we should be told.

  35. uninformed
    jack morava, all right. Perhaps some mathematicians do not have fun.:)

  36. David L. Gold says

    @ maidhc. “There’s a theory that Celtic languages acquired VSO order from contact with a substrate language of the Afro-Asiatic type (e.g. Arabic, Hebrew, but more likely something related to Amazigh/Berber).”

    @ drasvi. “There are other structural similarities as well. Construct state.”

    See “Punic in Proto-Germanic” (http://languagehat.com/?s=vennemann).

    This is an important point in maidhc’s comment: “But how likely is it that a dominant language would pick up just structure from a substrate language and not any vocabulary? And there’s a lot of structure that didn’t get picked up too.”

    Theo Vennemann and Robert Mailhammer’s examples of Punic in Proto-Germanic vocabulary look more like chance similarities than borrowings. Had they found patterned differences, their case for borrowings would be entertainable.

  37. jack morava, I just mean,
    – a work does not have to make true claims regarding the material world to be useful. Any work that leads to something good is worthy.

    Accordingly, I find it unreasonable to call a work bad on the grounds that “it makes a claim that is unlikely to be true”. I am not defending this particular work, though, and I do not mean that one can’t criticize scientific works. Only that this particular criterion is wrong. Some works are worthy for this reason, others propose new methods, others ask the right questions and others maybe learn more about the obstacles we are facing. Some maybe even provoke useful criticism.
    All of this can be called “speculative” (in the sense: no such claims).

    Mathematics is exactly an occupation that does not produce claims about the material world. It provides other sciences with tools, it plays with them if you like (it inspires others and is inspired by others), but it does not make such claims.

  38. Jack Morava: the quote is, “Geometry and the studies that accompany it, are, as we see, dreaming about being, but the clear waking vision of it is impossible for them as long as they leave the assumptions which they employ undisturbed and cannot give any account of them.” (Republic, 7.533b, in Shorey’s translation, here). The quote sure got mangled. It seems to have come out this way in an essay by Wilkinson in the New Yorker from a few months ago. The less sense it makes, the more poetical and quotable it is.

  39. I must admit, I do sometimes giggle when I find various similarities between the two peoples (starting from Celtic brooch and yes, the construct state). I even wanted to compile for myself a list of Celtic words in North Africa and vice versa, but keep forgetting them.

    But it is just because I happen to have some familiarity with both regions.

    —-
    It would be crazy to assume that Greeks and Phoenicians etc. were the only people to cross the sea. They did not invent boats. Of course there were peoples who lived on both shores (not necessarily Berbers). The population of the African coast, in turn, does not have to be Berber. What one needs is either a large European-African language area (and then you’ll have to explain why Celtic) or just an Atlantic area extended to Morocco. An Atlantic area in Europe is archaeologically plausible, Morocco-Spain is geographically plausible. The next question is what genetics says.

    But this all is “a priori”. You need good linguistic evidence, of course.

  40. PlasticPaddy says

    @Y
    Why we should put ourselves out of our way to do anything for posterity, for what has posterity ever done for us?
    Sir Boyle Roche
    See (for this and other quotes from this clever man)
    https://en.m.wikiquote.org/wiki/Boyle_Roche

  41. jack morava says

    Thanks to both Y and drasvi. I like the mangled version better; I will ask Chuang Tzu about it, perhaps cycling the original through Chinese would produce something more to my liking.

    My `material world’ may differ from drasvi’s. The mistranslation caught my eye because I wonder what dreaming and waking might mean in remoter cultures; I am very fond of Jarry’s Dr Faustroll and his understanding of scientific thought.

    (also, apologies for the metathesis)

  42. J.W. Brewer says

    Re: Ruhlen’s co-author the late Murray Gell-Mann, here’s an interesting anecdote from an interview he gave approximately 60 years after the point in his high school career it describes. The linguistics profession may have dodged a bullet?

    Uncharacteristically, I discussed my application to Yale with my father, who asked, “What were you thinking of putting down?” I said, “Whatever would be appropriate for archaeology or linguistics, or both, because those are the things I’m most enthusiastic about. I’m also interested in natural history and exploration.”

    He said, “You’ll starve!”

    After all, this was 1944 and his experiences with the Depression were still quite fresh in his mind; we were still living in genteel poverty. He could have quit his job as the vault custodian in a bank and taken a position during the war that would have utilized his talents — his skill in mathematics, for example — but he didn’t want to take the risk of changing jobs. He felt that after the war he would regret it, so he stayed where he was. This meant that we really didn’t have any spare money at all.

    I asked him, “What would you suggest?”

    He mentioned engineering, to which I replied, “I’d rather starve. If I designed anything it would fall apart.” And sure enough when I took an aptitude test a year later I was advised to take up nearly anything but engineering.

    Then my father suggested, “Why don’t we compromise — on physics?”

  43. @JWB thank you for that article. I see they quote Chomsky [et al] on one side of the debate, so that puts me firmly on the opposite side.

    On a tangent, I was brought up short by their use of intransitive ‘preserve’:

    what do preserve are language proxies — bones, artifacts and DNA …
    … requiring organs like the tongue, diaphragm and brain that rarely preserve.

    My idiolect seems to have ‘preserve’ only as transitive. And the dictionaries I checked agreed. Is this some archaeologists’ special usage?

  44. I suppose the idea is that the pre-IE population of western Europe spoke an Afro-Asiatic language. It seems difficult to confirm.

    https://upload.wikimedia.org/wikipedia/commons/8/81/Cardial_map.png

    This is how the agriculture started in Western Europe circa 6500 BC.

    Afro-Asiatic would be quite a likely language for the people of the European Neolithic to speak, given the original area from which they started their dispersal.

  45. I’d also note that North Africa got agriculture from the same source at the same time. Fortunately languages of the first farmers and pastoralists of North Africa survived (they even left some of the earliest texts in human history) and we know what language family they belonged to.

    Surprisingly, it’s Afro-Asiatic (Berber and Egyptian) too.

  46. Guus Kroonen thinks there is a layer of agricultural terms in the European IE languages, which might come from a Hattic substrate.

  47. @AntC: That usage looks borderline to me, something I would never produce but not wildly wrong either. (It sounds like something H. P. Lovecraft might have used, but there is no sign of intransitive preserve in At the Mountains of Madness or “The Shadow Out of Time.”) The OED has one non-obsolete, normally intransitive sense:

    to continue without physical or chemical change; to stay in good, wholesome, or intact condition, to keep,

    for which the most recent (1955) cite is specifically paleontological:

    Asteroidea… preserve poorly as fossils because of the lack of a solid endoskeleton.

    However, it also mentions that the most recent major sense,

    to keep (game, or an area containing game) undisturbed for private hunting, shooting, or fishing…

    is occasionally used intransitively. The only two intransitive citations are from Bulwer-Lytton and Trollope.

  48. This map of word order around the world has a few salient observations: word order is stable: almost all zillion Bantu languages are SVO. Almost all zillion Indo-Iranian languages are SOV. Papuan is SOV, Austronesian rarely is, etc. So the claim that one word order is less stable than another is suspect (unless you assume bogus phylogenies, which are the authors’ stock-in-trade.)

    Also, SOV is rare in Europe: Basque, and… Sorbian!

  49. @Y, they did not make such a claim.

  50. David Marjanović says

    “That doesn’t preserve well” is unremarkable in paleontology.

    What this paper is doing is getting ahead of itself. Once you have a phylogeny, you can read very interesting things from it. But if the phylogeny is wrong, so are the conclusions (except maybe by chance). Even in biology, where phylogenetics has made a lot more progress than in linguistics, such papers still come out routinely in little subfields where not so much progress has been made. Mine for instance. A few weeks ago there was a paper tracing how certain characters evolved – on a phylogeny that was just a summary, made in 2007, of the phylogenetic trees that had been published up to 2006 or so. Often, people want to do studies that simply cannot yet be done, so they make a few assumptions, do the study anyway, and actually get it published.

    I’m morally certain it’s bullshit, like all statements about “the first human language,” but I’ll post it at LH and see what people have to say.

    There is no statement there about the first human language; it’s about the presumably much later last common ancestor of all languages that are known today. This means there are no claims about what is basic to the human brain or whatever.

    “Is it possible to have an ergative/absolutive, topic-comment language?” None of the three responses gave an actual example of a natural language fitting the bill

    Topic-comment word order is noticeably underresearched, though. There may be lots of examples that simply haven’t been published on.

    went back historically to finite structures arising from copula/light-verb followed by an object + verbal noun compound. (Mandinka actually still has noun incorporation.)

    Participles getting reinterpreted as finite verbs seems to be pretty common, just underresearched; and it seems like a good source of SOV languages.

    “I fruit-picked”

    Already happened: “I bartended”.

    But I really wouldn’t expect concrete nouns to cause SOV order.

    Dene-Caucasian and Nostrato-Amerind seem almost parodic but are treated gravely by these authors.

    Dene-Caucasian has a lot more evidence behind it than Nostrato-Amerind, which has practically none whatsoever. It is precisely this lack of a critical approach to the source phylogenies that is a giant red flag: to complete their tree, the authors needed some phylogenetic hypothesis, any phylogenetic hypothesis, so they were willing to use any that didn’t have published alternatives rather than concluding their study just couldn’t be done yet.

    If there’s a station that trains can only leave but never arrive at, you expect that station to be quickly depleted of trains.

    Yes, and the paper mentions contact as the only source of new SOV languages – apparently without bothering to go into any detail. And that’s terrible.

    Wrathful Dispersion

    The problem with this argument is that it takes our current state of ignorance and declares it knowledge. It’s just like “the Comparative Method can’t reconstruct back farther than 6000 years, when it suddenly hits a wall” because that happens to be the age of Indo-European. (…Afro-Asiatic is twice that old. Oops.)

    It is not that people have tried to do phylogenetics on language groupings the size of Nostratic and found it impossible. They’ve barely begun to try.

    But how likely is it that a dominant language would pick up just structure from a substrate language and not any vocabulary?

    A lot more likely than the opposite. Structural features are more easily picked up from substrates, words from superstrates.

    What one needs is either large European-African language area (and then you’ll have to explain why Celtic) or just an Atlantic area extended to Morocco. An Atlantic area in Europe is archaeologically plausible, Morocco-Spain is geographically plausible. The next question is what genetics says.

    Genetics says no. That may not have been clear ten years ago, but it is now.

  51. Afro-Asiatic would be quite likely

    But “likely” is very different from “confirm.” All sorts of hypotheses can seem likely, especially to their loving parents.

  52. @David Marjanović says: “But how likely is it that a dominant language would pick up just structure from a substrate language and not any vocabulary?”

    “A lot more likely than the opposite. Structural features are more easily picked up from substrates, words from superstrates.”

    Could you please give some examples of dominant languages that have picked up just structure from a substrate and not any vocabulary?

  53. “But how likely is it that a dominant language would pick up just structure from a substrate language and not any vocabulary?”

    But we don’t need the substrate language to be specifically Afro-Asiatic, or, within it, specifically the Berber branch.
    It is enough for it to be subject to the same areal influences (or be related).

    Or if you mean not “identifiable” borrowings, but any sort of them, do we consider Celtic virgin?

  54. SFReader says

    In most cases, speakers of dominant languages are not colonial settlers with a dominant position in the society, but rather the opposite: colonized people forced to switch to the language of their oppressors.

    It is very logical for them to speak the dominant language using a lot of syntactic structures from their own language.

    Case in point:

    “I have the car fixed.” (Tá an carr deisithe agam.)
    “I have my breakfast eaten.” (Tá mo bhricfeasta ite agam.)

    Irish English borrowed many Irish constructions including even the Irish perfect tense (but there is relatively little Irish vocabulary in Irish English).

  55. How many small groups, each speaking a separate ur-language, are required to settle the Americas circa 15,000 years ago, to create the kind of diversity that the currently-ascendant historical linguist specialists claim for this part of the world, after subtracting out the Athabascans and the Eskimo-Inuits? (These being non-controversial.)

    Either Beringia was linguistically complex during a crucial two or three millennia, or all that splitting happened after the event.

    The first and second singulars of IE feature a lot of m’s and t’s, and Ruhlen argues that the first and second singulars of Amerindian languages feature a lot of n’s and m’s. He may have said some ignorant things, but is this one of them?

    Ruhlen argues that Greenberg’s work on African languages was decried by entrenched ‘splitters’ who dominated that region, until grudgingly winning recognition, and that something similar is true with respect to those who work with the indigenous languages of the Americas, a sort of defensive territorialism.

    Am I misunderstanding the controversy? Perhaps people think there was a proto-Amerind language, but deny that much can be done to reconstruct it?

  56. @SFReader “Irish English borrowed many Irish constructions including even the Irish perfect tense (but there is relatively little Irish vocabulary in Irish English).”

    The latest edition (2013) of Terence Patrick Dolan’s Dictionary of Hiberno-English: The Irish Use of English has 292 pages. Discounting the preliminaries (= American English front matter), that would still leave a significant number of pages devoted to Irish English vocabulary of Irish origin.

    Can you estimate the number of Irish English constructions of Irish origin after discounting those, if any, limited to the English of native speakers of Irish?

  57. David Eddyshaw says

    Ruhlen argues that Greenberg’s work on African languages was decried by entrenched ‘splitters’ who dominated that region, until grudgingly winning recognition,

    That’s pretty much the opposite of what actually happened: where Greenberg’s African groupings are uncontroversial, he was largely piggybacking on the previous classifications drawn up by others. Elsewhere, his phyla look more and more shaky as more and better data are accumulated. We splitters are actually slowly prevailing. Greenberg’s chief innovation was Nilo-Saharan; although Gert Dimmendaal (no lumper, he) thinks it’s valid, you’re talking positively Amerind levels of lack of evidence there, really.

    Greenberg’s Africa groupings are, overall, much less controversial than the American ones in part because at least (most of) Niger-Congo and Afro-Asiatic are real, and really have spread far and wide in Africa. But I also suspect that controversy came later with the African classification because there was a lot less high-quality primary data on African languages at that time and because actual real comparative work was slow to get started (with the shining exception of Bantu.) It’s still a pretty niche sport.

    With regard to the great linguistic diversity of the Americas: there’s no reason to suppose that the first inhabitants were not already highly diverse linguistically before they got to America.

    With respect to pronouns and m’s and n’s: (a) this is in reality very far from universal in the Americas (b) coincidence is much more likely than you might at first suppose, because only a narrow range of very unmarked consonants typically appear in personal pronouns at all (c) there may even be universals at work: personal pronouns are not altogether unlike the “mama” and “papa” words that turn up in so many quite unrelated languages. The Kusaal word for “me” is m and “you” plural is ya* but only those who believe that all human languages just must all be related, evidence or no, think that Kusaal is related to English.

    * Something of an illusion, in the sense that the historical form was actually *ɲa.
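
    Point (b) above can be put in rough numbers. The following is only a minimal sketch under loudly stated assumptions: each pronoun "slot" (say, first and second person singular) is filled uniformly at random from the same small inventory of unmarked consonants, and the inventory size and family count used below are invented for illustration, not taken from any survey. The function names are hypothetical.

```python
from fractions import Fraction

def chance_match(inventory_size, slots=2):
    """Probability that two unrelated languages agree on every pronoun slot
    by pure chance, assuming each slot is drawn uniformly and independently
    from the same inventory of consonants (an idealisation)."""
    return Fraction(1, inventory_size) ** slots

def expected_matches(n_families, inventory_size, slots=2):
    """Expected number of unrelated families that happen to match a given
    pronoun pattern, under the same assumptions."""
    return n_families * chance_match(inventory_size, slots)

# Illustrative, made-up figures: 5 usable consonants, 100 unrelated families.
print(chance_match(5))           # chance that one family matches a given pattern
print(expected_matches(100, 5))  # expected number of accidental matches
```

    The smaller the pool of consonants that realistically turn up in pronouns, the more accidental look-alikes one should expect; with a larger pool (try `chance_match(20)`) the same resemblance would be much more striking.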

  58. SFReader says

    there’s no reason to suppose that the first inhabitants were not already highly diverse linguistically before they got to America.

    It doesn’t pass the plausibility test.

    Chukotka peninsula – the closest thing to America after Beringia submerged – had only two language families – Eskimo-Aleut and Chukotko-Kamchatkan.

    There is simply no space for much diversity in this sparsely populated area.

    I doubt Beringia was any better.

  59. David Eddyshaw says

    How about New Guinea, or (to a less spectacular degree) the Caucasus?

    Chukotka has two surviving language families. We’ve no idea what it was like back in the day when Beringia was a thing.

    (Also, what Hat is about to say …)

  60. There is simply no space for much diversity in this sparsely populated area.

    What do you mean? Why couldn’t various language groups have passed across over the millennia, leaving no current trace?

    (Also, what DE just said …)

  61. There is no statement there about the first human language; it’s about the presumably much later last common ancestor of all languages that are known today. This means there are no claims about what is basic to the human brain or whatever.

    They have to answer the train question (“If there’s a station that trains can only leave but never arrive at, you expect that station to be quickly depleted of trains.”). The answer to it will necessarily include a reference to the first human language.

    The answer is: “Why do you think it is impossible? If SOV is stable enough and the initial state is SOV, then we will have the current distribution.” And then: “But why does the initial state include SOV if transition to SOV is impossible?” And why not? We can’t generalize our rule to the stage of formation of human language; we know nothing about it.

    It is not how they put it, though.
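
    The train argument and the reply to it can be sketched as a toy model. This is a minimal sketch, not anything from the paper: it assumes three word-order states (SOV, SVO, VSO), no transitions into SOV at all, and per-step transition probabilities that are purely hypothetical. Under those assumptions the SOV share after t steps is simply (1 - leave_sov) ** t, so how fast the "station" empties depends entirely on how stable SOV is.

```python
def step(shares, leave_sov, svo_to_vso, vso_to_svo):
    """Advance the (SOV, SVO, VSO) shares by one time step.
    Nothing ever flows INTO SOV: that is the 'station trains only leave'."""
    sov, svo, vso = shares
    new_sov = sov * (1 - leave_sov)
    new_svo = svo * (1 - svo_to_vso) + sov * leave_sov + vso * vso_to_svo
    new_vso = vso * (1 - vso_to_svo) + svo * svo_to_vso
    return (new_sov, new_svo, new_vso)

def simulate(steps, leave_sov, svo_to_vso=0.05, vso_to_svo=0.05):
    """Start with every language SOV and iterate; all rates are hypothetical."""
    shares = (1.0, 0.0, 0.0)
    for _ in range(steps):
        shares = step(shares, leave_sov, svo_to_vso, vso_to_svo)
    return shares

# A stable SOV (1% loss per step) versus an unstable one (10% loss per step).
print(simulate(100, leave_sov=0.01))
print(simulate(100, leave_sov=0.10))
```

    With a low enough loss rate a large SOV share survives for a long time, which is the "stable enough" reply; with a high rate the station empties quickly, which is the original objection. The model says nothing about why the starting state was SOV.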

  62. How about New Guinea

    New Guinea is agricultural, it had millions of inhabitants, of course there is a lot of diversity.

    In Chukotka we are talking about maybe 10 thousand people, average population density of about 1 person per 100 square kilometers. And that’s after the Chukchi adopted reindeer herding.

    Extrapolate this to Beringia 15 thousand years ago.

    One or two language families, four or five languages in total, that’s the extent of linguistic diversity you are going to have here.

  63. There’s been some recent papers on the Beringian Standstill, with native American haplogroup diversity pointing to genetic differentiation between different groups starting over there already; but I don’t think that would result in especially high diversity, as much as a single language family with a decent time-depth (4000-ish years, but surely tempered by continuing close areal contact) already by the American entry. It’s the latest 10,000 years after loss of mutual contact that has to be responsible for most of the diversity.

    The only sane way Amerind is not a valid language family is if some of the groups got in at a different time and route entirely from the main population, e.g. along the Pacific coast and starting from further south than the Beringian population, or from Europe as per the Solutrean hypothesis. This may not make it a demonstrable language family though; but I think it probably could be eventually (if we get enough field linguists and etymologists working on all the hundreds of languages… Greenbergian eyeballing is not going to work for any of this). Recent work seems to have been making headway e.g. towards Je–Tupi–Carib quite a bit.

    I suspect a lot of opposition to “lumping” is not even towards the relatedness-as-such but at the lazy etymological, reconstruction and subclassification work that this usually goes with (a good satirical example is provided by the common Finno-Hindustani, Pashto-Tsakonian and Turko-Icelandic pronoun systems).

  64. Extrapolate this to Beringia 15 thousand years ago.

    a) Beringia was quite different 15 thousand years ago.

    b) You’re ignoring the point about groups passing through Beringia and not leaving traces behind.

    And in general, extrapolation is a dangerous game.

  65. David Eddyshaw says

    I suspect a lot of opposition to “lumping” is not even towards the relatedness-as-such but at the lazy etymological, reconstruction and subclassification work that this usually goes with

    Very much so, in my own case. I would not (for example) be in the least surprised if Mande really did turn out to be related (at some mind-bending time depth) to Volta-Congo: what I object to is shoddy work pretending to have actually established the relationship. It brings the game into disrepute …

    (Thanks for the link, btw: I have always suspected that Turkish was related to Welsh.)

  66. Oh, and the reason Chukotka currently has two distinct language families is because they are specialized to different ecologic–economic niches: Chukotkan to an inland reindeer economy, Eskimo varieties to a coastal marine mammal hunter economy (and arguably thirdly: Russian as representing modern industrialized economy). We have traces of Yukaghir as a former competing inland group that has been recently extirpated or driven off (yes, by Koryaks at least as much as Russians). Back in Beringian times it seems that the marine hunter toolkit had not yet developed, so this would also point to just one language family.

    Arctic environments are high-risk and require ongoing investment of effort to uphold existing technologies. Hence indigenous peoples tend towards technological conservativism and do not readily adopt new subsistence strategies entirely. (Already just taiga latitudes are more forgiving in allowing a variety of backup sustenance strategies if a new technology doesn’t work out.) This means that anytime a notable new technology is invented or introduced, it will probably not spread between different cultures but rather the innovative culture will wipe out or assimilate archaic cultures. This is readily demonstrated e.g. by the Inuit expansion into the Canadian high Arctic and Greenland, to an extent also e.g. the Yakut expansion into the Lena basin or the Nenets expansion into European Russia.

  67. a) Beringia was quite different 15 thousand years ago.

    It was even worse obviously since the Beringian population just survived the Last Glacial Maximum.

    Population density ought to have been even lower than in historic Chukotka. Perhaps as low as

    A study has indicated that the genetic imprints of only 70 of all the individuals who settled and traveled the land bridge into North America are visible in modern descendants.

    How much language diversity can 70 individuals support?

    groups passing through Beringia and not leaving traces behind

    Only groups which already were in Beringia were in position to pass through it.

  68. Looking at this question with no preconceived notions: It seems like there is certainly enough time for multiple linguistically unrelated groups to cross over. (Speakers of how many different primary language families have migrated to Europe in, say, the last 1500 years?) On the other hand, the time depth of the migration is enormous and—even more unusually—so is the untenanted area that the Asian settlers eventually overspread. In light of these facts, it does also seem plausible that a single parent language could have given rise to the full diversity of Amerind languages.

  69. traces of Yukaghir as a former competing inland group

    The Yukaghir were Bronze Age newcomers to East Siberia (from the Urals or thereabouts as evidenced by their language). They were probably linked to those strange Bronze Age travelers who roamed from Finland to Alaska and left traces of bronze metallurgy in the most unexpected places (like the Arctic Taimyr peninsula).

  70. @David Eddyshaw: ‘…because only a narrow range of very unmarked consonants typically appear in personal pronouns at all’.

    Interesting, I’d not come across this idea before. Don’t suppose you could point me towards some sources so I could read up on this?

  71. J.W. Brewer says

    A prior lengthy comment got eaten and may not be worth asking hat to try to retrieve. The current genetic-analysis state of the art (which is not the final word …) seems to be that all “Amerinds” (except for Athabaskan speakers who have a minority genetic trace of later admixture from a different late-arriving population) descend from a quite small population group that had lived in Beringia separated geographically and genetically from other Paleo-Siberians for a very long time (like 10 or 15 thousand years) before they began to move southeast to begin populating the Americas. So “Proto-Amerind” (assuming the original folks who separated from other Siberians all had a common language, which is obviously not certain) could have started diverging into daughter languages within Beringia and already been difficult-to-impossible to reconstruct with our current tools before the latter migration even began.

  72. I have retrieved it and will post it below.

  73. J.W. Brewer says

    You can find recent genetic analyses arguing (I don’t know how strong the counterarguments may be or may become as and when more data comes in) that the “founding population” from whom all “Amerinds”* descended was 1) to the number of totally unrelated languages they would have spoken. But I *think* that estimate is for when the remote ancestors separated geographically (and thus genetically) from the larger Siberian population of the day and settled in Beringia, which may have been 10 or 15 millennia before they left Beringia and started moving south en route to Tierra del Fuego. So if they were spread out enough during the extremely lengthy Beringian interlude, the breakup of “proto-Amerind,” even if it were a single language, into mutually incomprehensible daughter languages could have started 30+ millennia ago, making reconstruction even more of a challenge than a breakup 15ish millennia ago.

    Beyond that: (a) it seems somewhat implausible that those proto-Amerinds separating out from other Siberians 30+ millennia ago didn’t have language by then, but (b) it also seems plausible that whatever language(s) they had in common with their pre-Beringian neighbors who didn’t move to Beringia have no currently extant descendants.

    Obviously who as between lumpers and splitters should bear the burden of proof given uncertain and incomplete data is not a question with an obvious answer, but maybe an easier example is South America where the genetics suggests divergence/isolation from the Amerinds farther north as of maybe 12 millennia ago. Is “Is 12 millennia enough time for a single Proto-South-Amerind language or at least set of identifiably related languages to have unreconstructably (according to most current scholars) diverged into the range of languages and “families” attested throughout South America as of the time of first European contact?” a question that we currently have the tools to answer?

    That it could have happened that way doesn’t mean it did happen that way, of course, but can splitters show that it couldn’t have happened that way (or at least is relatively unlikely to have happened that way), or is the key strength of the splitter position just the notion that beyond a certain time depth anything is possible and nothing can be ruled out, and thus no particular hypothesis is worthy of any more credence than its rivals?

    *The genetic analysis on current consensus seems to be that most speakers of Na-Dene languages are largely descended from those same folks but with a notable minority admixture of ancestry from much later arrivals who do not seem to have affected the genetics of most societies whose languages fall into Greenberg’s “Amerind” lump.

  74. is the key strength of the splitter position just the notion that beyond a certain time depth anything is possible and nothing can be ruled out, and thus no particular hypothesis is worthy of any more credence than its rivals?

    Yes. I simply don’t see the point in trying to imagine various possibilities, all more or less plausible, in the absence of hard evidence. It’s like looking at a film of people having an argument and trying to imagine what led them to where they are at the time of the film — a useful exercise for a novelist, pointless for anyone else.

  75. J.W. Brewer says

    As a result of whatever went awry when I was trying to edit the comment hat retrieved, a chunk of what should have been in the second-through-fourth lines was omitted, resulting in nonsense. Mentally swap in something like “the “founding population” from whom all “Amerinds”* descended was quite small (maybe as few as 250), which suggests a fairly low upper bound (even if maybe >1) to the number of totally unrelated languages they would have spoken.”

  76. SF wrote
    >languages of the first farmers and pastoralists of North Africa survived… It’s Afro-Asiatic (Berber and Egyptian)

    While you may be right, I think it would need to be proven that, because these were found there in early historical times, they descend from the languages present in North Africa several thousand years before.

  77. >>What one needs is either large European-African language area (and then you’ll have to explain why Celtic) or just an Atlantic area extended to Morocco.

    David M replied:
    >Genetics says no

    Not challenging, just wondering – does genetics say no merely to a late prehistoric area of shared population or descent? Or is there sufficient data to assess connections around or even just before the first agriculturalists reach Iberia and North Africa?

  78. >historic Chukotka

    This is challenging to my detailed understanding of northeast Asian geography gleaned from careful and long-term study of the Risk board.

  79. David Eddyshaw says

    Don’t suppose you could point me towards some sources so I could read up on this?

    My immediate source for this was the über-splitter Lyle Campbell*. Impressionistically it does seem to be widely borne out (admittedly this is hardly a rigorous demonstration.)

    Semitic languages, for example, don’t have pharyngealised/glottalised consonants in pronouns (or indeed in flexions.)

    It’s by no means an absolute rule, though. Khwe, for example has lateral clicks in its plural personal pronouns, and you don’t get much markeder than clicks.

    It’s just occurred to me that Oti-Volta languages feature a kind of counterexample too: the voiceless /f/ does turn up in full-word stems, but it’s very difficult to find examples reconstructable to the protolanguage; on the other hand, the stressed second person singular pronoun was *fi, and the sg suffix of one whole noun class was *fu.

    The human-class 3rd sg pronoun also seems to have been *ŋ͡mi, and it’s hard to argue that labiovelars aren’t marked.

    But it doesn’t need to be an exceptionless principle to increase the risk of pure coincidence in pronoun systems, just a powerful tendency; it does seem valid at that level.

    * The splitter that even other splitters fear …

  80. J.W. Brewer wrote

    >Beyond that: (a) it seems somewhat implausible that those proto-Amerinds separating out from other Siberians 30+ millennia ago didn’t have language by then, but (b) it also seems plausible that whatever language(s) they had in common with their pre-Beringian neighbors who didn’t move to Beringia have no currently extant descendants.

    Can you explain your idea in b? Did a different language reach them through later population influx, giving rise to all descendants? The decision by some group to create a ConLang, which was wildly successful? Or are you implying a second virgin birth of language itself? That seems to relate to your point a, but I find point a well beyond “somewhat implausible”. Or what? I’m not finding any of the alternatives that I understand very reasonable.

    I do assume that all languages are related, just well beyond our capacity to trace, ever. I guess the alternative might be that modern humans in Africa, Neanderthals and/or Denisovans developed language independently (or the same could be true of discrete African populations much earlier). That is a situation in which I would expect that one or another of the groupings would have overwhelmed and submerged the others, so I would still expect that all extant languages are related, rather than that some descend from the Neanderthals or the Denisovans.

  81. Is the key strength of the splitter position just the notion that beyond a certain time depth anything is possible and nothing can be ruled out, and thus no particular hypothesis is worthy of any more credence than its rivals?
    I’d rather put it this way – with the time depth involved and the current status of reconstruction, it cannot be shown that all “Amerind” languages have one and the same ancestor. Even if, as it is starting to look, the “Amerind” speakers are descendants of a small group of individuals, making it unlikely that they spoke more than one language, the hypothesis that there was one ancestor language cannot be confirmed by the methods of historical linguistics, and that language or features of it cannot (yet?) be successfully reconstructed.

  82. the stressed second person singular pronoun was *fi, and the sg suffix of one whole noun class was *fu.

    Fifi eats fufu.

  83. David Eddyshaw says

    There inevitably comes a point when the changes over the millennia are so great that rigorous application of the comparative method will inevitably fail – the information you need just isn’t there any more.

    Mind you, it doesn’t follow that information loss everywhere proceeds at the same rate (even in different domains within a single language, let alone between different languages), so there is no time depth set in advance as to when the task becomes hopeless; moreover, this suggests a gradual fading from view of plausible protolanguages rather than a sudden cut-off. There are bound to be perfectly justifiable differences of opinion even among sensible careful scholars.

    There may turn out eventually to be other ways of investigating the deep prehistory of languages, too. Michael Fortescue tried to do this with language resemblances across the Bering strait, basically arguing that certain bundles of typological features were highly conserved over time and could thus be used to demonstrate language relationships rigorously. I wasn’t persuaded myself, but it was an interesting and serious effort. I think Johanna Nichols has done something along these lines, too.

    Fifi eats fufu.

    It seems very odd that /f/ apparently only occurred in function words and affixes, especially as /v/ is readily reconstructable to Proto-Oti-Volta and is not so limited. I don’t understand it. It probably needs to be investigated at the level of the origin of the entire family within Gur-Adamawa, which is well beyond me at the moment. (The b/v contrast probably derives from ordinary /b/ versus implosive, judging by the very few cognates I can find in the Grusi languages.)

  84. beyond a certain time depth anything is possible

    Beyond enough time anything is possible within geographic constraints since languages cannot move without people & people didn’t have world maps in the stone age.

    Within any one continent anything might go pretty much; if you look at sub-continental proposals like of Hokan or Penutian or Nostratic purely on a map (or even just the spread of a few established families like Turkic or Austroasiatic), they certainly look fairly random. But the major intercontinental expansions set reasonably strong boundary conditions. It might be possible that e.g. actually the closest relative of Aymara is Haida, but it would not make a single shred of sense for the closest relative of Aymara to be Mande or Sumerian or Pama-Nyungan, and even e.g. Iroquoian might not make more than half a shred.

    This is kind of the same issue as where we are much more certain of the validity of Indo-European or Uralic or Afrasian than whatever their subgrouping might be.

    — Marked phonemes in pronouns: worst example I know of, the Tremjugan Khanty 3rd person pronouns are singular /ɬɪɣʷ/, dual /ɬin/, plural /ɬɪɣ/ where I’d argue only two phonemes are not quite marked (and then in the most grammatically marked form too). (Tamed down to e.g. 3P /tʊw/, /ten/, /tɪɣ/ in some Northern Khanty varieties or /mæ/, /min/, /mɪŋ/ in Tremjugan 1st person.)

  85. D J EDDYSHAW says

    Now I think of it, I should probably have limited my statement to first and second person pronouns, given that third persons are quite often coopted demonstratives or deictics or whatever. (Not that that rules out all the exceptions, of course.)

  86. Trond Engen says

    Add me to the list of people who don’t believe in many unrelated languages in the Beringian refuge. And even if there were, they would have been sprachbundled into undisentangability.

  87. J.W. Brewer says

    @Ryan: I may not have expressed myself clearly. My notion is that at the moment (maybe a definitive moment only in hindsight) when the proto-Beringians moved out of convenient communication-and-intermarriage range of their former Paleo-Siberian neighbors, there was presumably one or more languages or at least language family/ies common to both sides of the just-separated population. On the migrating side, the language(s) that went to Beringia turned after 30,000+ years into the Amerind languages. On the staying-behind side, the language(s) could quite plausibly have gone extinct w/o living descendants (the point I was making in the earlier post), but could also quite plausibly (after 30,000+ years) have evolved into something we can’t adequately match up with anything in an Amerind language family to show common descent from Proto-Pre-Beringian.

    Whether or not the Dene-Yeniseian proposal is thought convincing, it presupposes an ancestral split between the Siberian half and the North American half that plausibly (given genetic data etc.) would have been significantly more recent than that although still at enough time depth to make reconstruction challenging.

  88. They are brutal with Mande and Niger-Kordofanian.

    I just read this part…

  89. John Cowan says

    (…Afro-Asiatic is twice that old. Oops.)

    Yes, but the evidence for AA is nothing like the evidence for IE or Uralic or PA: we have not reconstructed AA, or rather we have, and the result is two dictionaries that agree on approximately nothing. Instead, it’s the weirdness of AA, with its interdigitated morphology and all, that makes us sure it’s a legitimate family: all of that could not reasonably have arisen twice.

    Also, there is nothing like Omotic, a group of living languages generally thought to be AA but for which a respected minority opinion is that it is not AA at all, in any of the other families I mentioned.

  90. Marked phonemes in pronouns: /ħ/ is stable in the free 1Pl pronoun across Semitic (Arabic naħnu) pretty much everywhere that /ħ/ remains phonemic in the language as a whole. For pronominal markers, Kabyle and many other Berber varieties have 1Sg -ɣ and 2Sg t-…-ḍ (phonetically [-ʁ], [θ-…-ðˤ] respectively). Not that single examples are of any great interest for statistical generalizations, but I’d want to see whether they’ve controlled properly for relatedness. That’s particularly tricky in this case: the whole point of the claim is to argue against the otherwise plausible hypothesis that pronouns tend to be so conservative they remain similar even after genetic relationship becomes otherwise undetectable, so there’s a risk of falling into circularity.

  91. J.W. Brewer says

    Let me say that Hans makes a somewhat important point and distinction. We may in a certain situation be reasonably confident based on non-linguistic data (evidence about who did and didn’t live where when based on genetics and archeology — all going to JP’s point re geographical constraints) that the speakers of such-and-such wide range of languages as of 1492 all descended from the same small group of ancestors and it is extremely likely (because of the same non-linguistic data suggesting a long stretch of geographical isolation) that all of their languages descended from a single proto-language spoken by those ancestors umpteen thousand years earlier. But our confidence in that conclusion may not itself come from linguistic data analyzed via linguistic methodology and that data analyzed via that methodology may not be capable of further bolstering the conclusion. The risk is then that linguists get sucked into purporting to use linguistic data/methodology to purport to bolster the conclusion they accept on independent grounds and don’t even think about what sort of result of linguistic inquiry might contradict the conclusion already accepted on independent grounds. You then end up with an asymmetry where linguistic data that can be made to look consistent with the conclusion is thought to be interesting and meaningful while linguistic data that can’t be made consistent with that conclusion is just dismissed as the sort of random noise inevitable at the time depth involved rather than as undermining the evidence for the conclusion.

  92. David Eddyshaw says

    /ħ/ is stable in the free 1Pl pronoun across Semitic

    Very true. I spoke too soon, partly because I was only thinking of the pharyngealised/glottalised stops. But I can’t deny that /ħ/ is (yet another) pretty good counterexample. I was WRONG …

    I do think Campbell is on to something with this, though, despite the fact that we seem to be coming up with a disturbing number of exceptions. On one level, though, it would hardly be surprising if very-high-frequency morphemes tended to favour not-very-marked segments, on grounds of simple efficiency.

    If that actually is the important factor, you might also expect languages in which free 1st and 2nd person pronouns aren’t very common (because subject person is marked on the verb, for example) to be more likely to have marked sounds in their free pronouns, and also for free pronouns cross-linguistically to tend to have more marked consonants in them than the corresponding affixes or clitics do.

    All this is potentially testable, too, although you would need to be careful to define your criteria for markedness before you actually started looking at lots of languages and were tempted to start shifting the goalposts. And markedness within one language need to be the same as markedness in a broader context. Your criteria for “free” and “bound” too …

    Your point about controlling for (known) relatedness is clearly very important too. Without assuming the classifications beforehand … I suppose the trick would be to adopt a fairly extreme Splitter position for (as it were) the first pass through the data … not sure if that necessarily avoids circularity though.

    Instead, it’s the weirdness of AA, with its interdigitated morphology and all, that makes us sure it’s a legitimate family: all of that could not reasonably have arisen twice.

    AA was principally what I had in mind in saying that not all parts of a linguistic system necessarily erode at the same rate. There’s precious little vocabulary that can really be reconstructed for AA*, and not even anything like a consensus on the phonology of the protolanguage, but the way the languages work morphologically is so peculiar and distinctive that it beggars belief that they’re not all ultimately related.

    * Roger Blench has an entirely justified paper somewhere pointing out just how, well, bad, the two purported AA reconstruction dictionaries are methodologically. And the evidence that Omotic belongs is incredibly weak.

    https://www.uio.no/studier/emner/hf/iln/LING2110/v07/THEIL%20Is%20Omotic%20Afroasiatic.pdf

  93. David Eddyshaw says

    (That should be:
    “And markedness within one language need not be the same as markedness in a broader context”
    I ran out of editing time while adding more epicycles …)

  94. David Eddyshaw says

    Afro-Asiatic is twice that old

    The time depth of Volta-Congo (which is definitely a valid grouping, though its actual membership is not altogether clearcut) seems likely to be a good bit greater than Indo-European, too (even its hypertrophied twiglet Bantu can hardly be much less than three thousand years old); if it does turn out that all of Atlantic belongs with it (looks like parts of it pretty certainly do, judging by Segerer’s Bijogo grammar) that would give you an even greater time depth. And given that Mande alone is about as diverse as Indo-European, if Mande also was part of it, that would mean that a Greenberg-style Niger-Congo could probably give Afroasiatic a run for its money on timescales. (But it probably isn’t, alas.)

  95. Nichols and Peterson looked in detail at American n-m pronouns, and Zamponi has recently refined those results. Tee Ell Dee Are, they are more common in the Americas than elsewhere, but only in parts of the Americas. That supports some historical connection, but you can’t determine from the data how much is due to inheritance and how much to contact, and the n-m distribution poorly fits Greenberg’s Amerind.

    @David Marjanović: The “6,000 year” limit is not a hard barrier, but a barrier nevertheless. It can be more and it can be less, depending on how many branches you have to compare, how conservative they are, and how well documented they are. But because vocabulary attrition is exponential, it’s much harder to work with a 9,000 y.o. family than a 6,000 y.o. one. At some point there is not enough vocabulary left to establish sound correspondences, even more so if they depend on a variety of phonological environments. So you might have some suggestive similarities, but you won’t be able to confirm them.
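    To put rough numbers on the "exponential attrition" point, here is a minimal sketch. It assumes the classic (and much-disputed) glottochronological figure of about 86% retention of a 100-item basic word list per millennium, and it assumes that two sister languages lose words independently; neither assumption comes from this thread, and the function name is made up for illustration.

```python
def shared_fraction(retention_per_millennium: float, millennia: float) -> float:
    """Expected fraction of a basic word list still cognate in BOTH of two
    sister languages, each changing independently for `millennia` millennia."""
    # Each branch keeps retention**millennia of the list; both must keep an item.
    return retention_per_millennium ** (2 * millennia)


# Assumption: 0.86 is the traditional Swadesh-style rate, not an established constant.
RETENTION = 0.86

if __name__ == "__main__":
    for age in (3, 6, 9, 12):
        frac = shared_fraction(RETENTION, age)
        print(f"{age * 1000:>6} years: about {frac * 100:.0f} of 100 basic words shared")
```

    Under those assumptions a 6,000-year-old pair shares roughly 16 items out of 100 and a 9,000-year-old pair roughly 7, which is the sense in which the older family leaves far less material for establishing sound correspondences. Comparing many branches, or conservative ones, improves on this pairwise figure.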

    Afro-Asiatic: the compelling similarity of templatic morphology, plus a modicum of plausible shared vocabulary, make it a case similar to Penutian, with its distinctive-to-baroque systems of ablaut. As Marie-Lucie can attest, Penutian is a tough, tough problem, and even less assured than AA. But to the Greenbergians, it’s a sub-sub-branch of Amerind, presumably something that needs just a bit of polish; in the same way that they consider Afro-Asiatic ready to be incorporated into Eurasian/Nostratic.

  96. About Beringia: even if at any one point the area called Beringia could support only a couple of language families, in the thousands of years that it existed, speakers of languages in families as diverse as Japanese, Ainu, Nivkh, Eskimoan, Chukotkan, Yukaghir, Yakut, Ket, etc. etc. could have found their way to it from the Asian mainland. Aside from that, people could have arrived by the coastal route before the land bridge existed, and recent very old dates encourage that line of thought.
    So if any of the languages of the Americas are related, their common ancestor need not have existed at the time and place that Beringia existed, just as the common ancestor of English, Dutch, and Pennsylvania German existed not in eastern North America and not in the 1600s–1700s.

  97. David Eddyshaw says

    Thanks again, Y: very interesting paper.
    On the one hand, it looks like the American n/m thing is no mere figment; on the other, its distribution looks uncomfortably areal and doesn’t at all help Greenberg’s Amerind along.

    I’d like to see Babaev’s paper on presumed Niger-Congo reconstructed pronouns. Unless he means only Volta-Congo really, I’m prepared to be very sniffy about his prospects for reconstructing any such thing. (Interesting that they fling about a date of 10,000 to 12,000 BP for Proto-Niger-Congo. That actually sounds not unreasonable if, unlike me, you buy into the whole Greenberg package. But that’s a bug, not a feature …)

  98. “But how likely is it that a dominant language would pick up just structure from a substrate language and not any vocabulary?”

    I didn’t ask that question to cast shade on the idea, but because I don’t know the answer and I wanted to know what the collective expertise would have to say.

    I found SFReader’s response particularly enlightening, so thank you for that.

    Some of the other responses, as well, have caused me to think that the AA substrate hypothesis may not be as far-fetched as I first suspected. Thanks to you all.

    Genetic evidence can be challenging to deal with. For example, Ötzi the Iceman is most closely related to the modern inhabitants of Sardinia, even though he lived in modern Austria. What this might show is that the Old Europeans tended to be pushed off the best land and forced into marginal corners like Sardinia. The interpretation is the tricky part.

  99. David Eddyshaw says

    The idea that Insular Celtic has an Afroasiatic substratum reflected in its syntax looks a lot less compelling than it did once in the light of modern typology: a lot of things correlate highly with VSO word order without needing to assume either relatedness or influence. Things which were once thought to be separate resemblances between AA and IC (and thus all the more compelling evidence for contact when multiplied together) are not fully independent variables.

    (This discussion always reminds me of the bit in Ulysses where Stephen and Bloom are comparing Irish and Hebrew, languages which neither of them really actually know. I seem to remember that we had a discussion of this in which it emerged that Joyce himself was quite keen on the contact/substratum thing. And there’s Shaun and Shem in Finnegans Wake …)

    John McWhorter has long been claiming that English picked up syntax from Brythonic despite there being very little vocabulary borrowed. But he’s Just Wrong.

  100. PlasticPaddy says

    @de
    In Britain (and Ireland) the putative IC speakers seem to have been for centuries in contact with, and eventually replaced, a Neolithic farmer culture which left significant architectural traces. So if the AA-like features seen in IC developed there, it seems to me the development was at least stimulated by this contact. The best alternative would seem to me to be that some tendencies or internal variation in Common Celtic were “taken to an extreme” by an “isolated” population (presumably a “self-isolating” population before settling Britain and Ireland, because why would the British settlers be “isolated” from the Continent but not from the Irish settlers).

  101. David Eddyshaw says

    It’s a perfectly possible scenario, but I don’t think it’s demonstrable.

    We’ve got no way of knowing anything about the languages (surely more than one) of the neolithic farmer culture, unless you assume that they must have been responsible for the VSO word order of Insular Celtic, and that this shows that their languages were VSO. Even assuming that they spoke Afroasiatic languages, we still don’t know if they spoke VSO languages (the great majority of Afroasiatic languages are actually not VSO). It’s all circular.

    The assumption that Proto-AA was VSO (leaving aside the fact that we can’t even reconstruct the basics of its structure in any other respect) seems to be based on Berber, Semitic and Egyptian; but most of the diversity of the whole family is in Cushitic, which is largely SOV. The reconstruction is (inevitably) biased in favour of the most familiar and earliest-documented languages.

    Assuming for the sake of argument that Proto-AA was VSO, remarkably few of its modern descendants remain so, so one would have to conclude that that ordering has not in fact proved very stable. If speakers of AA languages had fetched up in Britain there would have been plenty of opportunities for them to scramble their word order along the way before they got here.

    (Welsh has gone from VSO to SVO and back again over the past thousand years or so, though I must admit that doesn’t really prove anything, except perhaps that Nobody Knows Anything.)

  102. David Eddyshaw says

    almost all zillion Bantu languages are SVO

    I think that you could make a respectable argument that this derived from earlier SOV, from the fact that object agreement prefixes in Bantu come just before the verb, e.g. Swahili ulikivunja “you broke it” (a chair, kiti); the argument being similar to the one that e.g. French je t’aime preserves the SOV order of its parent Latin.

    It’s a bit more complicated than that, though, not least because SOV is rare in the rest of Volta-Congo, which would make SOV unexpected. I remember reading a paper which suggested that the prefix chains of Bantu verbs probably derive historically from auxiliary light verbs with SVO order followed by the lexical verb as a verb-noun/infinitive/whatever. I’ll see if I can track it down again.

    Ah: Tom Güldemann (of course)

    https://www.researchgate.net/publication/300471822

  103. The assumption that Proto-AA was VSO (leaving aside the fact that we can’t even reconstruct the basics of its structure in any other respect) seems to be based on Berber, Semitic and Egyptian

    In fairness, VSO order is fairly widespread in Chadic too, which gives Cushitic a run for its money in terms of diversity. Also, SOV order in East Africa looks very areal, cross-cutting unrelated languages, whereas VSO order in Chadic generally looks very different from neighbouring languages. Still, I would agree that reconstructing word order for AA when we can’t even manage enough words to write a fable seems like overconfidence.

  104. I think that you could make a respectable argument that this derived from earlier SOV, from the fact that object agreement prefixes in Bantu come just before the verb,

    Here you support the authors.

    French je t’aime

    And Russian, but: as long as the reason for its preservation is functional, you can’t tell retention “because it is convenient” from innovation “because it is convenient” from “it has always been so”. Personal pronouns are short, refer to what your listener has in mind already and for the second person the referent is addressee (that is, she is even implied).

    In other words, they are quite like movable topics*. Cf. also definiteness: it is not the same as topicalisation, but close enough for Russian not to need one when we have the other. “Me” and “you” (not even anaphoric) are closer.

    No wonder that it is the verb in the position of focus here, because yes, it is “my love” that is what I am reminding of here (I can’t say “new information”, usually it is not new:-)). When the reason is morphophonological, the argument for retention is stronger.


    It is my little Russocentric revenge.

  105. David Eddyshaw says

    Here you support the authors

    Except that I then cite Güldemann’s well-argued paper which suggests that this SOV system itself actually arose from SVO, with nary a whiff of diffusion anywhere.

    I wouldn’t dispute that SVO can arise from SOV as they suppose, only the idea that it’s some sort of one-way ratchet, all counterexamples being handwaved away as “diffusion.”

    Turning the diffusion argument around, as Lameen suggested above:

    Now I think of it, the first clear example that occurred to me outside Europe of SOV -> SVO (as opposed to SVO -> SOV) was Timbuktu Songay, which actually looks awfully like a case due to diffusion (from e.g. Fulfulde) and/or “semicreolisation.”

    (Was Vulgar Latin “semicreolised”? Koine Greek is more SVO than previous stages of the language too. Diffusion? There was presumably a good bit of large-scale imperfect acquisition of Greek going on in Hellenistic times.)

  106. John McWhorter has long been claiming […] But he’s Just Wrong.

    OK, snark. But he’s not the only one. I have a stack of pdfs from the 2000s (none of which I’ve read beyond glancing) making various claims for syntactic tracks of Celtic in English, by Schrijver, Filppula, Lutz, Poppe, Hickey, and others. A 2012 book by Miller, External influences on English: From its beginnings to the Renaissance (which I haven’t seen) apparently canonizes this view. But, the reason I haven’t read any of these papers is that I don’t have the background to tell if any of them make sense or not.

  107. “well-argued paper which suggests that this SOV system itself actually arose from SVO, ”

    David, sorry, I missed that point:(

    But I am perplexed with the claim.

  108. Here, 219:

    James Tai (1976) has observed that SOV word order appears not to arise through internal developments …

    Then Campbell and others quote examples where it arises from contact. Of these Chinese is suspicious, Munda is not, Akkadian is hard to evaluate. And continue:

    However we naturally do not support Tai’s hypothesis, since it raises a nagging question: if new SOV languages arise only from contact with older SOV languages, where did the prior SOV languages come from, and if they too are assumed to have arisen from contact with SOV languages, then how did the very first SOV language come about?

    That is : “It is consistent with everything we know, but we do not believe in it because 2+2=5”.

  109. Lars Mathiesen says

    I think I spotted a hidden assumption: Even if genetics strongly indicates that the pre-Columbians all descended from 70 individuals who came from a Beringia that had been isolated for 10K years — who says they came as a group? Small family groups striking out on their own, hunters lost at sea, all speaking different languages, a few of them surviving in different locations and founding separate language families, is that possible?

  110. David Eddyshaw says

    @Y:

    Re snark:

    My main problem with it is that the timescales don’t work without positing either that large numbers of Brythonic speakers unknown to contemporary history survived in intimate contact with English speakers up until do-support really began to take off in English; or that English speakers had been secretly using do-support for centuries but keeping it out of all written material because Of Course You Would Because It’s Common. Or something.

    Moreover, although (modern spoken) Welsh does use its do-verb a lot as an auxiliary it does so quite differently from English: it’s mostly a way of avoiding using the inflected forms of lexical verbs in the future and past tenses; it has no emphatic sense in positive statements and is not required in negative statements or questions. And it isn’t used for the present tense. Apart from these minor points, the usage is exactly like English.

    Furthermore, quite a lot of this seems to be a late development in Welsh as well, though admittedly written Welsh in the past has always tended to be highly conservative and unrepresentative of real contemporary speech, so I don’t think I can press that point too far …

    McWhorter somewhere claims that do-support is so cross-linguistically exceptional that its presence in English just has to be due to a substratum; but his premise is incorrect. It’s not that exceptional at all. I think we discussed this before, and David M came up with examples in German …

    Even if he weren’t wrong, alarm bells go off for me whenever somebody couches an argument in the form “this linguistic feature is so weird that it can only be due to [researcher’s particular pet hobby-horse].” A lot of bad arguments seem to take this form …

  111. David Eddyshaw says

    Middle Welsh does use “do” very often in periphrasis for past tenses, though. And Breton seems to use its “do” verb a lot as an auxiliary too. (drasvi may well know more about that than I do.) So I may well be underestimating how old this is in Brythonic. It still all seems about as different from English as it could be, given that it all involves “do” as an auxiliary.

    But another point against my own position could be that do-support in positive statements doesn’t seem to have had the current emphatic sense originally, when the thing was getting started in English: so the negative/interrogative requirement for do-support part is a later intra-English development, and that particular striking difference from Welsh would not itself be an argument against the substratum idea. (On the other hand, if your evidence gets so attenuated that all Brythonic and English have in common is “they both developed the use of ‘do’ as an auxiliary verb in some rather different contexts”, this doesn’t seem like particularly powerful evidence for a substratum …)

  112. David Eddyshaw says

    But I am perplexed with the claim.

    Güldemann does a pretty good job of explaining it in the actual paper that I linked to.

    I was favourably struck especially by the fact that his argument is not all made-up thought-experiment stuff: he points to actual real languages that may preserve intermediate steps along the way to the Bantu system.

    (It’s not a million miles away from what I suggested above for Mande.)

  113. David, sorry, I was in “thinking aloud” mode and I meant the claim that transitions to SOV need diffusion. I only recently was able* to read the full paper. Thinking aloud happens to me, but in this case what I wrote looked as if I am perplexed by the claim about Bantu:(

    As to diffusion, I of course share everyone’s doubts.

    I am perplexed because the claim is interesting and it is, by its nature, statistical. “Statistical”: “We have a sample of size N of languages {a, b, c, d, …, zx, zy}, and in this sample we see this many transitions of this kind and this many transitions of that kind”.

    Yet famous people support this claim or object to it, and no one ever speaks about the sample size. The same here. A good objection is: “the sample is not representative” or “it is representative, but the stats are different”.

    *well, I could do it earlier, but I expected much of the paper not to be very interesting.

    (It’s not a million miles away from what I suggested above for Mande.)
    Yes, I noticed this:)
    (I mean, Güldemann’s claim is interesting too:-E)

  114. Do-support elsewhere in Germanic is discussed starting here.

    However, however* do-support became a prominent feature of English, it has evolved substantially in ways that are particularly characteristic of English. The behavior of the verb do has been assimilated to be almost exactly the same as that of an inherited Germanic modal. Besides taking bare infinitives, it can be used in negations and to open questions. About the only modal feature missing** is that it is still inflected in the third person singular: We say, “It does,” not, *“It do,” in standard English. So however modal do entered the language, it has undergone a lot of English-specific evolution. This may make it challenging to distinguish whether the construction was inherited from Germanic, and do-support evolved in parallel with the other surviving Germanic modals, or if it was calqued from Celtic then regularized to match English’s other modal constructions.

    * Heh.

    ** I suppose there is also that the past tense is spelled “did,” not “dould.”

  115. Unify modality and moods, regularize it to “I shall – I should, I call – I could, I will – I would, I doll – I dould”, classify “reality” as a subtype of irrealis.

  116. Last I checked, SOV has a known “primary” source in creolization and/or pidginization. And I do find intriguing McWhorter’s idea (oh hey that guy again) that some particularly isolating language groups indeed originate via ancient creolization, by now also followed e.g. by DeLancey for Sino-Tibetan. Was this discussed at Chez Hat already or is my memory confusing this with some other venue?

    founding separate language families, is that possible?

    Mostly agreed. People do not quite “found separate language families” though (unless they’re going full Tolkien), they merely extend whatever language family they already were a part of.

    Solid nonexistence of Amerind really most believably requires some supposed members to rather have non-American relatives, in the way how Na-Dene is probably most closely related to Yeniseian or Eskaleut is probably more closely related to Uralic and allies than anything American. It’s neither enough for multiple languages to have entered America, for multiple languages to have been present in Beringia, for multiple languages to have entered there, nor even all three taken together: we even require that at least one lineage (out of at least two) incoming from Siberia and going all the way into the Americas to have left other descendants elsewhere. We are not out of non-preposterous candidates for this yet though (e.g. Nikolaev’s proposed Nivkh–Algic–Wakashan would accomplish exactly this).

    On the third hand yet though, it seems to me that even some situation where (for the sake of completely making things up), Amerind+ splits into Paleoamerind+ and Neoamerind+, of which Paleo splits into Kamchukotkan–NivkhAlgicWakashan and Coastal Amerind, while Neo splits into Ainu and Inland Amerind … would still be almost the same claim as the existence of Greenberg’s Amerind. Or maybe, since we probably don’t want to nail down any of his reconstructions, “Amerind with Greenbergian borders”.

    Now if you were to get back to me instead about the existence of something like Quechumaran–Hokan–Austronesian–Turkic which is moreover a sister group to Arawakan–Zuni–Macrosiouan–Salishan–Eskaleut–Burushaski (but to the clear exclusion of anything else in the Americas), sure at that kind of a point I would grant Amerind to have been shown to be complete fiction.

  117. I had a look at Greenberg’s Amerind vocabularies for the first time in decades. I have to say, glancing at any random page in that book, or at the Ruhlen-revised Amerind Etymologies book, makes my eyes sweat on the inside. It is awful, rotten, execrable, at a glance. There are some long-ranger speculative papers which make you at least stop and think, and say to yourself, “now, I wonder if that supposed etymology has something to it.” LIA is not like that. It’s just Bad, far worse than I’d remembered.

  118. marie-lucie says

    Y, I could have said exactly the same! (Perhaps I know the Ruhlen revision under another name?).

  119. Y: … makes my eyes sweat on the inside.

    David Eddyshaw can, of course, say more authoritatively, but this sounds to me like a potentially very serious medical issue.

  120. marie-lucie says

    Such are the effects the Amerind books have on unsuspecting linguists.

  121. David Eddyshaw says

    This is merely the familiar phenomenon of Ocular Sauna (from *stakna.) Should be fine. It’s just the body’s way of protecting itself by preventing the harmful images from reaching the brain and causing terminal addling.

    If you read Sapir’s Language for a bit the problem will resolve rapidly without permanent sequelae.

  122. SFReader says

    would still be almost the same claim as the existence of Greenberg’s Amerind

    Of course, Amerind dated to 15 000 BP and Amerind dated to 20 000 BP are going to differ in terms of possibility of reliable reconstruction.

  123. Solid nonexistence of Amerind really most believably requires some supposed members to rather have non-American relatives
    Demonstrating that (with your further conditions that I didn’t quote for the sake of brevity) would most clearly show that there was no Amerind. But it is not a necessary condition – Amerind would also be an invalid grouping if the concerned languages belonged to families that branch out of Proto-World at the same level as all other existing high level families, or would go back to languages that sprang up at different points if you don’t assume monogenesis, even if none of the “Amerind” groupings had relatives outside America.

  124. maidhc says:

    “But how likely is it that a dominant language would pick up just structure from a substrate language and not any vocabulary?”

    I didn’t ask that question to cast shade on the idea, but because I don’t know the answer and I wanted to know what the collective expertise would have to say.

    I found SFReader’s response particularly enlightening, so thank you for that.

    ——-
    maidhc: before accepting SFReader’s response, it would be good to consider my reaction to that response (30 July 2021, 8:57 am):

    @SFReader “Irish English borrowed many Irish constructions including even the Irish perfect tense (but there is relatively little Irish vocabulary in Irish English).”

    The latest edition (2013) of Terence Patrick Dolan’s Dictionary of Hiberno-English: The Irish Use of English has 292 pages. Discounting the preliminaries (= American English front matter), that would still leave a significant number of pages devoted to Irish English vocabulary of Irish origin.

    Can you estimate the number of Irish English constructions of Irish origin after discounting those, if any, limited to the English of native speakers of Irish?

  125. PlasticPaddy says

    @Suzanne
    Since no one responded to you in a specific way, I think we are talking about a substrate and particularly about people whose L1 is the substrate but who try to convert or at least convert their children to speakers of a second language. This process can end (or stabilise) in various ways. The end we are talking about is where the converted speakers influence their new L1 (or remain separate from other L1 speakers of it and preserve their peculiarities). Since no one has been exhaustively recording the state of their new L1 over time and since the substrate is often a dead and unwritten language, it is an intractable problem to make any kind of statistical analysis of the influence of the substrate on the L1, apart from identifying vocabulary clusters or niches (e.g., names of birds, trees, borrowed agricultural or industrial, etc., terms with no etymology and sharing some peculiar feature). So I do not think anyone is really trying to do this kind of statistics (or if they are, it will most certainly involve a series of unprovable assumptions or an improper use of the mathematical tools). Even in your case of Hiberno-English it would be difficult to the point of intractability. For instance is ’tis/twas’ (drop of initial i in 3p declarative copula forms) a reflex of Irish (i)s é/í/ea or a survival of an English habit attested in writing but not in current speech?

  126. Suzanne:

    before accepting SFReader’s response, it would be good to consider my reaction to that response

    I considered all the responses. My original question was something like “Here’s this theory about something that happened ~3000 years ago, is it even plausible?” I take some of the answers to be “Well, yes, it is plausible, but it is pretty difficult to tell what happened ~3000 years ago”. Linguistically of course, because archaeology can tell us a lot about physical objects, but not about language.

    Can you estimate the number of Irish English constructions of Irish origin after discounting those, if any, limited to the English of native speakers of Irish?

    Well no, obviously not, because I don’t have the resources to do such a thing.

    But I think the question is more like “Will the Corkonians of the year 3021 still be talking about crubeens”?

    Maybe they would. I’d love to be around to find out.

  127. On 30 July, SFReader said, “In most cases, speakers of dominant languages are not colonial settlers with dominant position in the society, but rather opposite – colonized people forced to switch to the language of their oppressors. It is very logical for them to speak the dominant language using a lot of syntactic structures from their own language”,

    and backed up that generalization with one example (“Case in point:…”).

    My reaction on reading the generalization and the case in point was that (1) neither had been proven and (2) “Extraordinary claims require extraordinary evidence” (Carl Sagan).

    Since a comment board is hardly the place for proof of the generalization, I decided to ask only about the example: “Can you estimate the number of Irish English constructions of Irish origin […]” (30 July).

    On 31 July, maidhc said, “I found SFReader’s response particularly enlightening,” which I took to be an expression of agreement with the generalization and the example.

    I found maidhc’s agreement unusual since no one had adduced evidence that the generalization or even just the example was right. Hence my comment on 31 July calling attention to my skepticism expressed the day before.

    PlasticPaddy says (1 August), “Since no one has been exhaustively recording the state of their new L1 over time and since the substrate is often a dead and unwritten language, it is an intractable problem to make any kind of statistical analysis of the influence of the substrate on the L1.”

    Maidhc says on the same day, “Well no, obviously not, because I don’t have the resources to do such a thing.”

    Are the three of us now agreed that no proof of the generalization or of the example has so far been adduced on this message board?

  128. families that branch out of Proto-World at the same level as all other existing high level families

    That would already amount to having non-American relatives. Though I do not think a single Babelesque megapolytomy at the root of Proto-World or even Proto-Exo-African to be an especially compelling hypothesis a priori.

    (Which also suggests that before worrying about Amerind too much, it would be better to work on finding some smaller groupings that can be reconstructed more productively. E.g. I don’t even know which are the primary N-M families and if any of them look like there could be more to be done on them? — Reading the Zamponi paper right now though.)

  129. David Eddyshaw says

    @Suzanne:

    There’s been a lot of work relevant to this topic done with creoles. Etienne has particular expertise in this area, I think, but until he shows up:

    There is a school of thought that the English-lexifier Atlantic creoles are basically Fon but with English words. A notable proponent of this idea has been Claire Lefebvre (joint author of my least favourite grammar of a West African language ever.)

    This idea actually appealed to me at one stage, because if you know any West African language you tend to be immediately struck by how much Nigerian Pidgin (for example) fits in syntactically. If the idea were valid, it would be a perfect example of the phenomenon in question: vocabulary almost entirely from the colonial language, syntax pretty much entirely from the substrate.

    However (like a lot in creole studies) this is controversial. On the one hand, a lot of features typical of creoles seem to be widely shared regardless of the origin of the creole, and may thus just reflect some basic blueprints for how human beings repair a partial language like a pidgin. (It just so happens that Fon belongs to a zone of West African languages that happen to be rather like that anyway; John McWhorter elaborated a theory that this meant that they were themselves ultimately of creole origin!)

    Another point of controversy has been over the question as to how far the lexifier language already deviated from the standard that people were wrongly using as a point of comparison. This (I think) has been a big issue with Haitian Creole, much complicated (alas) by political concerns.

    There’s a related (but distinct) concept of “semicreolisation.” This represents languages as having got simplified somewhat if their histories at some point involved large-scale adoption by adults with other L1s. This is basically what McWhorter thinks has been going on with English, and that this “explains”, for example, the relative morphological simplicity of English compared with its close relatives.

    “Colonised people forced to switch to the language of their oppressors”, by the way, is only one point on a scale which goes from slaves completely abducted from their own linguistic environment to free citizens who just want to communicate with people who live farther away than the next village and learn the local lingua franca perfectly voluntarily (if, perhaps, imperfectly.)

    As I say, I hope Etienne will chime in on this, as he knows much more about it than I do.

  130. PlasticPaddy says

    @Suzanne
    What do you regard as proof and what do you do with statements like “the Sun will rise tomorrow”, which is only suitable to statistical analysis if one makes the unprovable assumption that the observed result for a statistically significant number of other days can be extrapolated to tomorrow?

  131. families that branch out of Proto-World at the same level as all other existing high level families

    That would already amount to having non-American relatives.
    Maybe we are talking past each other. I understood you that way that only the following situation would mean that there was no Amerind (groupings just for illustration, not postulating any of them):
    (Amerind 1 + Sumerian) (Amerind 2 + Ainu) (Dene-Caucasian) (Nostratic) etc. => language groupings that are claimed for Amerind are related closer to some old world family than they are to other language groupings that are claimed for Amerind.
    What I was saying is that also in the following situation no Amerind would exist:
    (Amerind 1) (Amerind 2) (Dene-Caucasian) (Nostratic) etc., i.e. various high-level groups claimed for Amerind branch neither with other “Amerind” groupings nor with old world groups. In both cases the Amerind languages are related with old world languages, but in the second only through the upmost node.

  132. Lars Mathiesen says

    @J Pystynen, when I said “founding separate language families,” I meant in the context of the Americas. Also, if Beringia was genetically isolated for 10K years, even if we could know how many language families it supported, the likelihood of any of those that reached the Americas surviving elsewhere must be pretty low. Maybe groups with better toolkits took over their ancestral lands, maybe the ones left behind chose to adopt another language — and then Beringia sank beneath the waves, preventing us from finding traces there.

    If we actually find a connection like Dene-Yeneseian we should count ourselves lucky, and maybe lend more weight to theories of population movement that make it more likely. (In Physics I they taught us about the maximum likelihood estimator — a logical fallacy, but sometimes useful).

  133. I agree that your (Hans’) scenario would also mean that Amerind does not exist, it just has the practical problem that it’s basically impossible to positively demonstrate polytomies from only knowing a set of daughter lineages.

  134. “and maybe lend more weight to theories of population movement that make it more likely.”

    It is just Bayes’ theorem…

    (Amerind 1 + Sumerian) (Amerind 2 + Ainu) (Dene-Caucasian) (Nostratic) etc. => language groupings that are claimed for Amerind are related closer to some old world family than they are to other language groupings that are claimed for Amerind.

    This would prove polyphyly of the grouping – and also considerable age of the most recent ancestor.

  135. What one needs is either large European-African language area (and then you’ll have to explain why Celtic) or just an Atlantic area extended to Morocco. An Atlantic area in Europe is archaeologically plausible, Morocco-Spain is geographically plausible. The next question is what genetics says.

    Genetics says no. That may not have been clear ten years ago, but it is now.

    Actually there was a study from two sites in Morocco (5000 and 3000 BC) and one site in Iberia (5000) that found European admixture in the younger Moroccan site.

    (it is not the direction that everyone wanted of course:-)).

  136. David Eddyshaw: You called?

    Okay, I guess I will have to move from a very interested reader of this thread to an active participant.

    1-On your comment of July 29 at 10:233, specifically: “There is no reason to suppose that Insular Celtic passed through a SVO stage on the way to VSO”: I beg to differ. Among the Continental Celtic languages Gaulish is basically SVO and Celtiberian is basically SOV: because SOV seems to be the Proto-Indo-European order the Celtiberian order is taken to be the older, Proto-Celtic one, with Gaulish SVO being assumed to be an innovation. Because Gaulish shares a number of innovations both with Insular Celtic in general and with Brythonic Celtic, whereas the total number of shared innovations uniting Celtiberian with Insular Celtic is zero, it is perfectly reasonable to suppose that the shift to SVO was likewise originally shared between Gaulish and Insular Celtic.

    2-On your reply to Suzanne: First of all, Claire Lefebvre has mostly (only?) worked on Haitian Creole as opposed to other creole languages, and her “theory” that Haitian creole is in fact a language with Fongbe grammar and French vocabulary has not met with general acceptance within or without creolistics.

    Second of all, a core reason for this non-acceptance of her theory involves the fact that creole languages show strong similarities world-wide, *whatever the substrate language(s) involved*, making it heuristically difficult if not impossible to see substrate languages as having played a key or even an important role in creole genesis, since these substrate languages differed far more radically from one another than creole languages do.

    Third of all, whenever you do find indubitable examples of outside influence upon a creole, these are typically features which entered an already existing pidgin or creole, and NOT something which played a key role in the genesis of said pidgin/creole.

    A well-known example of this is Saramaccan, in Suriname: this is a creole language with a large number of African (Gbe) features, including some basic vocabulary: but comparative data from other Surinamese creoles (Sranan, Ndjuka) indicates that all three creoles must have had a common ancestor (which may have been a pidgin or a creole, we do not know) with mostly English-origin morphemes, with Saramaccan owing its Gbe elements to later contact. Another, less well-known example involves the Portuguese creoles of the Gulf of Guinea (Sao Tomense, Principense, Fa d’Ambu and Angolar): All four must have a common ancestor, and the most heavily African-influenced of these creoles, Angolar, owes these African elements to later contact, and not to “Proto-Gulf of Guinea Portuguese Pidgin/Creole”.

    Fourth, the notion that non-standard varieties of the various European languages, in colonial times, were already creole-like to a degree is simply a consequence of a certain school of “thought” within creolistics which vehemently and (to my mind) irrationally rejects pidginization as an explanatory tool, and which thus will use anything other than pidginization to explain creole structure. It is wholly circular, of course, since the evidence (leaving aside the creole languages themselves) that colonial-era non-standard forms of the European languages were significantly creole-like boils down to…well, nothing, actually.

    Fifth, “semi-creolization” is even harder to define than “creolization”, and I for one see “semi-creolization” as being quite unlike large-scale L2 acquisition: the latter phenomenon need not involve pidginization, whereas to my mind the former does, differing from “creolization” TOUT COURT in yielding a speech variety typologically intermediate between the European language and a creole. But, to repeat myself, this is my own take: many a competent creolist will disagree with me on this point.

    3-On the notion of an Afroasiatic substrate in Insular Celtic: the absence of any Afroasiatic loanwords in Celtic is not in and of itself enough of an objection: Far more serious to my mind is A-The fact that VSO seems to have emerged very late in Insular Celtic (see above), B-The fact that no evidence of Afroasiatic place-names anywhere in the British Isles (or, indeed, Western Europe) has been found, making these Afroasiatic speakers amazingly discreet compared to other substrate language speakers, C-The fact that Basque shows no trace of the Insular Celtic-Afroasiatic shared features, and seems to lack a stratum of Afroasiatic loanwords too.

  137. But there were some substrate language speakers. Irish people are local, there was no replacement of population.

    no evidence of Afroasiatic place-names
    I think you have Semitic in mind or “known families of AA”: there are features that let one tell if a language is Afroasiatic or not (not always:)), but they do not help reconstruction, even hinder it.

    If AA worked like Indo-European, Semites would have invented historical linguistics in the Middle Ages, and we would have forgotten this occupation, pedantic and pointless, together with astrology. Drawing tables of sound correspondences. Brr….

  138. Если гора не идёт к Магомету….

    If Irish English is not accepted and my English не принимается*, let’s try Russian Russian.

    Russian has been claimed to be enriched by Finnic grammar but not vocabulary. But such borrowings can not be proven. People will make a ball out of the burden of proof and play ping-pong (as it happened in this thread with other topics).

    I like Irish English, because we can see the process…


    *I am not tempted to code-switch when I am addressing native speakers. Why would I use words that no one understands? So maybe we also can look at the question from human perspective.

  139. In the Iberian peninsula, the sequence looks roughly as follows: original Paleolithic inhabitants followed by arrival of Neolithic settlers (ultimately from Levant via Asia Minor, Greece and Italy) followed by Vasconic speakers (the latest dates I’ve seen attribute their arrival sometime in the 3rd millennium BC) followed by Celts (starting from 1200 BC) followed by Romans (from 206 BC).

    This, of course, leaves no room for direct contact between Celts and the first Neolithic farmers. There is a gap of approximately 2000 years between them.

    So regardless of whether the first Neolithic settlers of Western Europe were Afro-Asiatic or Hattic speakers (or something else entirely), we won’t know that from Celtic languages – they came too late.

  140. There’s a related (but distinct) concept of “semicreolisation.” This represents languages as having got simplified somewhat if their histories at some point involved large-scale adoption by adults with other L1s. This is basically what McWhorter thinks has been going on with English, and that this “explains”, for example, the relative morphological simplicity of English compared with its close relatives.

    (I note @Etiene’s poo-pooing the idea of “semi-creolisation”.)

    If speakers are simplifying, why is it morphology that gets simplified as opposed to word-order? If the “other L1s” are highly inflected or highly agglutinative, won’t continuing that seem ‘simpler’ than introducing a bunch of auxiliary verbs and discontinuous elements? (Like English ‘does’.)

    What “other L1s” does McWhorter think English-creole-speakers came from? All the same (family)?

  141. About 20 thousand Norman French came with William the Conqueror and they (or rather their children) eventually switched to English.

    While numerically small, they had disproportionate impact on English language, because they formed the new elite of England.

    Even larger numbers of Danes/Norse came in preceding two centuries and settled on eastern coast of England with similar results.

  142. David Eddyshaw says

    @AntC:

    I hold no brief for McWhorter’s views on English, which strike me as interesting speculation at best.

    “Semicreolisation” is a controversial concept, but some sensible people do in fact believe in such processes, as Etienne implies. I first came across the idea in Jeffrey Heath’s grammar of Timbuktu Songhay, though in a context where Heath himself promptly tells us that he discarded the notion as oversimplifying what’s happened to make Western Songhay distinctive.

  143. SFReader: About 20 thousand Norman French came with William the Conqueror… Previously, Anglo-Saxons occupied the east coast of Britain and gradually subsumed the kingdoms of the north and west, the Hen Ogledd (“Old North”), the Brittonic-speaking region of what is now northern England and southern Scotland, around 700-900 AD. This may have influenced the speech of the Anglo-Saxons, but the Danes coming in to most of the same area had a much larger influence. The border determined by the Treaty of Alfred and Guthrum in 886, according to a BBC documentary, still influences speech patterns in modern England within a boundary of a few miles.

    In the Iberian peninsula, the sequence looks roughly as follows: original Paleolithic inhabitants followed by arrival of Neolithic settlers (ultimately from Levant via Asia Minor, Greece and Italy) followed by Vasconic speakers (the latest dates I’ve seen attribute their arrival sometime in the 3rd millennium BC) followed by Celts (starting from 1200 BC) followed by Romans (from 206 BC).

    What I get from my admittedly limited reading is that the story of the Vasconic languages is not at all clear, and at any rate did not penetrate into Gaul and Britain. So the Celts might possibly have encountered the Neolithic peoples in these places. And (from previous discussions here) Iberia when the Romans arrived was a linguistic patchwork, including languages that may have vanished without trace.

    As I said earlier, I don’t know that a definitive answer may be possible, but I’m just trying to understand different theories.

    I must say, though, this thread is very interesting. I get a lot out of hearing a variety of viewpoints.

  144. Lars Mathiesen says

    Bayes’ Rule — well, that one is true by the definition of conditional probability as long as that’s what you do, what scares me is the remark on the WP page that on the right hand side, the likelihood function can be substituted for the probability “because L(A|B) = P(B|A)” — and that’s exactly what I called a sometimes useful logical fallacy above.

    Concretely, we may prefer the story that gives the highest likelihood of finding Dene-Yeneseian, but that doesn’t mean it’s sensu stricto more probable than others. We would need to know a lot more about how probable each outcome is given the (unknown) starting conditions to say that. Ultimately, you can’t calculate probabilities without a known starting state or some thorough chaos.

    (I’m probably not a Bayesian then, but all the explanations of Bayesianism that I’ve seen have made my eyes sweat on the inside).

  145. Lars, it reminded me how my friend asked another friend of ours to solve the Monty Hall problem* (she was offering it to everyone).

    This other friend is a flute player and his relationship with math is… crazy. It is not “bad”, it is another world. Once he was counting money (he needed to return the change from a sum spent on wine and he wanted to return as little as possible) and his reasoning made me spend some 5 minutes laughing like crazy. So when my other friend said she wanted to try it on him, I did not know what it would look like, but knew it would be fun. She said she would explain everything to him with experiments.

    He, of course, offered an explanation that she did not accept, and she proceeded to experiments (she loves them). In the Russian formulation it is not goats, so we used teabags and boxes. The guy was guessing everything right. When after the Nth experiment she yelled “how!!! How do you do it! Explain!” he said, “if there is a bag in a box, there is a probability, and if there is no bag in the box then there is not a probability”.

    (I know, it is not fun to read – it was fun to observe – but I believe he understands more about probabilities)


    *

    Suppose you’re on a game show, and you’re given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what’s behind the doors, opens another door, say No. 3, which has a goat. He then says to you, “Do you want to pick door No. 2?” Is it to your advantage to switch your choice?
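    For anyone who would rather run the experiment without teabags, here is a minimal simulation sketch of the game as stated above. It assumes the car is placed uniformly at random, that the host always opens a goat door other than the contestant's pick, and that he chooses at random when he has two options. The names `play` and `win_rate` are made up for this illustration.

```python
import random


def play(switch, rng):
    """Play one round; return True if the contestant ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)    # assumption: car placed uniformly at random
    pick = rng.choice(doors)   # contestant's first choice
    # Host opens a door that is neither the pick nor the car,
    # choosing at random when two such doors exist.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Switch to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch, trials=100_000, seed=0):
    """Estimate the fraction of rounds won with a fixed strategy."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    print("stay:  ", win_rate(switch=False))
    print("switch:", win_rate(switch=True))
```

    Under those assumptions the estimates should settle near 1/3 for staying and 2/3 for switching; a contestant who somehow knows where the bag is falls outside the model.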

  146. David Eddyshaw says

    Statistics is a useful way of quantifying our ignorance.

    Unfortunately, you need to make guesses about the nature of our ignorance in order to get anywhere …

  147. I always guess that the nature of our ignorance is deep.

  148. David Eddyshaw says

    Seems a safe assumption …
    (On average, of course.)

  149. I haven’t gotten a goat yet.

  150. David Eddyshaw says

    But there is another method …

    (I’ve just spent a happy quarter of an hour trying to remember who used these words of Olive Schreiner’s as an epigraph. Eventually got it, Guglo juvante.)

  151. Context (and I thank you for introducing me to it):

    Human life may be painted according to two methods. There is the stage method. According to that each character is duly marshalled at first, and ticketed; we know with an immutable certainty that at the right crises each one will reappear and act his part, and, when the curtain falls, all will stand before it bowing. There is a sense of satisfaction in this, and of completeness. But there is another method–the method of the life we all lead. Here nothing can be prophesied. There is a strange coming and going of feet. Men appear, act and re-act upon each other, and pass away. When the crisis comes the man who would fit it does not return. When the curtain falls no one is ready. When the footlights are brightest they are blown out; and what the name of the play is no one knows. If there sits a spectator who knows, he sits so high that the players in the gaslight cannot hear his breathing. Life may be painted according to either method; but the methods are different. The canons of criticism that bear upon the one cut cruelly upon the other.

    Excellent stuff. Who used it as an epigraph?

  152. David Eddyshaw says

    John Berryman; to his wonderful Dream Songs.

  153. Ah, of course. Thanks.

  154. (Why do I have two copies of The Dream Songs when so many poor wights have none?)

  155. @AntC: That should be “Etienne’s pooh-poohing.” “Poo-pooing” is something else entirely, and totally unsanitary for a non-fetish blog.

    (Blackadder on the subject of pooh-poohs.)

  156. WP: “A pooh-pooh (also styled as poo-poo) is a fallacy in informal logic that consists of dismissing an argument as being unworthy of serious consideration.” Wiktionary has both meanings for “poo”.

    But Baby Talk is a good topic, I think. Why does everyone (Russian, Dutch, Latin…) have /kak/ and those guys across the strait have /pu/? Is it AA substrate?

  157. David Eddyshaw says

    Welsh has cachu, so that must be the AA substrate …

    The only possible conclusion is that Russian has an AA substrate too. Do you have agriculture in Russia? That would explain everything …

  158. John Cowan says

    “Semicreolisation” is a controversial concept, but some sensible people do in fact believe in such processes, as Etienne implies.

    Indeed, I will be the fool (who rushes in where Etienne is careful about treading) and say that I think that if you look at Standard French, Reunionnais, and Morisyen, you have a standard language, a semicreole, and a creole respectively.

    why is it morphology that gets simplified

    Because other people’s morphology is (very broadly stated) unlearnable. Cf. Trudgill’s tale of Norfolk (UK) English and the Spanish Inquisition.

    I’m probably not a Bayesian then

    Distinguo: Bayes’s theorem is true. Bayesian inference is a useful tool. Bayesians are fanatics. Bayesianism is a false religion.


  159. Or is it generalization of the paradigm of “pee”?

    Can I publish “On the grammatical category of excretion” and sign the paper “Tartaretus”?

    With further “new insights in the origin of grammar in baby talk”, with references to Tartaretus-1, p 22, Tartaretus “On the grammatical category”, and Chomsky (as myself)? Or is it too functionalist?

  160. Or is it generalization of the paradigm of “pee”?
    Cf. Russian по-большому ‘in a big way’ and по-маленькому ‘in a little way’. Now I am reminded about the distinction masc./fem. o/e discussed in the Omotic paper. Associating sounds with sizes and genders is a cross-linguistic thing, of course, but he also connects it to Afroasiatic k/t.

    Do you have agriculture in Russia? – We used to, but we migrated to cities. I do not know. My friend’s husband’s native village is half-Yazidi by now. They even sent “a million” to Syria, before Russian intervention.
    I guess a million in Russian hacksilber (~30k), but anyway: I do not think it is money that their women received as lone mothers (they do not marry.. officially, so they all are lone mothers), so must be agriculture or pastoralism…

  161. P.S. I just learned what is “hacksilber”, and importantly what it looks like.

    And decided to write “hacksilber” instead of “rouble” (because it is the translation of the Russian word, rouble literally means ‘hackle’)

  162. I just imagined “rouble” as a silver wire or a silver bar carried and conveniently hacked into pieces of proper size before payment, when you want to use it as a currency.

    I did not imagine that instead you can hack a beautiful cover of gospels from a monastery that you have pillaged or a bowl of multi-coloured silver of very fine work with pictures.

    I know that usually we melt it instead and it is even more barbarian, but as usual, it is much less obvious

    And making a commodity out of unique and pretty things and increasing their value because no one needs unique things! Mindblowing.

  163. Lars Mathiesen says

    I have a feeling that the Monty Hall “paradox” could be used to expose the difference between Bayesian reasoning and non-, and that might throw some light on why it’s so hard to wrap our brains around the “proper” solution when faltering on the narrow path of keeping the symbols we manipulate devoid of meaning. But I have -44 minutes left before bedtime, so it will not happen tonight.

  164. Ah, about my friend: Only once he changed his decision. He picked a box, Nina (it is easier with names) removed one of the boxes, and he still insisted on opening the first box, and the bag was there.

    Once he changed his decision, and the bag was there (in the box Nina opened) too.

    Needless to say, “the right” solution that Nina aspired to demonstrate with experiment was thoroughly disproven, experimentally (which satisfied everyone anyway).

    It is another property of this problem: omniscient people will always play against the rules (they have to) and win anyway.

  165. The only possible conclusion is that Russian has an AA substrate too. Do you have agriculture in Russia? That would explain everything …

    https://upload.wikimedia.org/wikipedia/commons/2/2d/Neolithic_Expansion.gif

    See the black arrow across the Caucasus.

    I wonder if we could find an AA substrate in China.

  166. Lars Mathiesen says

    Actually, doing the math — in the case of Monty Hall, the probability of the car being behind a specific door once you know who opened what doors (assuming that it’s placed randomly at the start) is proportional to the likelihoods of those doors being opened given a specific placement of the car. So it’s not a Bayesian fallacy that makes this so counter-intuitive.

    (Actually you have to make assumptions about the behaviour of both host and contestant, specifically that they pick a random door when they can, otherwise the likelihoods will be undefined or skewed. But here’s a Kunstgriff in the most non-pejorative sense: because the situation is totally symmetric you can average over all permutations of the door numbers, and the behaviours of the contestant and the host will cease to matter — as long as the probability that they actually pick a door is 1).
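    The proportionality argument above can be checked with exact fractions. This is only a sketch under the stated assumptions: a uniform prior on where the car is, and a host who opens a random goat door other than the contestant's pick. The function name `posterior` and the door labels 1 to 3 are illustrative choices, not anything from the thread.

```python
from fractions import Fraction


def posterior(pick, opened, doors=(1, 2, 3)):
    """Posterior probability of the car's position, given the contestant's
    pick and the door the host opened (uniform prior, random host)."""
    prior = Fraction(1, len(doors))
    likelihood = {}
    for car in doors:
        # Doors the host may open if the car is behind `car`.
        options = [d for d in doors if d != pick and d != car]
        if opened in options:
            likelihood[car] = Fraction(1, len(options))
        else:
            likelihood[car] = Fraction(0)
    # With a uniform prior the posterior is just the normalised likelihood.
    total = sum(prior * lk for lk in likelihood.values())
    return {car: prior * likelihood[car] / total for car in doors}


if __name__ == "__main__":
    for door, p in posterior(pick=1, opened=3).items():
        print(f"door {door}: {p}")
```

    For pick 1 and opened door 3 the likelihoods are 1/2, 1 and 0, which normalise to 1/3, 2/3 and 0, so switching to door 2 is favoured two to one under these assumptions.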

  167. Etienne: devastating response re Celtic/AA, but have you come across
    https://www.caitlingreen.org/2016/12/punic-names-britain.html?

    She’s an archaeologist, the people she quotes are place name specialists; I don’t know what the comparative linguisticians of LH would think of their expertise.

  168. It’s well beyond my ability to gauge the linguistic argument, and hard not to believe chance would lead to a few such similarities, but it’s worth pointing out that most (all?) who advance such theories think the names and coins are a result of Punic trade with Celtic Britain, maybe even minor colonization, but not a Punic population predating the Celtic and providing an A/A substrate.

    I would suggest one merely statistical point. Latin etymologies survive in many 3+ syllable place names in Britain. It’s interesting that nearly everything she points to is a 1-syllable word, a level at which coincidences are much more likely. And if there were Punic survivals, I’d be surprised that the common Semitic geographical term ras wouldn’t have left traces somewhere, given how common it is in other Semitic coastal settings, and how common the usage is in English and insular Celtic (Penzance, the Pen-y-fan at Dinas Head, which you should all visit, in Pembrokeshire, from Penfro).

  169. David Eddyshaw says

    Coates is no charlatan, certainly, though I am perhaps biased by his sharing my scepticism about Celtic substrata, e.g.

    https://web.archive.org/web/20160314053126/https://www.sussex.ac.uk/webteam/gateway/file.php?name=rc-britons.pdf&site=1

  170. David L. Gold says

    Caitlin Green speaks of “numismatic evidence for contact between North Africa/the Mediterranean and pre-Roman Britain.”

    Yes, contact, but of what kind? The presence in place X of a coin struck in place Y is evidence for one of the following:

    (1) migration of one or more persons from place Y to place X and settlement there,

    (2) the temporary presence of a trader or a merchant from place Y in place X,

    (3) the passing of a coin struck in place Y by any number of traders and/or merchants until it ended up in place X (say, a trader in Greece passes a Greek coin to a trader in Rome; the trader in Rome later passes it to a trader in Marseille, etc., until the coin ends up, say, in London, where it is lost and remains buried until an archeologist finds it).

    It is easy to find statements such as the following in the research literature:

    “Four thousand one hundred Roman coins have been found in Gotland alone” (Dina P. Dobson, “Roman Influence in the North,” Greece & Rome, vol. 5, no. 14, February 1936, pp. 73-89; the quotation is from page 79; the article is available on JSTOR).

    In “Byzantine Coins in Viking-Age Northern Lands” (in F. Androshchuk, J. Shepard, and M. White, eds., Byzantium and the Viking World, Uppsala, 2016, pp. 117-139), Marek Jankowiak notes that some 400,000 Islamic dirhams from the ninth and tenth centuries have been found in northern Europe (p. 117).

    The Sundveda Hoard (in Swedish, Sundvedaskatten) consists of 482 coins, one of which is Carolingian and the others were struck in North Africa, Iran, Russia, the Arabian Peninsula, and northern India. Sundveda is near Stockholm.

    If the location of coins were always definite proof of settlement, we would have to recognize that scores of peoples settled in scores of places where they in fact did not.

    Can Caitlin Green prove that the coins struck in North Africa and the Mediterranean and found in pre-Roman Britain are evidence of settlement and not of trade?

    P. S. After writing the foregoing, I saw the first paragraph in Ryan’s comment above.

  171. PlasticPaddy says

    Here are my thoughts (for what they are worth) on the first few places in the list.
    General
    Are we talking about (a) exonyms supplied by travellers or ship captains to mapmakers for illiterate and sparsely populated islands and coastal regions or (b) names given by Punic settlers who left when the tin played out or were absorbed by later populations?

    The Isle of Thanet, Kent — Tanatus, Tanatos, Tenet, Tanet, originally probably *Tanitā or similar…readily explicable as a Phoenician/Punic island-name ‘Y TNT, meaning the ‘Isle (of) Tanit’, the chief goddess of …Carthage
    1. RED FLAG: GOD NAME WITH NO GOD ARTEFACTS (Carthaginian coins are not good enough or even said to be found on the island)
    2. WHERE DID INITIAL ‘Y GO?

    Rame Head, Cornwall —…would make good sense as a derivative of the Semitic height-word *rām, compare Ramat Gan, Israel, and Ramallah, Palestine (Proto-Semitic root *rwm),
    1. CITED COMPARANDA HAVE 2ND ELEMENT

    Sark — compare Modern Arabic šarq, ‘east’, which would give good sense as Sark is the easternmost and outermost island of the Guernsey group.(13)
    1. NO COMPARANDA CITED
    2. IN PUNIC WOULD THIS ELEMENT BE AT THE END (cf. Inis Oírr), OR THE BEGINNING (cf. Eastbourne, Westham)?

    Echri (Flat Holm, Severn Estuary) — …an island-name involving Proto-Semitic *’ħr, ‘behind, back’

    1. NO COMPARANDA CITED
    2. IS ‘Y MEANT AS THE 2ND ELEMENT?
    3. EVERYTHING IS BEHIND SOMETHING ELSE.
    4. BICONSONANTAL ROOT (so coincidental match more probable than with triconsonantal root)

  172. David L. Gold says

    Caitlin Green writes, “Sark — Sargia, Serc, Serk. No known etymology in insular/European languages. The only credible explanation is an origin in the Proto-Semitic root *śrq, ‘redden; rise (as of the sun); east’, compare Modern Arabic šarq, ‘east’, which would give good sense as Sark is the easternmost and outermost island of the Guernsey group.(13).”

    Sark is indeed the easternmost of the Guernsey group, but Jersey, a far larger island than Sark, is the easternmost of the Channel Islands, so that if the easterness of any of those islands attracted the attention of name-givers, it would have likelier been big Jersey and not little Sark.

  173. David L. Gold says

    @PlasticPaddy. “Rame Head, Cornwall —…would make good sense as a derivative of the Semitic height-word *rām, compare Ramat Gan, Israel, and Ramallah, Palestine (Proto-Semitic root *rwm),1. CITED COMPARANDA HAVE 2ND ELEMENT.”

    They could successfully counter-argue that a second element is not needed because the first morpheme in Ramat Gan and Ramallah can be used as a free-standing place name meaning ‘High Place’.

    They would be right, but they would still not have proven their etymology of Rame in Rame Head.

  174. David L. Gold says

    Caitlin Green writes, “Bute — Botis in the Ravenna Cosmography. The name is root-identical with Proto-Celtic *butā, British *bot-, ‘dwelling’; however, Richard Coates considers the word *butā/*bot- to be, in fact, a direct borrowing from Proto-Semitic *but-, ‘hut’, and therefore suggests that this island-name could well be itself another surviving Proto-Semitic island-name in the Hebrides, meaning ‘dwelling island’ or similar, given the others discussed here.(22).”

    Would the fact that an island was inhabited be a strong enough stimulus to name it ‘hut’, ‘dwelling’, ‘dwelling island’, “or similar”?

  175. David Eddyshaw says

    The Welsh name of Flat Holm appears to be not Echri but Echni in Real Life.

    https://cy.m.wikipedia.org/wiki/Ynys_Echni

    Even if it had been Echri, I immediately thought of ochr “side”, which strikes me as no less implausible than “back.” (According to GPC, ochr itself is perhaps borrowed from Middle Irish, and the root was *ak “sharp, edgy.” “Rocky” seems as good a name for an island as any, if a touch unimaginative. I must say that on first principles Irish seems more likely than Punic as underlying a name for an island in Môr Hafren …)

  176. For Sark, WP has a good discussion of the weaknesses in Coates’ argument, and suggests a Norse etymology.

    Glancing at Green’s list, before looking at the details, two alarm bells go off: First, why would Phoenician names survive in the Outer Hebrides? What reason would there be for Phoenicians to explore these places, and establish a presence strong enough for their names, rather than the Celtic/Pictish names, to take hold? Second, the willingness to go back and forth in the etymologies between “Punic/Phoenician” and Proto-Semitic sounds like someone stuck their hand in the candy jar of etyma and couldn’t get it out.

  177. The only credible explanation is an origin in the Proto-Semitic root *śrq

    I automatically stop paying attention to people who say or write things like that. “The only credible explanation” is just a way to try to bully you into believing their hobbyhorse without evidence.

  178. J.W. Brewer says

    Compare Green’s “only credible explanation” to Coates’ own wording: “In Coates (1991: 73–6), I regarded the ultimate source of Sark as unknown; its early attestations suggest a root *Sarg-. One might compare PrSem *śrq ‘redden; rise (as of the sun); east’ (cf. Arabic šarq ‘east’). Sark is of course the easternmost, and outermost, island of the Guernsey geological group.”

    This helps set up his socko conclusion “with whatever diffidence these suggestions are put forward, Proto-Semitic at least provides something to consider in relation to insular toponymy, and I shall suggest below some further evidence that points in the same direction.” The man doesn’t know how to write clickbait, apparently.

  179. J.W. Brewer says

    @Y, Coates’ theory is not Punic traders in the Hebrides in the first millennium BC but the notion that the first resettlers of that area after the glaciers retreated (millennia before Celtic-speakers arrived) came up from Iberia and spoke whatever was spoken in Iberia back then, with “something Afro-Asiaticish” being offered as a plausible candidate. So when Celtic-speakers finally arrived much later, there were some local toponyms they didn’t replace.

    Note the “diffidence” in the title of his paper you can access here. https://yorkspace.library.yorku.ca/xmlui/handle/10315/3642

  180. bully

    “From 1530, as a term of endearment, probably a diminutive ( +‎ -y) of Dutch boel (“lover; brother”)”
    I did not know.
    I wonder if “to bully someone into” and also similar “to cow” are caused by association with bulls…

  181. How much detail is actually known about the Carthaginian presence in Cornwall? They were famously secretive about their insular tin sources, and my impression is that the extent of their colonization may have been difficult to establish.

  182. J.W. Brewer says

    By the way, it has come to my attention that certain toponyms in the northeastern U.S. (e.g. Bethlehem, Pa., Canaan, N.Y., Salem, N.J., etc.) appear unlikely to have West Germanic or even PIE etymologies. Nor do they look Algonquin. Could the Carthaginians have sailed so far that we ought to look at some proto-Semitic roots to see if that might help illuminate their derivation?

  183. They were famously secretive

    That’s just it. The fact that there’s no trace of a famously secretive people somewhere, proves that they were there.

  184. JW: I see. I got confused as to which of Punic and Proto-Semitic was the “serious” source, and which one was the “fun” one.

  185. Etienne: devastating response re Celtic/AA, but have you come across
    https://www.caitlingreen.org/2016/12/punic-names-britain.html?

    I still think it makes sense to be careful with bashing.

    1. The original question by maidhc was about Afro-Asiatic rather than Semitic.
    2. The original question was about plausibility, not about proofs.

    Etienne’s objection already apparently refers to Semitic (else I would love to know if Omotic is Afro-Asiatic: if we can easily recognize “AA” names, there should not be any problem in assigning a language).

    Maybe it is better to distinguish “AA” vs. “Semitic” and “plausible” vs. “proven” or there is a risk of (unintentional) drift: “Semitic is unproven => Semitic is implausible => AA is implausible”.

    With a conclusion that pre-Indo-European could not have arrived in Europe from the south-east with the Neolithic population, it surely must have come from a different direction, independently of the Neolithic population.

  186. J.W. Brewer says

    There are two steps in this sort of toponymic endeavor. Step One is to try to identify toponyms that were in use by Celtic-speakers upon the arrival of English-speakers that appear to be of pre-Celtic (or at least non-Celtic, which as shown below is not always the same thing) origin. Step Two is to try to figure out that origin, keeping in mind that there is no a priori reason to expect all such oddities in a particular region to be of the *same* origin or from the same time period. (E.g. because we know the relevant history it is pretty easy to figure out the non-Celtic etymology of “Gaelic” toponyms of Old Norse origin, even though the Norse came after the Gaels and then left again.) I’m not even entirely sure how easy or reliable Step One is, frankly. A claim that “*sarg” doesn’t have a plausible Insular-Celtic etymology we’ve thus far managed to figure out is not quite the same thing as a claim that “*sarg” e.g. just doesn’t fit the phonotactics of Insular Celtic, or only fits into a special phonotactic category stuffed with loanwords.

    I suppose I should add that in the other direction it’s possible that certain pre-Celtic toponyms went through an eggcornish transformation after centuries in the mouths of Celtic-speakers that made them Celtic-looking enough that Step One won’t detect them.

  187. David Eddyshaw says

    One of the few words that you can guarantee that all Welsh learners will remember, bwrw (as in “Mae hi’n bwrw sglodion”) comes from *CVrg-

    https://en.m.wiktionary.org/wiki/bwrw

  188. David L. Gold says

    Having read the rest of Caitlin Green’s proposed etymologies, I see that little would be added to the discussion because they are similar in nature to the ones on which several of us have commented above: all are remotely possible and none is immediately convincing.

    Were we a Scottish jury, we should return a verdict of Gun dearbhadh, No pruiven, or Not proven.

  189. John Cowan says

    you would need to be careful to define your criteria for markedness

    See Haspelmath 2006, a perfectly delightful takedown entitled “There are twelve different senses of markedness in linguistics, and we really have good alternative names for all of them that make any sense, so what the hell, people?” (Well, not really, but that’s what it should be called.) In the case of markedness #4, phonetic difficulty, there are 8 different symptoms of this: neutralization, typological implication, frequency, allophonic variation, phonemic differentiation, instability under assimilation, and appearance in epenthesis. Do all these line up reliably? Of course not.

  190. David Eddyshaw says

    I don’t think it’s a huge theoretical problem for the particular purpose I had in mind; you could do something as brutally simple as looking at the overall frequency of phonemes in each language in question and then seeing where the phonemes in 1st/2nd person pronouns fell in the league table for that language.

    Admittedly, you’d have to decide whether you meant overall frequency in a collection of representative texts, or in the lexicon; but comparing the two might well be interesting in its own right. (In Kusaal, you would find that /f/ was very common in texts, but not very common in the lexicon, because it occurs in few words, but they’re very common words.* Same with /ð/ in English.)

    * A fortiori /h/, which occurs as an unequivocal phoneme in exactly one word (leaving aside proper names of foreign origin and words like hee “hey!”): hali “until, as far as, even, very.” But hali is a very commonly used word indeed.
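    The “league table” comparison could be sketched roughly as follows. Everything here is hypothetical: the toy word list and its counts are invented, not Kusaal data; each character stands in for one phoneme; and the helper names `phoneme_counts` and `league_table` are made up for illustration.

```python
from collections import Counter


def phoneme_counts(word_counts):
    """Return (text_counts, lexicon_counts) for a {word: text frequency} map.

    Simplifying assumption: each character of a word is one phoneme.
    Text counts weight every word by how often it occurs in running text;
    lexicon counts treat every word as occurring exactly once.
    """
    text, lexicon = Counter(), Counter()
    for word, n in word_counts.items():
        for phoneme in word:
            text[phoneme] += n
            lexicon[phoneme] += 1
    return text, lexicon


def league_table(counts):
    """Map each phoneme to its rank (1 = most frequent).

    Ties are broken alphabetically so the result is deterministic.
    """
    ordered = sorted(counts, key=lambda p: (-counts[p], p))
    return {p: i for i, p in enumerate(ordered, start=1)}


# Invented toy corpus: one very common word contains the only /h/.
toy_corpus = {"ha": 50, "tama": 3, "kita": 2, "maki": 1, "tika": 2}

text_counts, lexicon_counts = phoneme_counts(toy_corpus)
text_rank = league_table(text_counts)
lexicon_rank = league_table(lexicon_counts)

for phoneme in sorted(text_rank):
    print(phoneme, text_rank[phoneme], lexicon_rank[phoneme])
```

    In this toy corpus “h” lands near the top of the text table and at the bottom of the lexicon table, which is the kind of mismatch described for /f/ and /h/ above. A real study would need a phonemic transcription and a proper tokeniser rather than raw characters.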

  191. Athel Cornish-Bowden says

    A bit like y as a Spanish vowel. I can only think of one word in which it is a vowel, namely y “and”, possibly the most common word in the language. I’m not counting proper names, like Ynez.

  192. But that’s just a question of writing; the discussion is about phonemes, and written y there represents /i/, a very common phoneme in Spanish.

  193. David Eddyshaw says

    Apparently /ɫ/ in Standard Arabic occurs exclusively in the name of God, الله /ʔaɫˈɫaːh/.

    https://en.wikipedia.org/wiki/Arabic_phonology

  194. A very thought-provoking paper on the topic of Semitic *l and Arabic /ɫ/ from Alice Faber, whose work is always essential reading:

    https://doi.org/10.2307/604335

    (A. Faber, “On the Nature of Proto-Semitic *l,” Journal of the American Oriental Society 109: 33-36, 1989.)

  195. David Eddyshaw says

    Interesting, and quite persuasive; her analysis would mean that Arabic الله was a conservative form which had resisted a change of /ɫ/ to /l/ everywhere else in the inherited lexicon.

    Amharic /l/ is not velarised; I have no information on other Ethiosemitic languages.

  196. Annette Pickles says

    The only credible explanation is an origin in the Proto-Semitic root *śrq, ‘redden; rise (as of the sun); east’, compare Modern Arabic šarq, ‘east’

    Nonsense! Sark is “pirate island”! 🏴‍☠️🏴‍☠️🏴‍☠️

    The root is Proto-Semitic *šrq “to steal” (Arabic sāriq “thief”, Akkadian *šerqu “stolen goods”, etc.). After all, Sark was the base for the medieval pirate Eustace the Monk, and piracy was an ongoing problem there until the middle of the 16th century, when Elizabeth I granted it to Helier de Carteret, Seigneur of St. Ouen, on the condition that he keep it free from pirates.

    This is more attractive than an etymology from *śrq “to rise”, which loses much of its charm for Sark when you rewrite the root as *ɬrkʼ or *t͡ɬrkʼ, using the actual phonetic values of the consonants that most Semitists agree on. At least *šrq has PS (probably a simple [s] rather than [ʃ]), and it can be securely reconstructed for Proto-Semitic. For *śrq “rise”, no Akkadian cognate to the West Semitic forms is known, at least to me.

    Or maybe Sark is the “partner island” to Guernsey and a reflex of the root *śrk (West Semitic only too) that is seen in Arabic šarika “to share, partake, be a partner”, Ugaritic šrk “to team up with, join”, and Aramaic srk “to adhere, be attached”?

    Or was the original Sargia formed in early medieval times from Latin sargus “seabream, vel sim.” (Greek σαργός), like, say, Sheepshead Bay in Brooklyn, after the sheepshead seabream Archosargus probatocephalus? (But not, alas, like Latin Sardinia, which is not from sardīna, Greek σαρδῑ́νη.)

  197. J.W. Brewer says

    I agree that *šrq is a more attractive proto-Semitic root. My only hesitation is that it’s hard to rule out the possibility that there’s an even more plausible Proto-Macro-Vasconic root that could have eventually evolved into “Sark,” what with Proto-Macro-Vasconic not being reconstructed very well.

  198. David Eddyshaw says

    Sark is (of course) “Love Island” (cf Welsh serch.) This probably is a consequence of it being the Island of شرك‎ /širk/, as Annette Pickles suggests above. A sort of Bali in the Channel …

    EDIT:

    Hah! I missed the undoubted true origin, what with everybody here thinking only of Semitic within Afroasiatic: the name is from the Hausa sarƙa “chain”, because the Channel Islands are an archipelago. It’s obvious in hindsight.

  199. David Marjanović says

    I don’t have time to comment on most of this thread today, so…

    A very thought-provoking paper on the topic of Semitic *l and Arabic /ɫ/ from Alice Faber, whose work is always essential reading:

    The argument for a “dark” *l is good enough, and I wholeheartedly endorse the call for phonetic pedantry in the interpretation of attested dead languages and in reconstruction. But I’m surprised the third explanation isn’t mentioned: that Allah is the god, *|ʔalˈʔlaːh-|, shortened in some way or other from *|ʔal ʔiˈlaːh-|. The same process that turned the ejectives into “emphatics”, and that has turned Classical Arabic /r(V)ʔ/ into [rˤ] in most or all modern varieties (e.g. /raʔs/- “head” > /rˤas/), would have turned /lʔl/ into the one-word phoneme /lˤː/.

    Nonsense! Sark is “pirate island”! 🏴‍☠️🏴‍☠️🏴‍☠️

    I love it – Sark is a tax haven today.

    But Sargia from sargus seems like a slam-dunk, especially because I suppose it would explain the vowel of the French form, Sercq. …but wait, why isn’t it *Serge?

  200. Annette Pickles says

    Sark is (of course) “Love Island” (cf Welsh serch.)

    Yes! I was thinking “Concubine Island”, specifically… Middle Breton serch “concubine”. Some chieftain on the mainland stashed his side piece there.

    Proto-Macro-Vasconic not being reconstructed very well

    J.W. Brewer, that’s actually the first place after Semitic I looked in making this labored little joke about methodology! Unfortunately, I only have a pdf of Trask and Wheeler’s Etymological Dictionary of Basque at hand, and the only thing I found in it to riff off of with s- or z- was sare “net” with saroi “sheep pen”. But hey…

  201. David Marjanović says

    I note my browser has stopped putting flags in italics.

  202. David Marjanović says

    “sheep pen”

    Shetland, Sark, whatever.

  203. Annette Pickles says

    the name is from the Hausa sarƙa “chain”, because the Channel Islands are an archipelago. It’s obvious in hindsight.

    Of course! I’m so dense. Typical vice of mine, taking Proto-Semitic as basically Proto-Afroasiatic.

  204. But seriously, Hausa sarƙa brings us back to *šrq again. Lane has the following, something like “pillory, iron chains”, under srq:

    سَارِقَةٌ sing. of سَوَارِقُ, which signifies Collars by means of which the two hands are confined together to the neck, called also جَوَامِعُ, (O, K, TA,) of iron, attached to fetters or shackles. (TA.) ―And the pl., سَوَارِقُ, signifies also The adjuncts (زَوَائِد) in the catches (فَرَاش [q. v.]) of a lock. (Ibn-‘Abbád, O, K.)

    Is this the source of the Hausa? I feel like I am missing something in the relationship between the Arabic word and its root, like it’s a loanword or something.

  205. …but wait, why isn’t it *Serge…
    Because then it would be “love island”.

  206. David Eddyshaw says

    Is this the source of the Hausa?

    Pretty certainly, yes. Hausa has lots of Arabic loanwords, including for quite everyday things, and the phonology is exactly what you’d expect; Arabic /q/ is consistently rendered with the ejective ƙ.

    I agree that that intra-Arabic etymology looks rather improbable. Part of the familiar tendency to force verbal roots on hapless primary nouns that is traditional in Semitic etymology, perhaps …

    Not that it adds to the discussion, but the Hausa sárƙà: itself has been widely borrowed in Western Oti-Volta in the sense “prison”, as with Kusaal sāregá – a word that is interesting also in that its tones show that, like most loans from Hausa, it must have been borrowed before the development of word-internal H tone spreading in Agolle Kusaal. Now if only I had some way of dating the borrowing …

  207. “Concubine Island”

    Some chieftain on the mainland stashed his side piece there.

    “sheep pen”

    What are you saying?

  208. What happens in Sark stays in Sark.

  209. David Eddyshaw says

    Given that we now have demonstrated that the name of the island refers to concubines, sheep and chains, that is probably for the best.

  210. David L. Gold says

    @LH I automatically stop paying attention to people who say or write things like that. “The only credible explanation” is just a way to try to bully you into believing their hobbyhorse without evidence.

    If nature abhors a vacuum, so too do certain students of etymology: they cannot bring themselves to say “of unknown origin.”

    “I maintain, as Eric always did, that it is better to guess than to be silent” (Anthony Burgess on Eric Partridge’s etymologies; the passage is from Burgess’s The Ink Trade. Selected Journalism 1961-1993, Carcanet Press Ltd., 2013).

  211. J.W. Brewer says

    @D.L.G.: One key problem is the slippage between A, who thinks “better to guess than be silent,” and B, who then subsequently transmutes A’s guess into “the only credible explanation.”

  212. Sark is clearly from *Snark with loss of nasal caused by everyone having a cold.

  213. David Eddyshaw says

    The /n/ has softly and suddenly vanished away.

  214. “Mae hi’n bwrw sglodion”

    “Mae hi’n bwrw cwrw” might be preferable to students accustomed to rhyming by earlier “Dw i’n hoffi coffi.”

  215. David Eddyshaw says

    But that is unsuitable for schoolchildren. The proprieties must be observed.

  216. Good point, and they’re perhaps not drinking coffee either, though my expectations have not been updated for the Starbucks era and loose definitions of coffee.

  217. The gist of Haspelmath’s paper was:

    Linguists are too accustomed to talking in terms of “vanilla vs strawberry” when they really should be talking about “vanilla vs chocolate”, “vanilla vs neapolitan”, “vanilla vs mango”, “vanilla vs rum and raisin”, “vanilla vs mint”, and many others.

  218. But who did invent this scheme:
    1. raining verb has an argument
    2. adverbial pragmatically, nominal semantically, idk what syntactically
    3. the argument is demonstratively absurd

    ?

    (Wikipedia has: mae hi’n bwrw hen wragedd a ffyn and ou vrouens met knopkieries reën)

  219. Maybe even “vanilla vs French vanilla”.

  220. How about “vanilla vs vanilla bean”?

  221. David Eddyshaw says

    @drasvi:

    A Brythonic substratum in Afrikaans! I knew it!

    “Old women and sticks” are much more likely than “cats and dogs”, at any rate. Also less likely to offend PETA.

  222. Mae hi’n bwrw dynion! Halleliwia!

  223. Mae hi’n bwrw sglodion So this would be understood by a Welsh speaker to mean “it’s raining chips” and not “she’s throwing chips”? Or is it ambiguous without context?

  224. Lars Mathiesen says

    In Danish, when there is heavy rain with large drops you used to say det regner skomagerdrenge, supposedly because it looks like the drops skip along when they hit the cobblestones, as shoemakers’ apprentices notionally do. We don’t have any really weird meteorological phenomena, though, at least not on the language side.

    It may even be in Hans Christian Andersen somewhere.

  225. The /n/ has softly and suddenly vanished away.

    The nightingale spirited off with it.

  226. For me, the canonical version of “The Nightingale” is the Fairy Tale Theatre episode starring Mick Jagger.

  227. John Cowan says

    The old Latin grammarians said that the highly defective verb pluit ‘it’s raining’ had the implicit subject Iūppiter!

    In Lojban, however, the subject is the precipitant, thus one says ‘snow rains’, ‘hail rains’, etc., and simply ‘rains’ for default precipitation. Saying “the rain rains” would be more like “that which rains, rains”, since Lojban nouns other than names are formed from arguments of verbs.

  228. David Eddyshaw says

    In Kusaal it’s a bit similar: you say Saa niid nɛ “Rain is raining.” However, I’m cheating a bit: although saa does mean “rain”, it has a broader semantic range, encompassing the sky itself when regarded as the source of cloud-related weather: Saa tansid nɛ “Rain is shouting” means “It’s thundering” and Saa ian’ad nɛ “Rain is jumping” means “There is lightning”; saa zug “on top of rain” means “sky.”

    Interestingly, you can’t say *Li niid nɛ “It’s raining”, even though meteorological “it” is fine in general, e.g. Li tʋl “It’s hot.” And you can say Li niid nɛ sakʋdʋg. “It’s raining a heavy rain.” (Sakʋdʋg is literally “old rain”; I don’t know how it got to be an idiom for “heavy rain.” Maybe it’s just an accidental homophony, and nothing to do with “old.”)

    [“It’s threatening to rain” is Saa kʋʋd nɛ, where the verb is exactly homophonous with the verb “kill”; sadly, comparative evidence shows unequivocally that this is not some picturesque metaphor of deep anthropological significance, but just a case of regular historical phonological changes causing two quite distinct verbs to fall together in sound.]

  229. David Eddyshaw says

    Come to think of it, the “threaten to rain” verb kʋ̄ may well be related to the kʋdʋg element of sakʋdʋg “heavy rain”; the corresponding Mooré verb kʋ́ɩ is glossed “of rain, gather in big quantity (clouds)” in Niggli’s dictionary. If it is, this kʋdʋg would actually be different tonally from “old”, though the difference would be neutralised in citation forms.

  230. I think one of the most interesting questions about precipitation verbs is whether they can be used with arbitrary complements. With Bartholomew Cubbins on my mind, I am reminded that the canonical example of something “different” raining down from the sky is oobleck (the literary original, not the cornstarch mixture). So the question becomes: Which languages can translate “raining oobleck” without semantic or pragmatic problems?

  231. John Cowan says

    Certainly Lojban has no trouble with ‘raining cats and dogs’ (lit. ‘cats and dogs rain’), and ‘hailing’ is likewise lit. ‘hail rains’, which unfortunately spoils the joke:

    Q: What’s worse than raining cats and dogs?
    A: Hailing taxis.

    Perfect puns are impossible in Lojban anyway, by design.

  232. In Russian, even water cannot rain. дождь/rain is a noun and the corresponding verb дождить means that the weather is in a generally rainy state, but doesn’t mean any particular drops of water are falling down. As an action, dozhd’ either “falls” or “goes” or “pours”, but it doesn’t rain.

  233. David Eddyshaw says

    “Cold”, as in weather, is similarly always a noun in Kusaal (and the rest of Western Oti-Volta): “It’s cold” is

    Waad bɛ. “Cold exists.”

    Which is an odd asymmetry with Li tʋl “It [i.e. the weather] is hot”, where tʋl is a predicative-adjectival verb. Language is so complicated.

  234. @David Eddyshaw, if it’s not too much trouble I am still curious whether Mae hi’n bwrw sglodion would be understood by a Welsh speaker only to mean “it’s raining chips” and not “she’s throwing chips”.

  235. Does Kusaal have separate verbs for more degrees of heat, e.g. ‘warm’, ‘cool’, ‘scorching’? ‘Humid’? If so, does it verb them, or are they nouns?

  236. David Eddyshaw says

    Mae hi’n bwrw sglodion

    I suppose it could mean “throwing chips”, but it would be an odd way of putting it. Apart from the meteorological use, bwrw turns up in quite a number of idioms not to do with literal throwing, and elsewhere tends to mean “throw out/away/down” or “beat.” The ordinary verb for “throw” is taflu.

  237. J.W. Brewer says

    Is “it’s raining oobleck” really more of a canonical example than “it’s raining men (hallelujah)”?

    Surely there’s a difference between (i) whether an arbitrary/invented noun can be slotted into a particular construction; and (ii) whether a well-known noun that wouldn’t ordinarily fit the literal semantics/pragmatics of the construction can be slotted in via poetic license?

  238. David Eddyshaw says

    Does Kusaal have separate verbs for more degrees of heat, e.g. ‘warm’, ‘cool’, ‘scorching’

    For “scorching” I think you’d just say “very hot”: Li tʋl hali.
    “It’s cool” is Li ma’as, with a verb.

    There are about sixty imperfective-only verbs in Kusaal, which denote stances, relationships or (as here) predicative adjective senses. Either Kusaal preserves this conjugation better than any other Western Oti-Volta language, or the grammatical descriptions of other languages have just missed the fact that they are a distinct morphological category; unfortunately most of the extant grammars really aren’t all that complete or reliable, and for several languages there’s not even that much, so that is all too possible.

    However, there is a good, though short, grammar of Farefare by the late Prof Esther Kropp Dakubu, and a reasonable one of a slightly different dialect of the same language by Urs Niggli which between them seem to show that Farefare has preserved the group much less well than Kusaal, despite being pretty conservative in other respects (preserves the whole grammatical gender system, and has two formally distinct imperfective flexions in the main verb conjugation, both features preserved elsewhere only in the highly divergent language Boulba, which has fallen among Eastern Oti-Volta languages in Benin and lost its Western Oti-Volta innocence.)

    Not all Kusaal adjectives have corresponding predicative verbs by any means; none of the colour adjectives do, for example. I don’t know if these are just accidental gaps or whether there is some underlying principle at work. The existing predicative-adjective verbs are used all the time; there doesn’t seem to be any indication that the construction is obsolescent, even though there are other ways of expressing that meaning using the copula verb (in fact, Kusaal adjectives can only head noun phrases when they are used as predicative complements.)

    The much more remotely related Oti-Volta language Nawdm has a great many verbs (hundreds, at least) corresponding exactly in formation to this Kusaal conjugation (once you recognise the relevant consonant correspondences, which AFAIK have not been described in the published literature to date.) It seems clear that its relatively well represented status in Kusaal is due to preservation rather than innovation, at any rate.

  239. How do you pronounce Nawdm?

  240. David Eddyshaw says

    /naʊdəm/

    This follows the standard orthography of the language itself, which omits ə often rather confusingly, including word-finally where it’s tone-bearing (e.g. gweedmb [ɡɥèːdə̀mbə́] “sell”, gerund.)

    Moba orthography does the same, but not the standard orthography of Bimoba, which is either a very closely related sister language or another dialect of the same language, depending on whether there is an “r” in the month. Bimoba, spoken on the Ghana side of the border, seems to have adopted pretty similar orthographic conventions to its neighbour Kusaal. It makes the languages/dialects look misleadingly different when written.

    At one stage I actually used an orthography rather like this for Kusaal in my own work, but decided in the end it was preferable for all sorts of reasons to keep as close to the existing standard orthography as possible consistent with marking all contrasts properly. The orthography of my grammar is basically the same as the 2016 Bible with added diacritics where necessary, except for word division, where the existing conventions are extremely misleading. (And one extra vowel symbol: I still don’t understand why they added a sign for /ʊ/ in the 2016 reform but not for /ɪ/.)

  241. David Marjanović says

    The old Latin grammarians said that the highly defective verb pluit ‘it’s raining’ had the implicit subject Iūppiter!

    Still in my schoolbook (late 90s): “(actually ‘he lets it rain/has it rain’, sc. Iupiter)”.

    Either vowel length or consonant length, BTW. Here is a thorough investigation of the littera rule, which turns out to have an analog in Kölsch, and the narro rule, which turns out to be unrelated to the littera rule.

  242. David Eddyshaw says

    Regarding SOV order in Niger-Congo, I have rather belatedly discovered that all four of the Eastern Oti-Volta languages, Nateni, Ditammari, Byali and Waama, put personal-pronoun objects (even indirect objects) immediately before the verb, though they are otherwise SVO.

    These languages are certainly part of a Sprachbund in the northwestern Atakora département of Benin, which confuses the issue, but the Western Oti-Volta language Boulba, which has wandered into the area and shows many features of the Sprachbund, doesn’t show this one.

    On the other hand, the Eastern Oti-Volta languages are not particularly close to one another once you factor out the Sprachbund features, apart from Nateni and Ditammari, which clearly form a real subgroup together.

    The local trade language in northern Benin is Dendi, which is a Songhay language, and hence presumably SOV; it seems to be essentially a Zarma dialect, but WP says it’s “heavily influenced” by Bariba, which is SOV, and moreover dominates the eastern half of northern Benin. So areal influences are by no means out of the question, though if that’s the explanation I don’t see why only pronoun objects would have been affected.

    To complicate matters further, Oti-Volta languages have a number of typological features that tend to go with SOV, like postpositions, and possessor-possessed order in noun phrases. It wouldn’t be astonishing in retrospect if it were shown that the protolanguage was once SOV. I suppose the fact that the group has noun class suffixes rather than prefixes (like Bantu and “Kwa”) might be taken to go with that too, though that is not a feature confined to “Gur.”

  243. David Eddyshaw says

    all four of the Eastern Oti-Volta languages

    All five of the Eastern Oti-Volta languages. I just discovered a very nice available-for-download grammar of Mbèlimè by Lukas Neukom. I’d previously had no information on that language at all. I’d more or less assumed it was a Ditammari dialect, but it clearly isn’t, now I’ve got a decent description to look at.

    https://www.comparativelinguistics.uzh.ch/dam/jcr:00000000-17e0-420f-ffff-ffffd5260443/ASAS_18_Neukom_2004_Mbelime.pdf

    It’s good on tone, which pleases me no end. I’ve been trying to get more of a handle on the development of tone in Oti-Volta and finding myself much handicapped by lack of good data from the Eastern languages. The Mbèlimè system seems rather like Gurma, which is more or less what I would have expected on the basis of my understanding so far, viz that WOV-Yom/Nawdm-Buli/Konni share a common major tone-system innovation, so you’d expect Everything Else to be more or less similar. But I won’t really be sure until I’ve figured out how Oti-Volta in general ended up with three-way tone contrasts instead of the Volta-Congo industry-standard two.

  244. That’s a great find — lucky you!
