Elongated Consonants Mark Words?

Dmitry Pruss sent me a link to “Consonant lengthening marks the beginning of words across a diverse sample of languages” (open access) by Frederic Blum, Ludger Paschen, Robert Forkel, Susanne Fuchs, and Frank Seifart (Nature Human Behaviour [2024]); the abstract:

Speech consists of a continuous stream of acoustic signals, yet humans can segment words and other constituents from each other with astonishing precision. The acoustic properties that support this process are not well understood and remain understudied for the vast majority of the world’s languages, in particular regarding their potential variation. Here we report cross-linguistic evidence for the lengthening of word-initial consonants across a typologically diverse sample of 51 languages. Using Bayesian multilevel regression, we find that on average, word-initial consonants are about 13 ms longer than word-medial consonants. The cross-linguistic distribution of the effect indicates that despite individual differences in the phonology of the sampled languages, the lengthening of word-initial consonants is a widespread strategy to mark the onset of words in the continuous acoustic signal of human speech. These findings may be crucial for a better understanding of the incremental processing of speech and speech segmentation.

It sounds plausible; I wonder what the assembled Hatters think. (Thanks, Dmitry!)

Comments

  1. David Eddyshaw says

    Or to put it another way, non-initial consonants tend to be shorter.
    While (as I have often remarked) I am no Ladefoged, this strikes me as unsurprising as a general tendency: it seems to be much the same as saying that non-initial consonants are prone to “lenition”, which is certainly common enough.

    However, the authors seem to be unaware that “word” is very difficult to define cross-linguistically, and the criteria used in different languages by perfectly competent descriptive linguists don’t actually align very well. (This comes up a lot in “Niger-Congo”, where one linguist’s proclitics are often another linguist’s prefixes.)

    “Phonological word” is often different from “morphological word” even within a single language.

  2. David Eddyshaw says

    Levinson’s nice grammar of Yélî Dnye actually uses root-initial lenition as a criterion for word demarcation. I think that this sort of thing is pretty common in descriptions of less common languages, and it makes me wonder how much of this study is actually circular: just retrieving the word-demarcation criteria that the sources came up with.

    (The words for the numbers 2-9 in Oti-Volta are unique in that they always occur with flexional class agreement prefixes; in all other cases, class affixes are suffixes. It was only recently that I realised that this is why the reflex of initial POV *d in the numbers in Western Oti-Volta is /j/, not /d/: this was the regular development of non-initial *d in WOV, so e.g. Mooré (a)yòobé “six” corresponds to Mbelime dúò “six” just like Mooré lʋ̀ɩ “fall” corresponds to Mbelime dī.)

  3. Speech consists of a continuous stream of acoustic signals, yet humans can segment words and other constituents from each other with astonishing precision.

    “Humans” in general can do no such thing. Even one speaker of language A can have difficulties understanding a second speaker of A, when two different dialects are involved. Proof: me trying to understand street-level “French”, or Andalusian or Argentinian “Spanish”, all spoken at 1% the speed of light.

    The way I manage is by not trying to segment words. I allow swathes of sounds to trigger recognition – parts of phrases or expressions, in hindsight, with unclear boundaries. There’s just not time enough to fuss over individual words or “where they start”.

    I can follow an hours-long radio discussion by Andalusian/Argentinian physicists about galaxy formation, without being able to repeat exactly any given sentence. But then I’m not sure I have understood everything exactly. On the whole, though, I feel I have understood enough.

    I have a similar experience when reading the nonsense screeds some developers write in defense of bad code I have flagged in a code review. Here there is no problem determining word boundaries. In fact in order to figure out what they’re trying to say I have to ignore what they actually say.

    I don’t doubt that word-initial scrutineers have something of value to contribute. A widow’s mite.

  4. “Humans” in general can do no such thing.

    I think that there might be an issue of context here. One of the things that always “amazes” linguists is how children manage to make sense of an undifferentiated stream of language, to pick out individual sounds in a literal deluge of meaningless noise. And even how they manage to make sense of different varieties of language — although this might be a separate skill acquired after learning one variety (not sure of this — presumably children pick up their parents’ variety first and other uitlandisch dialects later).

  5. ‘always “amazes” linguists’ – fortunately, I’m not a linguist.

  6. David Eddyshaw says

    Ah, but it is the capacity to be amazed by what others take for granted that makes a proper scientist.

  7. Yes, Chomsky never ceased to be amazed that his niece could learn language but neither his cat nor a rock could. And he never stopped telling people about it.

    But now we have the answer. Neither cats nor rocks can perpetrate MERGE, whereas his niece obviously could.

    Aspersions aside, I agree with your observation. We all know that apples fall from trees, but only a scientist could ask “Why?”

  8. “Why?” said jesting Pilate, and would not stay for an answer. Apparently scientists not only should ask “why”, but also stick around to hear any answers.

  9. PlasticPaddy says

    These are two kinds of ‘why’: the phenomenological why, which is the domain of science, and the ethical or teleological why, which is poorly served by scientific methods.

  10. I think you may have missed the point of my deliberate misquotation. What Pilate asked in Bacon’s essay was “What is truth?” Last I heard, “science” pursues truths – albeit truths of a provisional and best-effort kind, constrained not by ethical scruples but by limited research grants.

    I get the impression from current pundits of science that “why” is out of fashion. “How” is the approved look.

  11. Jen in Edinburgh says

    I am regularly amazed by things that I took for granted, but I am not a proper scientist in spite of that.

  12. PlasticPaddy says

    @stu

    Pilate then went back inside the palace, summoned Jesus and asked him, “Are you the king of the Jews?”…

    Jesus said, “My kingdom is not of this world….
    “You are a king, then!” said Pilate.
    Jesus answered, “You say that I am a king. In fact, the reason I was born and came into the world is to testify to the truth. Everyone on the side of truth listens to me.”
    “What is truth?” retorted Pilate. With this he went out again to the Jews gathered there and said, “I find no basis for a charge against him.”

    The context of Pilate’s question (whatever Bacon’s essay was trying to prove with it) seems to be
    “There is no law against claiming to be the king of the truth or of some self-selected body that claims to be on the side of truth, whatever you or they mean by that. Now let me eat my dinner in peace.”

    So I am not sure where “why” comes into it.

  13. My comment was a snarky follow-up to the immediately preceding one by Bathrobe, ending: “We all know that apples fall from trees, but only a scientist could ask “Why?””

    That’s where “why” comes into it.

    Scientists today prefer to scrump the apples, instead of sitting around fretting over quem ad finem on an empty stomach.

    whatever Bacon’s essay was trying to prove with it

    He wasn’t trying to prove anything, see the first essay “Of Truth” here.

  14. @ Stu, thanks for this link. [I happen to have read a lot of Thomas Browne because someone gave me a book but I had never read Bacon.] :

    “One of the later schools of the Grecians examineth the matter, and is at a stand to think what should be in it that men should love lies; where neither they make for pleasure, as with poets; nor for advantage, as with the merchant, but for the lie’s sake….

    [not to mention politicians]

    But howsoever these things are thus in men’s depraved judgments and affections, yet truth, which only doth judge itself, teacheth that the inquiry of truth, which is the love-making, or wooing of it, the knowledge of truth, which is the presence of it, and the belief of truth, which is the enjoying of it, is the sovereign good of human nature.”

  15. I sometimes seriously think that the 17th century was the tip-top peak of English prose.

  16. downhill ever since…

  17. David Marjanović says

    13 ms is really not much. That’s shorter than aspiration; it’s like a fortis release vs. a lenis release, AFAIK.

    Still, of course, it’s well known that languages with phonemic consonant length usually lack it in word-initial position* and treat word-initial consonants, even though they’re phonetically (rather) short, as long when they apply sound shifts or whatever. But maybe that’s just common resistance to intervocalic-and-such lenition phenomena for two different reasons.

    No time to read the paper – did the authors examine any rigorously prefixing end-stressed languages? …Are there even any?

    * Swiss German is the great big exception. “Geminates all over the word”…

  18. David Eddyshaw says

    Kusaal only has non-initial voiceless plosives as geminates (secondarily degeminated whenever they are left word-final by the short-final-vowel apocope thing.)

    So you actually could make some sort of case for regarding the initial voiceless plosives as being geminates as well, thereby making voice non-contrastive.

    Welsh is quite similar.

  19. To clarify (from the methods section): stops were excluded, as were phonological geminates, and (for technical reasons) sounds shorter than 30 ms.
    I learned from this article about Menzerath’s Law, another universal tendency.

  20. Mbelime dúò “six”
    Clearly a case of inflation.

  21. David Eddyshaw says

    It is a confusion, arising from the old Royal Navy exhortation “two-six”, meaning “all together now, pull/heave!”

  22. David Eddyshaw says

    Menzerath’s Law

    Kusaal has a fairly frequent alternation of CVC with CV:C in roots, where the CV:C allomorph is never found before any derivational suffixes, i.e. it only appears in root-stems.

    Mind you, that’s come about through specific sound changes in proto-Western-Oti-Volta: in verbs, CVC roots ending in alveolars or labials were lengthened to CV:C before a now-lost derivational suffix*, and it was that suffix which was dropped before a second derivational suffix. WOV does that a lot: stems which comparative evidence suggests originally had two derivational suffixes have dropped one or the other. That seems to be a rather different process from what Menzerath had in mind.

    * The “separative-reversive” suffix; it has cognates in Bantu, and everything.

  23. Did Menzerath talk at all about process, or just about its synchronic results?
    His Law says that statistically, longer words contain shorter bits. So, is it that the bits shorten once they’ve accreted to a longer unit? Or do bits become stickier once they’ve shortened, as in content words grammaticalizing, then becoming clitics, then affixes? (-ish; pace Haspelmath.)

  24. David Eddyshaw says

    As far as I can tell (based entirely on WP), Menzerath was indeed just talking about synchronic patterns.

    But while that is doubtless all very interesting in a cor-fancy-that kind of way, it seems a bit timid just to leave it at that: any such regularity cries out for some sort of explanation.

    My motive for citing the Kusaal case (not that I need much in the way of a pretext, admittedly) was to suggest that this sort of pattern is at least sometimes the result of a “conspiracy” of historical changes. Kusaal has almost no unprefixed stems longer than four morae (except in loanwords) and all native four-mora stems end either in derivational *m or *d [and the ones in *d look suspiciously like analogical recreations which have supplanted earlier shorter stems]: other patterns seem to have been ruled out by a quite disparate set of rules which collectively work to limit stem length, like the dropping of one suffix out of most pairs of derivational suffixes.

    Sievers’ actual Law seems to be another instance of an identifiable historical sound change leading to Menzerathicity.

    https://en.wikipedia.org/wiki/Sievers's_law

    I suppose that the right way to look at this might be to turn it around and say that, cross-linguistically, even regular Neogrammarian-pleasing sound changes are more likely to come about if they have a tendency to create forms that are more Menzerathic.

  25. David Eddyshaw says

    did the authors examine any rigorously prefixing end-stressed languages? …Are there even any?

    There are some Bantu languages which have lost the proto-Bantu final vowels, but I can’t think of any where stress is contrastive. (Stress tends to get fairly short shrift compared with tone in descriptions of Niger-Congo languages, though. I wouldn’t be at all surprised if there were Northwest Bantu languages which fit the bill. If not in Bantu, surely somewhere in Niger-Congo. The Gurma language Akaselem has secondarily developed class prefixes, while dropping almost all original final syllables, but I have no data on stress patterns there either.)
