Genbun Itchi.

Matt’s latest post at No-sword brings to my attention the Japanese equivalent to the attempts to reconcile katharevousa and demotic Greek, genbun itchi:

Written Japanese, fundamentally standardized by the eighth century, had undergone sporadic and incremental change prior to the Meiji period, evolving into a collection of documentary, epistolary, and narrative styles that were firmly bound in the classical language. The spoken language, on the other hand, which had developed considerably over the centuries, reflected the multiple dialects and complex hierarchies of contemporary Japan. The disparity between writing and speech caused great concern for Meiji leaders, both because learning the written language took a great deal of time and effort and because it was a barrier to mass literacy. Although Tokugawa literature contained examples of colloquial dialogue, writers and scholars sought a narrative style that was closer to speech yet flexible enough to be used in formal contexts.

Futabatei Shimei is generally credited with the first successful use of a vernacular style in his novel Ukigumo (1887; tr. The Drifting Clouds, 1967). However, Futabatei gives credit for his model of colloquial narrative to rakugo storyteller San’yutei Encho, whose collaboration with Takusari Koki (inventor of sokki, Japanese shorthand) allowed rakugo stories to be published in newspapers. Other writers quickly joined the so-called genbun itchi (unification of writing and speech) movement, to which there was opposition through the 1910s. Publishing houses adopted the new style in their children’s literary journals, such as Akai Tori, and other mass-marketed publications, which led to its widespread adoption. The use of classical written styles continued among some authors, however, for several decades.

With that background, Matt focuses on “the mass of past and/or perfective verb endings”:

That particular part of Japanese reached its peak of complexity during Early Middle Japanese; it’s been a downhill slope of simplification ever since, and today we’re basically down to the –ta ending. But because Early Middle Japanese also served as the model for Classical Japanese, as the centuries rolled on the literary community were expected to master and preserve fine distinctions of a sort that their native language clear-felled and paved over increasingly far back in the mists of history.

He quotes “a marvelous rant on this topic by Ochiai Naofumi 落合直文, from an essay published in 1890 called ‘Shōrai no kokugo’ 将来の国語 (‘The national language/Japanese of the future’),” which I highly recommend. Peevery is ubiquitous and eternal!


  1. David Eddyshaw says:

    There’s a parallel to this in classical Sanskrit, where the old imperfect, perfect and aorist forms are all just used as undifferentiated past tenses. Recherché aorists and the like get put in just to show how well the author has boned up on his Sanskrit morphology. As far as I know there is no tradition of peevery about it, though.

  2. An excellent parallel, thanks!

  3. My guess would be that it just so happens that no Sanskrit peevology has been preserved, not that there was none.

    The switch from Classical Chinese to written Mandarin is of course a very close analogy, and was happening at just about the same time. Other cases are the preservation of Old English “AB dialect” into the Early Middle English period, as well as (in the opposite direction) the probable use of the archaic simplicities of Primitive Irish in ogham as opposed to the complexities of Old Irish in manuscript.

  4. “Other cases are the preservation of Old English “AB dialect” into the Early Middle English period,”

    I saw a paper which I cannot for the life of me find or even remember the title of, that pointed out that AS was so static for the entire 600-year period of its use that it had to have become similarly fossilized as a written standard well before the Conquest. I wish I could find it.

  5. Trond Engen says:

    I don’t read Ochiai Naofumi’s rant (as quoted) as peevery at all, and hardly as a rant. To me it reads as a light-hearted comment on the obsolescence of past pasts.

  6. David Marjanović says:

    AS was so static for the entire 600-year period of its use

    Maybe not at the very beginning. There was a separate rune for, well, ö, but even the oldest Latin-alphabet sources didn’t bother and used e; these oldest English writings in the Latin alphabet retain h between vowels, which is gone from all later ones.

  7. Thanks for the link!

    Trond: It is fairly light-hearted, but still, when in the first paragraph he insists that each ending should be used in its correct context, I think it’s fair to call it peevery. It seems clear that he thought of the then-current situation as erroneous and degenerate, although it’s unclear from the passage quoted how much of the blame for this he placed on individual writers. (e.g. maybe he thought that things were so bad that it was no longer reasonable to expect people to learn the language “correctly”).

  8. Is it really peevery, though?

    Or if it is, isn’t it different than what we usually mean by the term? At least it seems to me there’s a distinction to be made–and not purely of degree–between that guy who interrupts even everyday casual speech to correct you (A: “He don’t—” B: “Doesn’t!!”), and the defenders of orthodoxy in a legacy written register. I’m sure there’s some overlap of sympathies, but I’d wager that not a lot of those card-carrying keri partisans went around actually using keri in daily speech, or demanding those around them do the same.

    There’s a fundamental difference between bullying those who don’t conform to the idiosyncrasies of one’s own speech community, and teachers training students to write in a curated form of language that no one living speaks anymore–though as a practical matter I grant there’s often some alignment between dominant communities and access to the educational institutions that provide said training.

  9. David, maybe you can help me find that paper. The writer was a German linguist working either at Leipzig or some similar institution, and I just cannot remember her name.

    Her thesis was that many of the changes that mark the transition between OE and ME were already in progress in the spoken language from the time of the invasions on, and simply not reflected in written sources, so that their apparent sudden appearance in ME was really not so sudden after all. Specifically she locates changes in the VP – development of the progressive, use of “do” as an auxiliary – in the southwest as a feature of Celtic contact, and changes in the NP – loss of gender on nouns and adjectives – in inland Yorkshire as a feature of contact with Norse in a population that may only have transitioned to any form of English in the 9th century. It sounds plausible on historical grounds at least.

  10. Stefan Holm says:

    The written language is a tricky source to the spoken one. Who wrote anything in the good old days (prior to Internet)? Educated people, who had read what former educated people had written! Look at modern English – how many centuries haven’t passed since high was pronounced (more or less) as the name of last year’s Nobel Prize laureate in physics (without the final ‘s’)?

    My guess is that Old→Middle→Modern English is in all essentials due to the Norman invasion. After all (in spite of e.g. massive Low German influence on Scandinavian) the prosodic and phonetic differences between the modern Gmc languages are minor – with the exception of English, the only one exposed to 400 years of French speaking supremacy.

  11. David Eddyshaw says:

    Modern English resembles German prosodically a good deal more than it resembles French. That’s why Germans (and indeed native speakers of other Germanic languages) usually find it a lot easier to acquire a good English accent than French speakers do.

    The word corresponding to “high” has never sounded like “higg” at any stage in the development of English. The ‘gh’ represented the sound of German “ch” when standard English still possessed that sound.

    According to Randolph Quirk’s grammar of Old English, changes like the loss of short vowel quality distinctions in unstressed final syllables and the change of final -m to -n were complete well before the Norman conquest; the spellings in our Old English texts were already historical by the end of the Old English period.

    The striking simplification of morphology in English compared with, say, German, is in some respects evident even in Old English (no distinction of person in plural verb endings, for example.) In the later period it seems to have proceeded faster in Northern dialects, which might mean that if the languages of foreign invaders had any bearing on this at all, Norse is perhaps a more likely culprit than French.

    The English of Chaucer’s (or even of Shakespeare’s) day would have sounded a lot less unGermanic, I suspect. It’s the Great Vowel Shift and wholesale diphthongisation of perfectly innocent long vowels that makes modern English sound so odd (and screwed up the spelling, of course.) I don’t think the French can be blamed for that.

  12. David Eddyshaw says:

    Incidentally, Welsh, even in the earliest records, is considerably simpler than Irish morphologically. Perhaps it’s the British climate? Or something in the water …

  13. Jim, are you thinking of Hildegard Tristram? She wrote a paper called Why don’t the English speak Welsh?

  14. David Eddyshaw says:

    The development of the progressive in English can hardly be due to Celtic influence. Welsh has never actually made the distinction that English does between “I am singing” and “I sing.” In modern Welsh the same form is used for both:

    Wy’n canu (mutatis mutandis for dialect and idiolect)

    This is periphrastic, like “I am singing”, but the actual meaning covers *both* of the English types. In older Welsh the form was synthetic


    and similarly doesn’t distinguish. (The modern reflex of this, “cana i” is used for the future.)

    This notion that English owes its periphrastic verb forms to Brythonic is a real hardy perennial. It seems mostly to be put forward by people who haven’t paid attention to the fact that although Brythonic does indeed like periphrastic verb forms they aren’t much like the English ones at all in reality.

    Devotees of this idea also seem always to handwave the fact that the characteristic English use of “do” (eg compulsory in the negative and interrogative, optional in the positive, in which case it has an emphatic sense – none of which has any parallel in Brythonic) becomes prevalent long after it’s at all likely that there actually were any bilingual communities in a position to induce a major change in the whole language. Just what *other* evidence is there that the dialect of the Southwest played a dominant part in forming Middle English? (And if these off-the-radar Brythonic speakers were there all along in the heart of England, but the priggish racist Saxon scribes just ignored their speech because they somehow knew that these debased syntactic structures were unworthy, why are there so very few Celtic loanwords in English?)

  15. David Eddyshaw says:

    Thanks, Y – you posted that while I was in the white heat of composition still. It looks like it does indeed acknowledge my areas of disbelief. Not sure I find the arguments very compelling, on a very superficial glance, but it looks worth reading properly.

  16. David Eddyshaw says:

    The paper’s explanation for the time-gap seems to be that written Old English was an artificial elite language which was deliberately maintained in a Germanicised form, while non-elite speakers were already using periphrastic verb forms with “be” and “do”; these surfaced in writing after the Norman conquest because that event disrupted the old elite culture and allowed constructions which had in fact been there all along in the spoken language to see the light of day in writing. (This would certainly help obviate the need for mysterious long-surviving undocumented populations of bilinguals. I don’t think it altogether abolishes the gap.)

    The absence of Celtic loanwords is explained by appeals to creole studies, I think basically saying that Brythonic was so socially stigmatised that vocabulary was not transmitted; syntax, of which people are much less consciously aware, was not affected to the same degree. (I can think of a parallel to this scenario, in the Vaupes in Amazonia, with its compulsory linguistic exogamy, where language mixing is/was strongly discountenanced, but this in practice meant vocabulary, and the languages have converged a lot structurally.)

    The fact that the English do-constructions actually don’t match Brythonic periphrastic constructions very well is sidestepped by suggesting that it was essentially just the practice of periphrasis that was borrowed, and subsequently adapted within English, as for example to make aspectual distinctions which don’t exist in Brythonic. I’m not altogether clear from the paper if the author actually realises that Brythonic doesn’t make the present continuous/plain distinction.

    This is a more subtle set of arguments than I thought. However, I think numerous aspects don’t really stand up too well, and there’s quite a bit of petitio principii going on. The argument that the loss of flexion in nouns in English is due to the absence of declension in Brythonic seems especially farfetched. It was jolly clever of all those Old English scribes to maintain a four-case system in their written English for all those centuries when their spoken language lacked case altogether …

    The insular Celtic languages, incidentally, are very unusual syntactically for Indoeuropean, being VSO, conjugating prepositions and all. Why didn’t some other parts of all this syntactic weirdness get transmitted to English given this proposed scenario?

    I must admit I’m allergic to “substratum” theories in general. It’s a very easy game to play, and few of them seem to attain the kind of rigour which could lead to any sort of independent verification or confutation. If it had so chanced that Ancient Egyptian had been spoken in Britain before the Germanic invasion, it would be simplicity itself to point to all the periphrastic Egyptian verb forms with “do” and “be” and complements formed with prepositions, and draw out their influence on English.

    It *is* a mystery that (according to most, anyhow) Brythonic has left so little mark on English given that the genetic evidence is that there never was any huge movement of Germanic invaders to Britain, contrary to the Our Island Story mythology. So there’s plenty of good motivation for this sort of speculation, whatever one’s opinion.

  17. @ David – some form of “do” periphrasis is very well established in almost all Germanic dialects, including Dutch dialects (although highly stigmatized in standard Hochdeutsch). Isn’t it more logical to assume that proto-Germanic already had some element of “do” periphrasis than to run to Brythonic models? I feel like this has been addressed in some thread here before.

    As to the mystery of lack of substratum, is there any possibility that people inhabiting South-eastern Britain when the Anglo-Saxons showed up were already speaking a Germanic language?

  18. David Eddyshaw says:


    The fans of the theory that English “do” periphrasis has something to do with Brythonic claim that such constructions are cross-linguistically uncommon; as you imply, this is not a terribly convincing thesis. The most convincing and heavyweight name I’ve seen attached to these ideas is that of John McWhorter, who is a pretty eminent creolist and entitled to considerable respect for his views accordingly; however what I’ve seen of his arguments seems to greatly overestimate the similarity between the English and Brythonic constructions, including statements about the Brythonic side which are just plain wrong. I do wonder, too, if when you’re an eminent creolist you tend to see creolisation phenomena in places which are not apparent to ordinary mortals. If all you have is a hammer …

    I believe that there are indeed fringe theorists who claim that at least some of the peoples of Britain in Roman times spoke Germanic rather than Celtic languages. I inadvertently provoked one such person into commenting at great length during a thread on this subject on The Economist’s Johnson blog. It seemed to go with a sort of blood-and-soil racism, and had the usual features of ignoring all the mainstream work on the subject. It’s hard to know where to start with this sort of thing, but at the very least it would seem odd that so many of the Britons in Roman times had identifiably Celtic names if they were all running off home to speak proto-Old English to each other.

  19. David Eddyshaw says:

    Of course Celtic languages themselves are scarcely the Ursprache of the British Isles. It’s not clear when Celtic languages did come to Britain, but it’s actually possible that it was not even all that many centuries before Caesar. So I suppose if you were really determined you could claim that Vortigern and Boudicca and the rest were just a Celtic ruling class over a mass of downtrodden Germanic peasants, who went on to be liberated by their cousins from across the North Sea after the Romans left. Or you could claim that they were lizard people (and they’re still in charge.)

    Talking of substrata, the odd structure of the Insular Celtic languages has spawned many a theory that this could be due to the effects of a pre-Celtic substratum who spoke some VSO language, often then imagined to be Afroasiatic. Given the wholly hypothetical nature of the substrate this gets even more fanciful and prone to circular argument than the Brythonic-underlying-English thing. It also falls foul of timescale problems – Old Irish, for example, shows traces of an older SOV order, and many of the peculiar features shared by Goidelic and Brythonic, like the system of “mutation” of initial consonants, cannot go back to a shared protolanguage but must have arisen separately in each branch, and hence comparatively late. (The initial mutation thing really is strange enough that it beggars belief that its existence in Goidelic and Brythonic is just coincidence – as it can’t go back to the protolanguage it presumably shows that there must have been a great deal of intimate language contact in Britain itself between Goidelic and Brythonic after they split, which is not too easy to understand historically.)

  20. Another question with regards to the (existence or lack of) influence of Celtic on English is whether English actually replaced Brythonic or perhaps Britanno-Romance in the early settlement Areas; the latter would explain the lack of Celtic loanwords. But as far as I remember, the Romance loanwords in Anglo-Saxon are about the same that can be found in Continental Germanic, so seem to already have been brought over from the continent.

    As for explaining the different developments with regards to the retention of Celtic in the early and the later settlement areas, you probably already know “Apartheid and Economics in Anglo-Saxon England”? (Those who haven’t read it – don’t be deterred by the placative word “apartheid”, Mr. Woolf talks about legal inequality, not South-Africa style segregation).

  21. Stefan Holm says:

    I share your general skepticism, David, against the ever popular ideas of substratum influence. But I don’t believe that superstratum influence can be ignored. After all French was spoken at the royal English court and among the aristocrats for several hundred years. To their subjects they though had to speak (Chaucerian?) Englisc. This must have been influenced by foreign (French) phonetics and with a lack of knowledge about the inflectional system of Englisc. It must have had an influence on the development of English.

    A parallel could be found in Scandinavia: The main part of the transformation from old Swedish to modern took place during the 14th c. The trigger was undoubtedly the Middle Low German (Platdeutsch) spoken by the merchants of the Hanseatic League. So strong was their influence, that it had to be stated in the law, that a majority of the members of a Swedish city council must be Swedes.

    Being sharp businessmen the Hansa traders spoke (old) Swedish with their partners but then of course a poor variety mixed with Low German words, mismatched inflections and alien phonetics. This in turn was imitated by young generations of wannabees, who fancied the elite and thought of its distorted Swedish as the, should one say, “cool” way to speak.

    A linguistic thunderstorm swept over Scandinavia: a countless number of loanwords were adopted, new pre- and suffixes pushed old ones aside, the inflection system collapsed etc. As a result modern mainland Scandinavians can hardly read their ancestors’ rune stones, while the Icelanders face less problems in doing so (Icelandic was practically unaffected by the Hansa language).

    Swedish author August Strindberg (1849-1912) in his novel Röda rummet (The Red Room) lets one of his characters cry out in anger to the Scandinavian upper classes: Your language is nothing but Platdeutsch in twelve dialects!. He did so in response to the chauvinist ideas of the 19th c., expressed e.g. by poet Esaias Tegnér (1782-1846) in his Språken (The Languages). In this poem he makes comments upon a bunch of European languages and despises them one after another (English, by the way, he called a language for the stuttering made). Finally he reaches Swedish with the following two lines as the poem’s climax:

    Ärans och hjältarnas språk,
    Hur ädelt och manligt Du rör Dig!

    (Language of honour and heroes,
    How noble and virile Thou move!)

    Now it’s a joke among Swedish linguists, that those two bombastic lines don’t contain one single inherited main word – they’re all Low German loans!

    Erh…, now I see, that my poor attempt to present an English translation ended up the same way – with no Englisc main words. They’re all in the tongue of, what Sir Cedric in Ivanhoe called, the “Norman dogs” (for the purpose I deliberately choose “virile” instead of “manly”).

    Thus both English and (east) Scandinavian bear witness of one feature: a non-native and popular elite speaking to their subjects in the latters’ own language is likely to change it dramatically.

    From the same (or rather reverse) reason no radical changes occur in modern Swedish. The 15-20 percent or so 1st or 2nd generation immigrants in the population don’t form an elite and thus have no real impact. Neither is there any substance in the frequently heard complaints about the English/US influence. It’s a mild zephyr compared to the Low German hurricane of the 14th c. – simply because the Hollywood and Coca Cola people in spite of being fancied never even attempt to speak the “language of honour and heroes” (and thus don’t affect it). It may therefore look like a paradox, but the best way to protect Swedish is to teach all citizens English.

    Finally, for the sake of justice, I ought to mention, that modern Swedish of course is a north Gmc language and not a west ditto: Of the 100 most frequently used words in everyday speech or writing 94 are directly inherited from old Norse and only 6 are (Low German) loans. “Most frequent” in a language are however words like: and, but, I, you, in, on etc. – i.e. “functional” words rather than “substantial”.

  22. David Eddyshaw says:

    @Hans: No, I wasn’t familiar with this. Thanks for the link. It’s an ingenious idea – he basically seems to posit a sort of death by a thousand cuts of British communities under English rule as impoverished individuals move out one by one into overwhelmingly English-speaking environments and their language vanishes person by person. I’m not altogether persuaded, but given the extraordinary mismatch between the genetics and the archaeology on the one hand and the linguistics on the other, it seems a more plausible explanation than many. On the other hand, I can’t see how it could work unless Brythonic speakers were a minority already, and how that came about is the very thing that needs explaining in the first place.

    Interesting to see him picking up on the time gap issue. (It’s not just me, then…) It hadn’t previously occurred to me that the do-periphrasis thing etc, as features (in fact) of *late* Middle English and Modern English, might be ascribed to widespread immigration of Welsh speakers etc into England *then*! I can think of all sorts of objections to this (to say it yet again, Welsh doesn’t *have* a distinctive progressive) but the idea is certainly interesting.

  23. Stefan,
    “My guess is that Old→Middle→Modern English is in all essentials due to the Norman invasion. ”

    The progressive sure isn’t. In fact most of the English verb system isn’t, except for the substitution of -s for -th in the third present.

    Y, yes! That’s her name. It was a different paper but probably the same thesis.

    David, do you think the development of the progressive looks like a Sprachbund phenomenon? It’s not unidirectional borrowing except in some Scotch-Irish influenced varieties of AmE, where you do see the progressive edging the simple present out, e.g “What are you wanting to do?” and “Don’t be doing that.” or “Don’t go doing that.”

    ” Old Irish, for example, shows traces of an older SOV order, and many of the peculiar features shared by Goidelic and Brythonic, like the system of “mutation” of initial consonants, cannot go back to a shared protolanguage but must have arisen separately in each branch,”

    Old Irish and its development is a black hole. It’s not at all clear when the language came to dominate the entire island or what the historical process was, or what the mix was, or how the various lects influenced each other or to what extent.

    ” as it can’t go back to the protolanguage it presumably shows that there must have been a great deal of intimate language contact in Britain itself between Goidelic and Brythonic after they split, which is not too easy to understand historically.)”

    How hard can it be? The geographical separation is tiny. In Patrick’s time there was continual raiding along the west coast and this may have always been an issue, even during the Roman presence, which was not all that present along the west coast.

    ” but at the very least it would seem odd that so many of the Britons in Roman times had identifiably Celtic names if they were all running off home to speak proto-Old English to each other.”

    This presumes we are talking about the same people. It would be like denying that anyone in Transylvania spoke a Romance language because all the attested names of aristocrats were Hungarian. After all, which local people are likely to have been the ones interacting with the Romans?

    “It was jolly clever of all those Old English scribes to maintain a four-case system in their written English for all those centuries when their spoken language lacked case altogether … ”

    Italian and French scribes managed to be just as clever for centuries.

    This issue is never going to be resolved until someone invents time travel. The trail is too cold and the evidence is either gone or hopelessly contaminated.

    It would be a breath of fresh air if people – and preferably non-native! – were willing to look at these languages with the same dispassionate objectivity that they look at American languages, because then they might begin to see a lot that had been invisible before. In particular I would bet they would find that people have been overlooking a lot of shared lexicon, but that’s just a guess.

  24. David Eddyshaw says:


    The -s of the 3rd sg in verbs is a Northern dialect thing (of deeply obscure origin) which has elbowed out -th in standard English only since Shakespeare’s time. It can’t possibly be due to Norman influence. You must have meant something different and I’ve misunderstood.

    Certainly Welsh and English are part of a Sprachbund. The modern Welsh verbal system really does share a lot structurally with English, despite what I’ve been saying. Ironically it seems most strikingly different exactly where the devotees of substrata claim that it has influenced English …

    True enough about Transylvania. On the other hand, even given the extreme sparseness of classical sources on Britain, you’d think someone would have remarked on it if all the posh people spoke a totally different language from the great unwashed. Still, that’s probably the least serious objection to the idea that people spoke English in Roman Britain.

    The maintenance of the case system in Latin by French or Italian scribes is not really parallel to the supposed maintenance of the four-case Old English system by supposedly aptotic scribes. Writers in Latin were consciously writing in a language with a settled grammatical tradition (including aggressive peevery about “corruptions”.) There are contemporary OE grammars, admittedly, but it seems a bit of a stretch to suppose that written OE stood in a similar relation to the scribes’ spoken language as Latin to Italian. As a matter of fact, OE scribes *did* increasingly make mistakes, as the case endings lost their distinctive vowel qualities. This makes perfect sense if the endings all began to sound identical in their speech, but makes no sense at all if all along they had trouble with case because it wasn’t a feature of the spoken language anyhow.

  25. Re Brythonic substrates; what are the earliest attestations of the periphrastic forms in the Celtic languages?

  26. Here’s a more recent survey of work on Celtic substrates.
    I am no expert on any of this, but impressionistically I’ll say that the Celtic substratum evidence, state of the art though it may be, looks kind of meager; but it’s good that it’s getting serious attention. The standing explanation for why English grammar looks different from the rest of Germanic is, roughly, “just because”, or “other Germanic languages are doing it too”, which I don’t find convincing either.

    BTW, McWhorter wrote some years ago a paper called “What happened to English?” in which he advanced with much detail an argument that English got so different in the aftermath of incomplete shift of Norse speakers to English. In a later paper, “What else happened to English? A brief on the Celtic hypothesis” he accepts Celtic substrate as a source for periphrastic do. I haven’t studied the papers in detail. The later paper, from 2009, appeared in a special issue of English Language and Linguistics devoted to the Celtic hypothesis.

  27. David Eddyshaw says:


    Old Welsh is preserved only in a few fragments, so it’s hard to say much about that; Middle Welsh is said to begin about 1100, but the issue is confused by the fact that a lot of material which must have been composed earlier survives only as reworked by successive scribal copying and “emendation” in Middle Welsh guise, with fewer or more archaisms thrown in.

    Having actually got up and consulted a Middle Welsh grammar, I see to my shame that I have been guilty of a fairly crucial misstatement:

    As I said, *Modern* Welsh uses the periphrasis with “be” for simple present and continuous; the old synthetic present is now future.


    In *Middle* Welsh the simple synthetic present can be used for simple present, continuous present or future; however (this is where I’ve been lying to you all) the periphrasis with “be” is really only used for continuous present (and sometimes the sense “has been X-ing”.) The facts are similar for the synthetic imperfect versus the periphrasis with the imperfect of “bod”.

    So Middle Welsh can (though it doesn’t always) make an aspectual distinction not found in Modern Welsh, and I was WRONG. WRONG.

    Passing along quickly, both Modern and Middle Welsh have a construction with “be” + wedi “after” + verbal noun with perfect sense:

    Mae ef wedi canu “he has sung” (he is after singing)

    which doesn’t have a (standard) English analogue.

    Lastly, Middle Welsh is also fond of periphrases with the “do” verb gwneuthur + verbal noun in what is confusingly called the “abnormal order” (it’s abnormal only from the standpoint of Modern Welsh; in Middle Welsh it’s essentially unmarked.) The construction is commonest in narrative, with the verb in the preterite:

    Mynet a oruc Padric y Iwerdon “Go + relative + did Patrick to Ireland” = “Patrick went to Ireland.”

    It doesn’t have the emphatic sense of modern English positive do-periphrasis, i.e. this doesn’t mean “Patrick *did* go to Ireland.”

    I don’t know much about Old Irish syntax (I get the impression that I am not alone.) For what little my opinion is worth there doesn’t seem to have been anything much like the Welsh (or English) periphrastic forms with “be” or “do” in that phase of the language.

  28. David Eddyshaw says:

    @Y: thanks yet again.

    I’ve got as far as page one and seen

    “No records speak of a battle in which the Germanic invaders were victorious over the Celts [in the 5-6 centuries]”

    Not in English maybe, boyo:

    I shall, however, read on, chastened by my errors.

  29. David Eddyshaw says:

    Not before time remembering that Welsh is not all there is of Brythonic, I discover that modern Breton has a continuous present

    Emaon o vont “I’m going”

    where emaon is the “punctual” (sic) aspect of the “be” verb and “o” is said to be a particle which combines with the following verbal noun to make a “participle.”

    This is from the Mouton Grammar Library book on Breton by Ian Press, which dates from an age when the descriptions were a good bit sketchier, and the author doesn’t give any clear contrasts with a simple present; moreover one of his other examples is

    emaon o chom “I reside”

    which is not continuous in the same (progressive) sense.

    However taking this as given, and given that this can hardly be due to French influence, it does suggest a more venerable date for periphrastic continuous present constructions in Brythonic. But then, the construction is actually different from the Welsh, and the development is one which is cross-linguistically common enough anyway, as Y’s latest linked paper says.

    There’s also a fair bit about the “do” verb, ober, as auxiliary

    Labourat a raen gant va mamm er parkeier “I used to work in the field with my mother”

    which is very similar to the Middle Welsh construction; and the book specifically gives

    Gwelout a ra Yann e vignonez “Yann sees his girlfriend” (See + relative + does Yann his girlfriend)

    as the best candidate for an unmarked construction, which is just like the so-called Abnormal Order of Middle Welsh. Given that the refugees from Great Britain to Brittany left well before the Middle Welsh period, this is interesting, and suggests that at least this construction is pretty old.

  30. Fascinating stuff, and brings back my Middle Welsh class of forty years ago! (“Pwyll Pendeuic Dyuet a oed yn arglwyd ar seith cantref Dyuet…”)

  31. David Eddyshaw says:

    Continuing my tour of Brythonia, Wella Brown’s “Grammar of Modern Cornish” has the same “Abnormal Order” construction:

    an diogyon a werth leth “the farmers sell milk” (the farmers + relative + sell + milk; Insular Celtic is VSO)

    and he specifically identifies this as the least marked construction.

    He has also

    Gwertha leth a vynn an diogyon “The farmers will sell milk” (Selling [of] milk + relative + wish + the + farmers)

    but says this emphasizes the selling of the milk, and there seems to be nothing like the unmarked verbal noun + relative + aux of Middle Welsh or Modern Breton.

    Presumably all this is based on reconstruction from Middle Cornish, but I’ve no idea how faithfully it reflects the way the language then worked. I imagine even then it was moreover exposed to a lot of English influence, vastly more than the Welsh of that time, anyhow.

  32. David Eddyshaw says:

    In his Modern Cornish, Brown explicitly equates the construction “be” + orth “at” + verbal noun with the English continuous present in contrast with the synthetic present which he equates with the English plain present:

    Y’n varghas ma y pren ow gwreg hy losow “In this market my wife buys her vegetables”


    Y’n varghas ma yma hi ow prena hy losow “In this market my wife is buying her vegetables” (His translation, surely wrong: it must mean “In this market she is buying her vegetables”; the “ow” is a form of “orth” in this sentence, unlike the previous one, in which it is the word for “my”.)

    Again it’s hard to know what to make of this, as the language is a scholarly construct on the basis of a living speech which was presumably already very much influenced by English.

  33. David Marjanović says:

    If Old English was a dead written register, why are there distinct regional dialects of it?

    some form of “do” periphrasis is very well established in almost all Germanic dialects, including Dutch dialects (although highly stigmatized in standard Hochdeutsch).

    In the kinds of German I’m familiar with, its functions are completely different from the English ones, though:

    1) To fill both of the verb slots in the unmarked word order for simple declarative sentences: finite verb form in the 2nd place, infinite verb form in the last place.
    2) Maybe to avoid vaguely rare inflected forms, though maybe that’s just the attraction of 1) shining through again. I hear that happens in English in the past tense.
    3) To make a topic-and-comment sentence by extracting the topic out of the finite verb form and moving it to the beginning of the sentence, while the disembodied inflection stays behind in the 2nd place and is embodied by “do”.

    An important subset of 2) are prefixed verbs whose prefix is a noun. Those are backformed from compound nouns: “emergency landing” is Notlandung, and “to perform an emergency landing” is notlanden. Such prefixes are – get this – neither separable nor inseparable. One does not simply render “so, when are we gonna have our emergency landing?” as *wann landen wir Not or as *wann notlanden wir; the only way to say this is wann tun wir notlanden. Consequently, these verbs as a class are avoided in writing, except when their infinite forms can occur without “do”.

    Old Irish, for example, shows traces of an older SOV order

    Then maybe VSO comes from the Phoenicians?

    but at the very least it would seem odd that so many of the Britons in Roman times had identifiably Celtic names if they were all running off home to speak proto-Old English to each other.

    What about place names? Is there a part of England that lacks Celtic place names?

    Gwelout a ra Yann e vignonez

    Sa mignonne ! ^_^

  34. David Eddyshaw says:

    It occurs to me that the particle “o” in Breton in

    emaon o vont “I’m going”

    which in the glossary of Press’s book is given as a form of ouzh “against, at”

    must be the same etymon as the Cornish orth/ow in

    yma hi ow prena “she’s buying”

    so it’s possible that this construction is older than the split of Cornish and Breton; but the word corresponds to Welsh “wrth”, not to the “yn” which is used in the Welsh periphrastic forms with “be.” [Cornish and Breton are in general closer to each other than Welsh, unsurprisingly.] So the construction is at any rate not a straightforward inheritance from any common Brythonic protolanguage. However, according to Simon Evans’s Middle Welsh grammar the changes which transformed “British” into distinct Welsh, Cornish, and Breton date to about 450-600, so the different periphrastic “be” constructions must have arisen after that, but that’s still comfortably early enough for the believers in the Celtic-to-English spread of the feature. There would just have been more than one possible model for the aspect-challenged Saxon to copy.

  35. David Marjanović says:

    the only way to say this

    Uh, actually, the future tense – wann werden wir notlanden – would solve the problem in this particular case. I find it downright unidiomatic here, though; and apparently I’m not the only one, because I didn’t come up with the notlanden example myself (I just can’t remember my source, as usual, grmbl).

  36. David Eddyshaw says:

    @David Marjanović:

    Outside the areas where Celtic languages uncontroversially persisted long after the advent of the Anglosaxon menace (like Scotland and Wales and Cornwall) UK place names are overwhelmingly Germanic, though there are certainly exceptions, notably many river names, and of course the big cities of Roman times, like Llundain and Efrog. (We preserve their true names against that day when we finally drive the Englishman back to Jutland, where he belongs.)

  37. One does not simply render “so, when are we gonna have our emergency landing?” as *wann landen wir Not or as *wann notlanden wir; the only way to say this is wann tun wir notlanden.

    Since these verbs seem to be more or less not worth having, why not do without them and say “Wann machen wir eine Notlandung?” (Disclaimer: I don’t really speak German.)

  38. What happened to English?

    100 AD – Britons learn Latin
    500 AD – Romano-British learn Anglo-Saxon
    900 AD – Anglo-Saxons learn Norse
    1100 AD – Anglo-Saxons learn Norman French
    1500 AD – British elite forgets French and learns English

  39. 1900 – English learn American

  40. In modern Welsh the same form is used for both

    That’s what you expect when there are two forms meaning exactly the same thing: either they become semantically differentiated (like “I go” vs. “I do go”, which meant the same in Early Modern English), or one is dropped (like the preterit in spoken French or German). The progressive forms in English actually underwent both changes depending on the semantics of the verb: “I sing” and “I am singing” are semantically different, but “I want” is correct whereas “*I am wanting” is not (except in Indian English).

    Vaupes in Amazonia

    And still more in the famous (to linguists) village of Kupwar on the Maharashtra-Karnataka border in India, where Marathi, Kannada, and Urdu varieties were all spoken with a completely merged morphosyntax, intertranslatable word by word and morpheme by morpheme, but with three completely separate vocabularies. (Recent studies show that this situation is now breaking down, with dialect and language shift toward standard Marathi.) Here’s an example of Marathi/Kannada direct matching from Gumperz & Wilson’s original 1971 study:

    Kannada: hog-  i       wənd  kudri  turg   maR-  i       aw     tənd
    Tags:    verb  suff.   adj.  noun   noun   verb  suff.   pron.  verb
    Marathi: ja-   un      ek    ghoRa  cori   kar-  un      tew    anla
    Gloss:   go    having  one   horse  theft  take  having  he     brought
    English: Having gone and having stolen a horse, he brought it back.

    Unfortunately, I don’t have standard Marathi and Kannada sentences for comparison.

    jolly clever of all those Old English scribes

    No worse than all those French, Latin, Spanish, etc. scribes being able to write Latin with its five cases despite speaking caseless languages.

    Ancient Egyptian had been spoken in Britain

    Indeed, there are a lot of Semitic/Afroasiatic parallels with Celtic typologically, which has given people to wonder about voyages from the Eastern Mediterranean to the Tin Isles ever since Morris-Jones. Recent survey paper on the idea.

  41. initial mutation thing really is strange enough

    Not so strange phonologically, just intervoGallic lenizhon. If anything is strange, it is the grammatical use then made of this phonological change.

    OE scribes *did* increasingly make mistakes

    Well, so did Leonese scribes in the 10C, whose Latin is so debased that it can be read off as Ibero-Romance, or even modern Spanish, with a very basic understanding of Latin case endings.

  42. J. W. Brewer says:

    I think we are at present as confident as we can be of anything at that time depth (and without written records) that whenever the first proto-Celtic speakers arrived in Great Britain (and the timing of that development is itself disputed) there was a substantial pre-existing human population (the descendants of whom, the genetic record suggests, eventually shifted to Celtic rather than being destroyed by massacre or epidemic) that spoke a non-Celtic language (or languages) that was/were *probably* non-IE but is/are otherwise entirely lost to us (if we go with the now dominant theory that Pictish was Celtic rather than a pre-IE survival). So one could expect substrate effects from that/those unknown language(s), which well could have resembled “Hamito-Semitic” in various typological ways without necessarily being genetically related.

  43. David Eddyshaw says:

    @John Cowan:

    “If anything is strange, it is the grammatical use then made of this phonological change.”

    Well, yes, that’s exactly what I mean. The phonological processes underlying the “mutations” are straightforward and common (but, note, *different* in Goidelic and Brythonic); but their persistence after the conditioning word final sounds had been lost, and subsequent grammaticalisation, are pretty unusual. The only other example that occurs to me offhand is in the Southwest Mande languages, where it’s a lot simpler than in Insular Celtic. No doubt there are other examples, but it’s hardly common in the world’s languages. It really cannot be coincidence that Goidelic and Brythonic both have this, and it cannot be an inherited feature, so the “idea” of it must have spread even though the details of how it works in the two branches are incommensurable.

  44. And French, where we have one n-ami, two z-ami, et cetera.

  45. David Eddyshaw says:

    Although similar phenomena involving tone are quite common: in many tone languages words can end in tonal features which become evident only in their effect on the tones of the following word, and such features are certainly grammaticalised in some cases.

    And I suppose the famous “final features” in Numic languages (Sapir on Southern Paiute is the locus classicus) are a bit similar, though they are features of individual morphemes and not grammaticalised independently IIRC.

  46. David Eddyshaw says:

    @minus273: yes, I suppose that is a small-scale example of a similar development.

    Talking of liaison, I’ve always been impressed by how standard modern French has two absolutely identical silent h’s, with different sandhi.

  47. David Eddyshaw says:

    … which, now I think of it, is another good example of the effect of inter-word phonological change surviving the total loss of the original conditioning factor. It’s just the other way round from Insular Celtic, with the original initial distinctions in the following word being lost instead of the original final distinctions in the preceding word.

  48. Indeed. I think we think of initial mutation (as distinct from final mutation) as being bizarre partly because it violates what Tolkien in “English and Welsh” called our Germanic feeling:

    Though it may be noted that many of the things that strike the modern Saxon as insuperably odd and difficult about Welsh have no importance for the days of the first contacts of British and English speech. Chief among these are, I suppose, the alteration [sic; alternation?] of the initial consonants of words (which revolts his Germanic feeling for the initial sound of a word as a prime feature of its identity), and the sounds of ll (voiceless l) and ch (voiceless back spirant). But the consonant alterations are due to a grammatical use of the results of a phonetic process (soft mutation or lenition) that was probably only just beginning in the days of Vortigern. Old English possessed both a voiceless l and the voiceless back spirant ch. [footnote 12]

    Anglophone linguists, of course, are not immune to this, and it also interferes with our habits of alphabetizing words.

    It’s not clear whether Tolkien actually believed that Old English hl (generally thought to be an approximant) and Welsh ll (definitely a fricative) are the same sound, or whether he was simplifying for the sake of his audience; the paper is not a technical one.

  49. David Eddyshaw says:

    Thinking about it more, I’m not totally sure that the mutation thing couldn’t possibly be projected back to the time before Insular Celtic split into Goidelic and Brythonic.

    On the face of it, it’s difficult to square the fact that (for example) original postvocalic stops are voiced in Brythonic but turn into voiceless spirants in Irish, but one could imagine that the relevant change in the protolanguage was simply lenition rather than either voicing or spirantization, and this then developed differently in the two branches; while original voiceless geminates would remain (or become) fortis, and only after the breakup into Brythonic and Goidelic would the postvocalic fortis stops become spirants.

    The actual grammaticalisation of the “mutations” plainly belongs in any case to the later development of the individual languages; in Old Irish to a great extent the system simply works as if the preceding word-final vowels and so forth were still there, and is just beginning to be transformed by analogy; in Middle Welsh the process is already much farther along, but you can trace its further extension within the language as time goes by (so that, for example modern spoken Welsh now basically has a rule that the initial consonant of the word directly following the subject of a finite verb gets the “soft mutation”, which has resulted from massive generalisation from the frequent instances where this would have followed from the original phonologically motivated rule, but is now completely inexplicable in such terms.)

    So the only thing which would have to be projected back to common Insular Celtic would be an unusually marked tendency to treat entire phrases as single phonological units. In fact, given that the loss of original final vowels certainly postdates the Goidelic/Brythonic split, maybe even my speculation about “lenition” is redundant.

    Quite how in proper linguistics one could formalise “an unusually marked tendency to treat entire phrases as single phonological units”, I don’t know, though. Perhaps my forbears were just prone to gabbling.

  50. David Eddyshaw says:

    I meant:

    “….and only after the breakup into Brythonic and Goidelic would the postvocalic fortis stops become spirants IN BRYTHONIC.”

    In conclusion, the mutation system of the Insular Celtic is the inevitable consequence of the Gift of the Blarney and Welsh Windbaggery; it is not a feature of English because the poor Saxon is a tongue-tied fellow who sits moping in the corner counting his change while we get on with the craic.

  51. Just to complicate matters, Peter Schrijver, Celtic influence on Old English: phonological and phonetic evidence (English Lang. Ling. 13.2:193–211, 2009) argues, on phonological grounds, that the Celtic substratum was more like Old Irish than Brythonic. The scenario he describes is,

    The original language of the shifting population can now be identified as a variety of Celtic which was ancestral to Old Irish. Old Irish is the product of a linguistic colonization of Ireland which had its roots in a variety of British Celtic spoken in Britain sometime in the late first or early second century AD. Descendants of the speakers of this variety of Celtic who stayed behind in Britain eventually became Roman citizens. Some of them, especially those living in the eastern and southern Lowland Zone and being part of a highly Romanized society, stood a good chance of becoming bilingual Celtic-Latin speakers. Some ultimately shifted to Latin altogether. Given the evidence for phonetics surviving multiple language shifts, these new Latin speakers to a lesser or greater degree held on to their particular Celtic phonetics. The phonological history of Old English indicates that these, either Celtic speakers or Latin speakers with a strong Celtic accent, were the people whom the first Anglo-Saxon settlers met and who left a characteristic trace in Old English phonology and phonetics.

    Ultimately, there is enough leeway to allow one variety of British Latin to influence British Highland Celtic and quite another to influence Old English. Out of a multitude of possibilities, one might arbitrarily select one according to which it was city folk and the rural elite that fled to the Highland Zone, because they had the means to do so or because the collapse of towns left them no alternative. Their comparatively upper-class Latin would then have influenced Highland British Celtic, which became Welsh, Cornish, and Breton. The rural poor, small farmers and agricultural laborers, may have stayed on, hoping to strike a deal with the new powers, and in so far as they succeeded, they would have imported into Old English a comparatively lower-class Latin accent, phonetically similar to the original Celtic.

  52. David Eddyshaw says:

    I would say this is somewhat less securely based than even my theory about mutation and the Blarney Stone.

  53. I think we can all agree the truth lies somewhere in between.

  54. David Eddyshaw says:

    Just thinking again about initial mutations.

    I can’t see any actual mileage in the idea that Insular Celtic had “an unusually marked tendency to treat entire phrases as single phonological units”; what language doesn’t, really?

    In order to end up with an initial “mutation” system where most initial stops regularly become altered after certain words which used to end with a vowel (say) but no longer do, you must have at least

    (a) an original system where *most* single consonants after vowels have markedly different allophones from elsewhere
    (b) loss of word final vowels (duh)
    (c) the allophones in (a) get reanalysed as different phonemes – but not *too* different, or it’s hard to see how an actual mutation *system* could survive, with no possible psychological connexion between the mutated and plain consonants; similarly, there can’t be too much falling together of the reflexes of originally distinct consonants.
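
    As a rough illustration of how (a), (b) and (c) interact, here is a minimal Python sketch. Everything in it is an assumption made up for the purpose: the words “ana”, “an” and “tala” are invented, not real Celtic forms, and the lenition table covers only three voiceless stops rather than *most* consonants.

```python
# Toy model of conditions (a)-(c). The word forms and the lenition table
# are invented for illustration; they are not real Celtic data.

VOWELS = set("aeiou")

# (a) hypothetical postvocalic allophones of three stops
LENITION = {"p": "b", "t": "d", "k": "g"}


def lenite_in_phrase(words):
    """(a) Lenite a word-initial stop when the preceding word ends in a vowel."""
    out = [words[0]]
    for prev, word in zip(words, words[1:]):
        if prev[-1] in VOWELS and word[0] in LENITION:
            word = LENITION[word[0]] + word[1:]
        out.append(word)
    return out


def drop_final_vowel(word):
    """(b) Loss of a word-final vowel."""
    return word[:-1] if len(word) > 1 and word[-1] in VOWELS else word


def evolve(words):
    """Apply (a), then (b), to a phrase."""
    return [drop_final_vowel(w) for w in lenite_in_phrase(words)]


# Two invented particles, one vowel-final and one consonant-final,
# before the same noun:
after_vowel = evolve(["ana", "tala"])      # ['an', 'dal']
after_consonant = evolve(["an", "tala"])   # ['an', 'tal']
```

    After (b) the two particles are homophonous, so the only remaining difference between the phrases is t versus d on the noun. That is the point at which (c) applies: the former allophones have to be reanalysed as distinct phonemes, and the alternation is carried by the particle’s grammar rather than by its sound.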

    Spanish, for example, has (a), even between words, but only for voiced stops, and for the most part has not undergone (b); Masoretic Hebrew has a complicated system like (a), including voiceless stops, but never lost the conditioning final vowels either, and in any case in the Tiberian system itself the opposition between stops and the corresponding spirants is only just barely contrastive (mostly because of complications with “silent” schwas.)
    French has happily done violence to postvocalic stops of all sorts all over the place, and has largely lost original final vowels, but the smooshing of postvocalic consonants is so profound that if it were carried out word-initially mayhem would ensue.
    Germanic voiced stops had spirant allophones after vowels, but these have mostly not separated out as separate phonemes, and there’s nothing similar for other consonants.

    Perhaps there actually just aren’t all that many cases where (a) (b) and (c) have all been fulfilled, which would then help explain the rarity of full-fledged grammatical initial mutation systems in the world’s languages.

    So the distinctive peculiarity of Insular Celtic might have been just that its consonants showed a lot of systematic unusually marked allophony. I can imagine this could be formalised as something which could antedate the Brythonic/Goidelic split, and it’s at least something that undoubtedly does vary a lot between languages.

  55. David Marjanović says:

    I’ll need to write more later.

    or one is dropped (like the preterit in spoken French or German)

    In before Stu: that’s specifically Upper German, with a bundle of isoglosses at the White-Sausage Equator where, as you go from north to south, one verb after another loses its passé simple.

    I can’t see any actual mileage in the idea that Insular Celtic had “an unusually marked tendency to treat entire phrases as single phonological units”; what language doesn’t, really?

    German. 🙂 From Early Modern, if not Middle, onwards, it has had an unusually strong tendency to emphasize phonological words (including the components of compound nouns) over both phrases and syllables.

    My dialect turns /b/ between vowels into /v/. This seems to be a rather recent development*, and I think it’s even somewhat optional in that retaining /b/ between vowels, though rarely heard, doesn’t sound outright wrong. At the end of a word, /b/ does turn into /v/ if a clitic or perhaps any personal pronoun that begins with a vowel follows; if another vowel-initial word follows instead, /v/ sounds wrong to me after trying it a few times.

    * It must have happened after at least one round of apocope that is not shared with Standard German.

  56. the poor Saxon is a tongue-tied fellow

    Maybe in the last century or two; that was certainly not the case in early modern times, when verbal exuberance even on one’s deathbed was the mark of a proper Englishman (e.g. “No: ’tis not so deepe as a well, nor so wide as a Church doore, but ’tis inough, ’twill serve: aske for me to morrow, and you shall find me a grave man”). Though I grant you that even then the Welsh had the better of it: “If the Enemie is an Asse and a Foole, and a prating Coxcombe; is it meet, thinke you, that wee should also, looke you, be an Asse and a Foole, and a prating Coxcombe, in your owne conscience now?” It’s said that this “think you” and “look you” gave the Welsh time to think of “the English for a thing”.

    while we get on with the craic

    A word borrowed, by the way, from English/Scots crack. The OED has citations from Burns, Thoreau, and various now-obscure English authors of the 19C and 20C. It doesn’t get into Irish until the mid-20C.

  57. David Eddyshaw says:

    @David Marjanović:

    You’re right, certainly, that languages do vary in the degree to which phrases are treated as single phonological units. Nobody would say that French and German are the same on this measure, for example. And I suppose this shades into the whole question of just how you define “word”, as the morphological and phonological evidence may not coincide.

    Perhaps I was too hasty in dismissing this.

  58. Great point! In French too, the initial consonant insertion must have a link with the language’s prosody, in which word boundary is definitely dwarfed before the boundary of longer prosodic phrases.

  59. I’ve told this story before: It was the inability to hear word boundaries that caused an acquaintance of mine, who’d gotten all As in German, to flunk out of Spanish at DLI in 1969.

  60. David Eddyshaw says:

    By sheer cosmic coincidence I just came across this fragment of the writings of Cato the Elder:

    “pleraque Gallia duas res industriosissime persequitur, rem militarem et argute loqui”

    “Most of Gaul chases after two things particularly earnestly: military matters and witty talk”

    Mutation is bound to follow.

  61. David Marjanović says:

    I’ve told this story before: It was the inability to hear word boundaries that caused an acquaintance of mine, who’d gotten all As in German, to flunk out of Spanish at DLI in 1969.


  62. David Marjanović says:

    Outside the areas where Celtic languages uncontroversially persisted long after the advent of the Anglosaxon menace (like Scotland and Wales and Cornwall) UK place names are overwhelmingly Germanic

    Sure; is there an area where Celtic names are completely absent?

    (Probably not, because that would probably be a famous argument by now, but I don’t know.)

    Since these verbs seem to be more or less not worth having, why not do without them and say “Wann machen wir eine Notlandung?” (Disclaimer: I don’t really speak German.)

    But what if someone cuts you in traffic and ought to be emergency-slaughtered (like livestock with a fatal injury or incurable disease)? Den sollte man notschlachten

  63. Just a data point – I grew up in the Rhine area and Northern Germany, and for me it’s always eine Notlandung machen, I wouldn’t use tun here. On the other hand, using tun + infinitive was a mark of colloquial speech and very frequent in Ostfriesland, where I went to elementary school (in my memory, this was almost the unmarked way to form the present tense among kids my age); every time our teachers would hear someone say er tut schreiben (“he does write”), they’d correct it and remark “tuten tut die Feuerwehr” (“tooting is what the fire brigade does”). As I don’t have frequent contact with elementary school children, I don’t know whether they still do that and whether this construction also is used in other parts of Germany; I normally don’t hear it from adults, perhaps because school has managed to suppress it among the type of professional, well-educated Germans I normally converse with.

  64. David Eddyshaw says:

    Interesting paper, and certainly germane to the question.

    It’s not very clear to me just what the distinction between “word” languages and “syllable” languages is in his scheme, though. He mentions inter-word sandhi phenomena like retroflexion and r-loss in Swedish, but most of what he talks about is loss of quality distinctions in unstressed syllables (incidentally what he says about Gothic in this regard is quite wrong; the short e and o of Gothic stressed syllables are evidently just allophones of i and u.) Mysteriously he tags Danish rather than English as the Germanic language farthest along this path; and makes what I would have thought is a surprising suggestion that Swedish has resisted this because of conservative spelling, which would be an unusual phenomenon.

    It isn’t hard to think of languages with little in the way of obvious word breaks that don’t reduce unstressed syllables: the Spanish that foxed Rodger C’s friend comes to mind, versus the German that didn’t.

    In context of Insular Celtic, Brythonic and Irish are as different as can be on this axis: Brythonic lost all final syllables altogether but pretty well preserved the vowels of others, while Irish initial word stress caused vast changes in non-initial vowels (leading to an Old Irish verbal system of well-nigh Athabaskan opacity.)

  65. David Eddyshaw says:

    Scratch my stupid example of Spanish vs German, which of course supports rather than rebuts the author’s thesis. Perhaps it’s harder than I thought. Czech vs Cantonese?

  66. David Eddyshaw says:

    I suppose that a language with a strong word stress, of a degree to which the loss of vowel quality distinctions in unstressed syllables may be attributed, will ex hypothesi have a strong word stress which will then aid the listener in identifying individual words. So you would expect a correlation. (Especially if it’s typically on the first syllable.)

    The test cases are going to be languages with a strong word stress but little reduction in distinction in unstressed syllables. Are they normally languages in which word boundaries are fairly difficult to hear, or with extensive inter-word sandhi? More so than languages (like French) with little or no word stress?

    I can easily think of languages like that (Hungarian, Czech …) but I can’t offhand think of one I’m familiar enough with to have much idea of the difficulty of identifying single words.

  67. David Eddyshaw says:

    I could have been rather clearer: what I’m implying is that it’s the existence of word-level stress, rather than any secondary phenomenon like loss of vowel quality distinctions in unstressed syllables, which would mark a language as word-centric rather than syllable-centric; and one would be encouraged in thinking that this is a meaningful distinction if languages with word-marking stress also tended to show less inter-word sandhi phenomena.

    If the absence of word-marking stress tends to go with a blurring of word margins more generally, then it may be significant that the position of the stress was completely different in old Goidelic (initial) and Brythonic (penult, before the loss of final vowels) which perhaps suggests that the common protolanguage did not itself use stress as a word marker. (This would also go with the extensive cliticisation of personal pronouns to verbs and, more remarkably, prepositions found in all Insular Celtic, come to think of it.)

    My object in all of this is to try to think of what Goidelic and Brythonic might possibly have inherited from their common parent which could have led each branch independently to have developed their very unusual, complex, systematic, parallel but incompatible mutation systems. Even making all possible allowances for the fact that languages don’t in reality “split” cleanly in orthodox Junggrammatiker style, and supposing a lot of early cohabitation of the languages concerned, it seems such an unusual feature that I can’t shake the notion that there must have been something unusual in Insular Celtic before the split that made these developments much more likely. How would you *borrow* a system of initial consonant mutations? And in a way which only makes sense given the previous phonological history of your own language rather than the donor language?

  68. David Eddyshaw says:

    The conjugated prepositions of Insular Celtic [eg Welsh arnaf “on me” arno “on him”], now I think of it, are a strong independent piece of evidence that the protolanguage was a poor respecter of word boundaries.
    They too are unique to Insular Celtic within Indoeuropean (AFAIK) and they too can’t be reconstructed back en bloc to the common ancestral language but must have arisen independently in Goidelic and Brythonic.

    I was slow to think of this because I thought of conjugated prepositions as something that tends to fall out of VSO typology anyway (as in Afroasiatic), but that doesn’t account for the fusion of the elements, and it is not closely parallel to the possessive construction for nouns, as it is in Semitic, say.

    Back to the Blarney.

  69. They have a marginal existence in Latin me:cum, te:cum, se:cum ‘with me/you/himself’. In Spanish the inherited forms were prefixed with semantically redundant con- to produce comigo > conmigo, contigo, consigo.

  70. Trond Engen says:

    Interesting, but I’m not sure he hits the nail. I’ve been partial myself to the hypothesis of vowel balance as a sprachbund effect, but it would be a stronger hypothesis if the effect were strongest in the northernmost dialects. Also, the fact that word endings are retained, reintroduced or even invented through written language is true, but I can’t see that it shows that otherwise words really would be reduced to syllables. And he completely avoids what I thought the paper would be all about, the Central Scandinavian agglutinative verb:

    Å-va-re-a-sa-ræ? “What was it she told you?” = “What did she tell you?”
    Ga-ru-a-n-a? “Gave you her it then?” = “Did you give it to her, then?”
    Jæ-kan-ke-se-no-n-vi-ha-me-sæ. “I can not see some-/anything he will have with him.” = “I can’t see anything he’d bring along.”

  71. David Eddyshaw says:

    True, I’d forgotten mecum etc.

    The Celtic forms are more tightly fused than the Latin, and the phenomenon applies to almost all the common prepositions with enough idiosyncrasies that there are several different “conjugations”, as I’m sure you know.

    I’m beginning to come to the conclusion that the Insular Celtic protolanguage did indeed have a striking lack of phonological marking of word boundaries, and that that does have some bearing on the rise of the mutation systems later; but I think this is still a necessary rather than sufficient condition – must be, or such systems would be commoner.

    There are plenty of languages with little phonological demarcation of morphological words in the world. I suppose that the subset which have gone on to lose word-final vowels consistently is a good bit smaller, though, common though that process is; and of that group I suppose fewer still would have had a striking degree of allophony of consonants depending on whether they followed a vowel or not. Moreover languages with little phonological marking of word boundaries might be exactly those least likely to lose word-final vowels (no word stress to encourage this) and maybe they are also less likely to display the sort of allophony in question in the first place, too.

    Whatever happened, the whole sequence of events must have been a relatively unlikely one, or the world would be full of languages with grammaticalised initial mutation systems. Maybe the rare event was simply that the various initial consonant changes were not simply all promptly levelled away by analogy – and that I can imagine as something that might diffuse across a Sprachbund, unlike a whole mutation system.

  72. David Eddyshaw says:

    It seems, on reflection, very probable that a language where the morphological word is not clearly marked phonologically would be less likely to lose word-final vowels (in fact in a sense that’s almost tautological. How would they “know” they were word-final?)

    To a much lesser degree, it seems plausible on first principles that such a language might also be less likely to show striking allophony in consonants after vowels compared with elsewhere. Most of the languages I can think of of this type tend to have a marked preponderance of open syllables, which in a language where the syllable rather than the word is the basic phonological unit would make such allophony unlikely; per contra, languages with many closed syllables seem likely either to have word stress systems based on syllable heaviness or to have developed their closed syllables in the first place by loss of unstressed vowels.

    If all this is right, it might help to explain why developments like grammatical initial consonant mutation are rare. For it to occur, the protolanguage would not only have to be one where the morphological word was poorly demarcated phonologically, but it would have to be typologically unusual for a language of this sort, with many closed syllables; and it would have to undergo loss of word final vowels, which may be comparatively uncommon for such a language.

  73. David Eddyshaw says:

    As I mentioned, the Southwest Mande languages also have a system of regular initial consonant mutations which arose historically from external sandhi.

    In Zialo (based on Babaev’s grammar from LINCOM)

    dapa gbeyã-gi “the red bag”
    masa kpeyã-gi “the red king”

    Synchronically the language has no closed syllables except as formed by the determiner -y. Historically syllables could end in a nasal, as did the word for “king” above (< *maŋsaŋ) which is (historically) why the word for “red” above keeps its “kp”; the “gb” in the first word reflects the post-vowel reflex. The system is still true to its origins in Zialo inasmuch as it makes sense even synchronically to label lexical items as original nasal-final types or not, but many other syntactic and idiosyncratic factors now affect the initial consonant mutations.

    So Proto-SW Mande would also have deviated from my supposed typical blurry-word-margin language type in having many closed syllables.

  74. David Eddyshaw says:

    Mandinka, which is a more distantly related language and which does not have initial mutation, has a tone system where the tonal domain is the whole word; Zialo and Mende, which are the only SW Mande languages I can find adequate descriptions of at the moment, have tone systems where the domain of tone is the syllable.

  75. David Marjanović says:

    Initial consonant mutation: what about Nivkh?

    tuten tut die Feuerwehr

    I know “tun” tut man nicht (“‘doing’ is not the done thing”), though I’m not sure if I’ve ever heard it from a teacher. It was immediately mocked by the self-referential “tun” tut man nicht tun.

    Article on how Swedish and some kinds of Norwegian have switched back from word languages to syllable languages.


    I thought I had a pdf – probably the one cited as Szczepaniak (2007a), or a review of it – which argues that Old High German was moving towards a syllable language as well, randomly inserting vowels into Proto-Germanic consonant clusters, before that tendency was very strongly reversed again in the largely undocumented transition to Middle High German. Example: swimman ~ sowimman “swim”, modern schwimmen with no trace of the extra vowel.

    It’s not very clear to me just what the distinction between “word” languages and “syllable” languages is in his scheme, though.

    They’re extremes in a continuum; there are word-language features and syllable-language features, and many languages have some of both.

    the position of the stress was completely different in old Goidelic (initial) and Brythonic (penult, before the loss of final vowels) which perhaps suggests that the common protolanguage did not itself use stress as a word marker

    I thought it was textbook wisdom that Proto-Celtic had initial stress? This goes so far that the initial stress of Germanic has been blamed on Celtic influence. Granted, Proto-Celtic isn’t the same as Proto-Insular Celtic, but I don’t think there can be a lot of evidence on the stress of Gaulish, let alone Lepontic or Celtiberian.

    Perhaps the Brythonic penult stress should be blamed on Latin?

    Freak accidents also happen. Polish has shifted its stress from initial to penult – which for most words is the same thing anyway. One long-extinct East Frisian dialect, and I do mean Frisian and not Low German, shifted its stress to the second syllable and then completely lost the vowel of the first syllable, preserving that of the second – leading to marvels recorded as snuh “son” and kma “come”.

  76. David Eddyshaw says:

    @David Marjanović:

    Someone else came up with this business of proto-Celtic having initial stress a while back – I forget who. I’d be interested to know where it comes from. Offhand I can’t think of any likely way anyone would be able to tell.

    Assuming it’s true (for the sake of argument), if Insular Celtic had *lost* initial stress, that might account for the language ending up as a blurry-word-boundary language with an atypical segmental structure.

    Latin stress is different from Brythonic, which was always penultimate.

    My knowledge of Nivkh is … um … limited. Tell me more!

  77. David Eddyshaw says:

    Fulani is a fairly well-known example of regular initial consonant “mutations” (one Pulo, two Fulɓe) but I wouldn’t put it in the same category as the Insular Celtic and the SW Mande, as it’s confined to marking singular vs plural (which it does in a very odd way, different in nouns referring to people on the one hand, and other nouns and verbs on the other).

    Unfortunately the historical origin is not known; on first principles lost prefixes seem a likely candidate. Premodern race-minded linguists and ethnographers got very excited about it, imagining it showed that the Fulani, dominant at that point over much of the Sahel following Usman dan Fodio’s jihad, were more “Hamitic” and hence racially more fitted to rule than their subjects. The idea is, alas, not yet dead, as you can find on Wikipedia if you’re not careful. Ugh.

  78. I don’t think Nivkh counts. The basic process at issue is a kind of lenition-dissimilation in which a morpheme-initial stop is replaced by the corresponding continuant when the morpheme is not initial in its phrase, unless the preceding morpheme ends in a continuant. In some transitive verbs, however, the process seems to run in reverse, with initial continuants undergoing fortition to stops; this grammatical difference has been taken to be parallel to mutation. However, it can be explained (away) by taking the underlying form of these verbs to already contain a stop which is lenited by a now-lost prefix /ɪ/; there are other verbs that continue to have this prefix and are regular.

  79. David Eddyshaw says:

    Ah. Knew I had it somewhere. Ekaterina Gruzdeva’s book on Nivkh does indeed describe wholesale regular initial consonant alternations, but says they are conditioned by the (still present) final segments of the preceding word, in general. There *are* some cases where words or morphemes historically ended in nasal sonants, which have been lost in some dialects but whose effect on the following word is still evident.

    So yes!

    According to Gruzdeva, the changes take place only word internally, with postpositions, and in the combinations attribute+noun and direct object + transitive verb; very interestingly the alternations are not quite the same in these cases.

    This is a fascinating parallel; still not quite the same as the Celtic and Mande, though, where the conditioning finals have been comprehensively lost across the board.

  80. David Eddyshaw says:

    I should know by now that all I really needed to do was wait a few moments for a better answer to come from John Cowan …

    So the presumed lost verb prefix /ɪ/, and the lost final nasal segments in Nivkh are analogous in their effects to the lost final nasals of SW Mande, and the lost final practically-everything of Insular Celtic, but the consonant alternations in general are conditioned by segments still very much present. Moreover the cases where it occurs are intra-word, if one cheekily stretches the term by regarding not only noun+postposition as a “word” but attribute+noun and object+verb. Even if this is a bit much, it’s not unreasonable to imagine a particularly close phonological connection in these particular cases without having to dream up a proto-Nivkh in which all word boundaries in general were vague. Just as well – Nivkh doesn’t look a very promising candidate for such treatment …

  81. David Eddyshaw says:

    My argument (that Insular Celtic was a language that lacked clear phonological marking of morphological words, but was segmentally not a typical type of such a language) could quite easily get circular (it must have been like that, because of mutations, and it must have been atypical, because mutation systems are uncommon.)

    So far I’ve got the following non-circular arguments:

    1. The development of conjugated prepositions in exactly these languages suggests previous wobbly word boundaries. VSO languages often have conjugated preposition-like constructions, admittedly, but typically they are more like the possessive constructions of nouns, which in Celtic they most certainly aren’t.

    2. Brythonic and Goidelic differ altogether in the position of word stress, so at least there is no insuperable difficulty in assuming word-demarcating stress was not a feature of the protolanguage.

    3. Most languages with a very notable mismatch between phonological and morphological “word” (ie not just in marginal cases like a closed set of clitic particles) favour a simpler segmental structure with many open syllables (this has the virtue of being falsifiable!)

    The point of 1-2 is to make the scenario of the development of the mutations plausible in this particular case; the point of 3 is to try to explain why it nevertheless hasn’t happened often across the world.

    If it really is the case that of those Mande languages which have lost word-final nasals, those that have initial mutations are more likely to have syllable-based tonal systems and those that lack mutations are more likely to have word-based tonal systems, it would help to make my basic assumptions more plausible. With only three data points, I’ve no real idea, though.

  82. David Eddyshaw says:

    Well found – thanks.

    I think there is maybe a bit too much readiness in this to assume that if two languages develop a similar feature, the explanation must be contact. While this is possible, it becomes progressively less persuasive as the feature is cross-linguistically common in any case, and the more so if there is a lot of doubt over the actual timing of the changes.

    With Old Irish in particular, it seems to be clear that the first syllable stress has induced all the major vowel changes we know and love and which make its verbs such a delight. Even if we didn’t have Ogham, the fact that analogical changes have applied so little in OI shows this must be *recent*. Therefore even if the Celtic language did have initial “stress” previously, it probably wasn’t like the stress of Old Irish. I think this is hard to square with any putative effect on Germanic, or even with the idea that this was an areal feature of Western Indoeuropean.

    With something as hard to see in written records as stress, I would be reluctant to believe we can know much securely unless we can see segmental changes which could be attributed to it (as with older Latin.)

  83. “Tun” tut man nicht

    Listen tae the teacher, dinna say dinna,
    Listen tae the teacher, dinna say hoose,
    Listen tae the teacher, ye canna say munna,
    Listen tae the teacher, ye munna say moose.

  84. On Celtic stress: if I correctly recall the discussion of Gaulish stress in Lambert’s “La Langue Gauloise”, Gaulish doesn’t seem to have had initial-only stress. The little that can be deduced is based on the development of French place names of Gaulish origin, where in many cases the stress needed to get the modern French forms is different from what one would expect if Latin stress rules applied, but still in many cases non-initial. I’ll be at home next week with access to my books, if anyone is interested, I can look it up.
    The idea that Celtic was stress-initial seems to me to be on shaky ground; except for Old Irish, we actually have no other Celtic language where initial stress can be observed.

  85. I’ll be at home next week with access to my books, if anyone is interested, I can look it up.

    I’m pretty sure I’m not the only one who is interested, so if it’s not too much trouble, please do.

  86. David Eddyshaw says:

    I, too, would be very interested, Hans.

  87. David Eddyshaw says:

    As further evidence for blurry word boundaries in Insular Celtic one could perhaps also add the so-called “infixed pronouns”, as in Old Irish

    ro-m-gab “he has taken me”
    no-t-erdarcugub “I shall make thee famous”
    Immu-m-rui-d-bed “I have been circumcised” (ouch)

    where the pronoun occurs between a preverbal preposition or particle and verb, and the whole thing is a phonological word in Old Irish.

    Middle Welsh has similar but less exuberant phenomena:

    neu-s rodes “he has given it”

    Although this fits the pattern, it’s a lot easier to parallel outside Celtic than conjugated prepositions, of course (one thinks of all those proclitic pronoun chains in front of French verbs …)

  88. David Eddyshaw says:

    Mentioning rom’gab (stress on the second syllable) reminds me of the whole prototonic/deuterotonic thing in OI: when a preposition or preverbal particle like the perfect “ro” combines with a verb, the stress falls on the second element; if two or three prepositions combine with a verb (OI is like that) the stress falls on the second preposition; *but* (because that would be far too simple, and this is Old Irish) there are also syntactic circumstances (imperative, archaic verb-final constructions) and various conjunctions and particles which cause the stress to go on the *first* preposition instead.

    The stress in OI marks the beginning of a more tightly integrated part of the word (the infixed pronouns I mentioned must precede it) but it’s *not* the case for example that the “prototonic” forms are always preceded by some element which is acting like the unstressed first preposition of a deuterotonic form.

    Interestingly, in imperative [preposition + verb] forms, the stress goes on the preposition *unless* there is an infixed pronoun between, when it goes on the verb.

    So – what groups of consecutive morphemes ended up being marked as an Old Irish core phonological word, by initial stress, is subject to quite complex syntactic rules. On the one hand, this is maybe another bit of evidence for wobbly word boundaries in the protolanguage; on the other, it makes it difficult to posit a simple initial stress rule already in that protolanguage.

  89. David Marjanović says:

    No time to catch up now.

    where the pronoun occurs between a preverbal preposition or particle and verb, and the whole thing is a phonological word in Old Irish.

    Reminds me of Gothic, where such things as (I forgot the actual examples) “and he begat” came out as “be-and-he-gat”.

  90. David Eddyshaw says:

    Mark 8:23 ga-u-hwa-sehwi “if he could see anything” (gasaihwan “see”)
    Mark 16:8 diz-uh-than-sat “and then settled on” (dissitan “settle on”)

    According to Wright, the verb prefix/preposition is unstressed normally, except in cases like these where it is separated from the verb, when it is stressed. (Not sure what his evidence was.) It makes sense: the prefix/preposition and its verb are treated more as two separate words when material intervenes between them, not surprisingly. Same thing in Greek tmesis.

    On the face of it, the Irish does the exact opposite: imperative
    ‘to-mil “eat” vs do-s-n-‘gniith “make them”, but the forms with the infixed pronouns actually are just following the usual rule whereby the OI verbal complex gets stressed on the second of a series of preverbal prepositions if there’s more than one, or on the verb itself if there is only one preverbal element – it’s the “prototonic” uninfixed imperative forms which deviate from the common form.

    The basic principle of the Irish system is just that the *first* of the prepositions/particles attached in front of a verb is relatively loosely bound, which is why it doesn’t have the stress on it, because that would mark it as part of a close phonological word with the following verb. It makes sense that even in the marked “prototonic” cases where, contrary to the usual rule, such prefixes *are* bound closely to the verb, if there are, nonetheless, infixed pronouns, they will disrupt this close binding and the form will default to the usual “deuterotonic” pattern.

    Simpler yet, you could think of “deuterotonic” OI forms as in fact having two stresses, one on the second element and another, weaker, on the first. Both these stresses would mark phonological words, but the first would be a proclitic *word*. Then the effect of the infixed pronoun is quite simply tmesis: it makes two words out of one, similarly to the Gothic.

    The fundamental oddity of the Irish system lies in the fact that there *are* prototonic words at all, contrary to the usual compounding rule. Quite a lot of them can be explained away by saying that the conjunction or particle in front of the prototonic form is behaving like the first preposition in a series, the only “loosely bound” one allowed; but there are other cases (imperative, replies to questions, the archaic verb-final construction) that won’t fit.

    The fact that the imperative in particular is one of the exceptions makes me wonder if originally pre-Old Irish had something like a system with one *tonally* prominent syllable not per word, but per phrase (a bit like the system in Japanese and some Basque dialects, but without different lexical individual tone patterns in words); this changed character into a stress (as in Greek) not too long before the emergence of the Old Irish we know and love, and unleashed all the segmental changes that gave the language its familiar shape.

  91. David Marjanović says:

    Mark 16:8 diz-uh-than-sat “and then settled on” (dissitan “settle on”)

    Thanks, that’s what I was thinking of. Fun fact: the German cognate (sich) zersetzen means “decay”. 🙂

  92. David Marjanović says:

    “[…] syntactically governed word-initial segmental alternations, generally infrequent but omnipresent at least in Celtic and Berber, and also heard in Nias (Malayo-Polynesian, Austronesian) and in Iwaidja and Marrgu (Iwaidjan, Australian); […]”

    My dialect turns /b/ between vowels into /v/. This seems to be a rather recent development*, and I think it’s even somewhat optional in that retaining /b/ between vowels, though rarely heard, doesn’t sound outright wrong. At the end of a word, /b/ does turn into /v/ if a clitic or perhaps any personal pronoun that begins with a vowel follows;

    Not when the pronoun is stressed, coming to think of it.

    I’ve been partial myself to the hypothesis of vowel balance as a sprachbund effect, but it would be a stronger hypothesis if the effect were strongest in the northernmost dialects.

    I suppose today’s northernmost dialects are a much younger phenomenon and are spoken far north of where the contact zone between Germanic and Sámi used to be.

    Latin stress is different from Brythonic, which was always penultimate.

    Lots of Latin words only had two syllables, and all of those were stressed on the penultimate… which was also the first syllable, making reinterpretation of the first as the penultimate easier (like in Polish, I’m sure).

    Also, did Proto-Brythonic still distinguish long and short vowels? If not, it wasn’t able to import the Latin stress rule (penultimate if it is long, antepenultimate if penultimate is short), and would most likely have chosen either the penultimate or the antepenultimate for all words; bisyllabic words would cause a bias for choosing the penultimate.


    Uh, sorry, that’s the causative. *hangs head in shame*

  93. Did Proto-Brythonic still distinguish long and short vowels?

    Gaulish certainly did. In fact, it had the same phonology as Latin, except for lacking /f/ and probably /h/ and adding the diphthong /ou/. Some varieties also had /θ/, known as the tau gallicum.

  94. David Eddyshaw says:

    “did Proto-Brythonic still distinguish long and short vowels”

    Yes. But it has no bearing on the stress rule.

    I don’t think the fixed penultimate stress of Brythonic really *needs* an explanation. It’s hardly a rare phenomenon.
    I would be more prepared to accept contact as an explanation if Brythonic had a system *exactly* like Latin.

    It’s interesting that Polish and Czech, which are close enough for a fair bit of mutual intelligibility according to my Czech colleague, differ in stress pattern pretty much as Brythonic and Goidelic. Stress just isn’t one of those features which strongly persist while other features of a language change. It’s relatively volatile.

    Welsh itself has changed since the Middle Ages, from fixed word final (after the loss of Brythonic final syllables) to fixed penultimate (again.)

  95. David Eddyshaw says:

    @David Marjanović:

    Very interesting link – thanks.

    The word initial segmental changes of Berber are not remotely like the consonant mutations of Celtic. They basically affect the *vowels*. I know nothing of the other example languages, but the Wikipedia articles on Nias and Iwaidja have something to the point.

    There’s frustratingly little in the Iwaidja one, but what there is suggests that the system developed by loss of the segmental forms of possessive prefixes, which would only need you to suppose that the possessive prefixes were closely bound to their nouns in the protolanguage rather than needing to conjure up another language in which word margins were blurry in general within phrases.

    The Nias one looks similar but with lost case prefixes rather than possessive prefixes.

    Both would then be rather like the Fulani case, rather than Insular Celtic or SW Mande. Fascinating, though. The difference between this and IC/SWM would really be just a matter of degree: how many words in the protolanguage got treated as a phonological unit – eg just proclitic particle + noun, or more extensive groups like noun + adjective or verb + noun.

  96. David Marjanović says:

    Some varieties also had /θ/, known as the tau gallicum.

    Wikipedia prefers interpreting it as [ts]… though considering that [ks] seems to have become [xs], maybe [ts] became [θs] and then [θ]…

  97. David Eddyshaw says:

    I see that it’s possible to get hold of quite a bit on Nias, after a bit of searching. It looks very interesting indeed (no language with bilabial trills is to be treated with anything other than profound respect.) There seem to be a lot of phenomena very directly bearing on these questions – another language to add to my select club of Insular Celtic and Southwest Mande. Off to do some reading!

    (Profound thanks to David M for pointing me at it)

  98. David Eddyshaw says:

    Nias was the subject of a PhD thesis by Lea Brown at the University of Sydney, which looks extremely competently done.

    Syllables are all CV(V). Protoaustronesian syllable-final consonants have been lost. Most content words are disyllabic or trisyllabic; with few exceptions stress is on the penult, and suffixes (but not prefixes) count as part of the word in this connexion, shifting stress to the right.

    Within a phrase, just one stress is prominent, and this is the locus of intonation changes: a pitch fall in statements and a rise in questions occur following that syllable. This fits my general scenario, I guess, but I think it’s to some extent just a particularly accurate account of a very common linguistic pattern. You could adduce parallels in English, even.

    There are some signs of interword lenition of post-vocalic stops in an earlier stage of the language, notably in the distribution of b vs corresponding labiodental approximant, where the latter evidently arose initially as a postvocalic allophone but now also occurs word initially as a result of various phenomena among which seem to be inter-word sandhi in some set phrases.

    The “mutations” as such however, Brown hypothesizes as resulting from a former clitic particle n(a)-, adducing various pretty convincing bits of evidence.

    So I can fit the language to my Procrustean bed of syllable-and-phrase more than word-oriented phonology, and as a (supposedly) typical type with predominantly (indeed in this case invariably) open syllables. But whereas in SW Mande, the system is atypical because syllables can end in a nasal as well as a vowel, in Nias there is a nasal which has similarly led to the rise of a grammaticalised mutation system, but in this case it was originally a separate grammatical particle which lost its independent syllabicity.

  99. David Eddyshaw says:

    Although the Berber grammatical initial segment alternations don’t involve consonant changes, there are some parallels. From Jeffrey Heath’s book on Tamashek:

    Tamashek nouns mostly begin in a vowel if masculine, tV- if feminine, and i- in the plural. These vowels get shortened or reduce to schwas when the noun is in close connexion with a preceding word, viz in compounds, after a preposition, or in the combination verb + subject (but not verb + object.)

    Tamashek has what Heath calls “accent”, the phonological nature of which he doesn’t say much about, but is presumably basically stress. At any rate a syllable either has stress or not and there is only one accented syllable per word. Be that as it may, the actual behaviour of this accent is very much more like the pitch accent of Ancient Greek or Japanese. Inflected verbal forms and many noun forms do not have an intrinsic accent of their own, but are realised with a “default” accent which is basically recessive, on the antepenult if there is one, and the initial syllable if not. BUT in this latter case, the accent is “unstable” and is thrown back onto the last syllable of the preceding word if the two are

    Verb + noun (subject or object in this case)
    Preverb + verb
    Preposition + noun
    Demonstrative + verb or participle in a relative construction
    Numeral + noun.

    But nouns cannot form an accentual phrase with any *following* word.
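    The default-accent rule as summarised above can be written out as a tiny sketch. This models only what this comment says (antepenult if there is one, otherwise an "unstable" initial accent); the function name and the return shape are made up for illustration and are not taken from Heath's grammar.

```python
def default_accent(num_syllables):
    """Place the recessive 'default' accent as described in the comment above.

    Returns (index, unstable): index is the 0-based position of the accented
    syllable counted from the start of the word; unstable is True when the
    accent sits on the initial syllable only because there is no antepenult.
    Hypothetical helper for illustration, not an implementation of Heath's analysis.
    """
    if num_syllables < 1:
        raise ValueError("a word needs at least one syllable")
    if num_syllables >= 3:
        # There is an antepenult: accent it, and the accent stays put.
        return num_syllables - 3, False
    # No antepenult: accent the initial syllable, but mark it as unstable,
    # i.e. liable to be thrown back onto the preceding word.
    return 0, True


for n in (1, 2, 3, 4):
    print(n, default_accent(n))
```

    Whether an unstable accent is actually thrown back then depends on the syntactic pairing (verb + noun, preposition + noun, and so on), which this sketch leaves out.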

    Although the conditions for grammatical initial segmental change unhappily don’t neatly match with those that determine if a group can be an accentual unity, it’s clear enough that Tuareg qualifies as another wobbly word boundary language.

  100. David Eddyshaw says:

    Thomas Penchoen’s grammar of Tamazight implies that stress is non-contrastive in that language, though with the somewhat equivocal comment that stress “arises in syntactic groups.” Sudlow’s grammar of Burkina Faso Tamasheq doesn’t mention it at all. At least that presumably shows it isn’t very salient perceptually. Calling Dr Lameen Souag …

  101. David Eddyshaw says:


    To end up with an initial mutation system you need:

    1. A language with a particular tendency to run words together phonologically in phrases. I still think that on the whole it’s languages that *don’t* do this so much which are the exception; however there do seem to be independent reasons to think that some of the languages with grammatical initial mutations tend to the more wobbly word-boundary end of the continuum, eg for Insular Celtic the independent development of conjugated prepositions in both branches, infixed pronouns, the peculiarities of the prototonic/deuterotonic stress patterns in Old Irish, maybe lack of evidence for word stress in older Insular Celtic.

    2. Widespread *progressive* changes of -VC- and/or -C.C- within and between words, (of a kind which nevertheless do not lead to great loss of previous distinctions, or point (4) will apply all the more strongly.) It is possible that languages of type (1) are statistically less likely to display such changes, for example because they tend to have simpler prevailing syllable types than those where grammatical words are strongly demarcated phonologically.

    3. Complete loss of the conditioning prior elements of (2) word-finally.

    4. Failure to remove the effects of (3) by analogy with the “unmutated” forms. Ideally one would like to minimise this, as it’s a bit of a get-out-of-jail-free card for explaining away the cross-linguistic rarity.

    I think the cumulative improbability of 2,3,4 is probably great enough to explain why mutation systems are rare, while still allowing for a scenario where Common Insular Celtic might have been a language of a type to strongly predispose to the later development of mutations, without invoking an explanation involving later diffusion between the daughter languages (which is hard to make plausible once you start looking at the details) or some sort of mystical “drift.”

    OK, I’ll stop now.

  102. Some more initial consonant mutations for you, from Blust’s survey of Austronesian (§ In Sika (East Flores Island), initial consonants reflect the effect of now-lost person prefixes: *kita ‘see’ > 1sg ita, 2sg gita, etc., depending on whether the prefix had a nasal consonant. In Nakanamanga (Efate, Vanuatu) verbs change their initial consonants in some grammatical contexts, which come from former phonological conditioning.

  103. David Eddyshaw says:

    Many thanks, Y. There may be a fair bit of this in Austronesian … some more reading …
    Looks similar to the lost-prefix scenario of Iwaidja, and (conjecturally) Fulani.

    Erromangoan/Sye has initial verb root mutation, like Efate. It happens after future tense subject prefixes, and in various other tense forms:

    c-aruvo “he has just sung”
    y-em-aruvo “while he was singing”
    co-naruvo “he will sing”
    c-am-naruvo “he is singing”

    Although the “mutation” seems to be basically n-accretion, it entails changes in the initial consonants in some verbs, eg mv -> mp.
    Sye has developed quite an impressive degree of synthesis in its morphology by fusing old prefixes and suffixes to the stem, and losing final syllables too, giving it a look pretty unlike what one tends to think of as “typical” Austronesian.

    All this is relevant, but most of these examples are basically word-internal, at least if you stretch “word” to include subject and possessive prefixes. Mind you, it’s a question of degree; in these languages’ earlier histories “you will see” and “my foot” will have been phonological words; in more extreme languages, “you will see my big foot” could be a phonological word too, etc.

    Reflecting a bit on the question of whether languages which are more syllable-centric than word-centric might be less likely to have lots of closed syllables, (and thus less likely to have variant reflexes of consonants after vowels and after consonants because there would be few opportunities for the contrast even to arise): difficult to demonstrate without a decent measure of how syllable-centric (or phrase-centric) vs word-centric a language actually is. And quite possibly false anyhow.

    However: obviously a language is not likely to develop grammatically significant initial mutations as a result of external sandhi if *all* words in that protolanguage end in vowels, for example. (Nias is an exception because its protolanguage seems to have had a non-syllabic particle which functioned like a single intrusive consonant.)

    More broadly, an important determinant is going to be what word-final consonants are permitted, and then what becomes word-internally of consonant-clusters with those consonants as initial components, compared with the word-internal development of single consonants (which will be post-vocalic.) So for example in Classical Greek, words end in vowels, or in -n -r or -s; word-internally clusters like -nt- -rt- -st- don’t show a significantly different -t- from when the -t- just follows a vowel. So if at that stage Greek had undergone loss of all its final syllables one day, there would still have been no basis for the language to develop an initial mutation system. On the other hand, in modern Greek mp -> mb, nt -> nd, nk -> ng, and indeed in some dialects the result is just voiced stops, contrasting with the spirant reflexes of Ancient Greek b d g: Δεν κατάλαβα ðe ga’talava “I don’t understand”; you could imagine a scenario where if Modern Greek lost final syllables a mutation system might result, though probably even then there wouldn’t be enough of the consonant system affected to form the basis of a regular system which could resist analogical pressure. (In fact this only seems to happen across word boundaries in Modern Greek when the first word is a monosyllabic proclitic, so maybe not. Istanbul!)

    Lots of languages have undergone some sort of lenition of stops following vowels word-internally (Spanish padre etc) and lots of languages have lost earlier final syllables; if that was all it took, grammatical initial mutations should be springing up all over.

    The set of languages in which words are much more phonologically salient than phrases is perhaps too small to account for the rarity of mutations. I wish I could think of some relatively objective measure of how languages rank on the scale of how far morphological words coincide with phonological words etc, but although it’s not too hard to come up with extreme examples (German vs French) it’s not easy to think of a systematic way of measuring it.

    If you’re going to get a mutation system which resists the effects of analogical levelling and restoration of the phrase-initial consonant reflexes throughout, it will need to be fairly extensive and fairly transparent. If, for example, only p t k are affected, then it seems likely that the alternation in just three consonants alone won’t be enough to set up a stable system. Again, if the consonant changes result in a lot of reflexes falling together, it will result in a system too chaotic to persist (as a reductio, imagine a system in which all postvocalic obstruents go to zero – like French but even more so!) So this will impose limits on the kinds of consonant changes that a language will need for a viable mutation system to result.

    Again, as above, for this to happen, words must end in the protolanguage in a variety of different sounds which have different effects on a following consonant, so it is unlikely to happen in a language with no word-final consonants, say.

    Maybe these two sets of constraints are enough to explain the rarity of mutation systems. Perhaps it would be useful to think of individual languages which don’t have such systems. Why hasn’t French got initial mutations? Or Spanish? How was Insular Celtic different from Vulgar Latin in this?

  104. David Marjanović says:

    without a decent measure of how syllable-centric (or phrase-centric) vs word-centric a language actually is

    I have a paper which lists a lot of word-centric features in what the authors think of as German. I’ll post the list “tomorrow” (it’s 2 am, I’m very tired) and explain how a few kinds of German differ in them. Counting such features could lead to a reasonably objective measure.

    Why hasn’t French got initial mutations?

    People do make joke spellings like bonjour les zamis… if French weren’t written, maybe it’d have a limited system now, perhaps along the lines of the n- that inflected forms of the Slavic 3rd-person pronouns get when they’re preceded by prepositions, one of which ended in -n before all syllables were reinterpreted as open.

  105. David Marjanović says:

    I found it! It’s this pdf in German. Full bibliographical information is not included, but Google found it:

    Damaris Nübling & Renata Szczepaniak (2010): Was erklärt die Diachronie für die Synchronie der deutschen Gegenwartssprache? Am Beispiel schwankender Fugenelemente. Jahrbuch für Germanistische Sprachgeschichte 1: 205–224.

    Quick summary (some of the examples are mine):

    1) OHG inserted vowels into consonant clusters to “improve syllable structure” (get closer to the CV ideal), making things easier for the speaker: swimman ~ sowimman, burg ~ burug. Early NHG did the opposite, adding consonants to mark the boundaries of phonological words by consonant clusters, making things easier for the listener: niemand, Obst, willentlich, versehentlich, namentlich, eigentlich. This seems to lie behind the otherwise weird and unstable distribution of unetymological -s- between the components of compound nouns.

    2) OHG i-umlaut and other things that happened to vowels were a syllable-based phenomenon, making things easier for the speaker by making the nuclei of adjacent syllables more similar.

    3) Vowel quality and length were largely independent of stress in OHG. Nowadays there are separate vowel systems for stressed and unstressed syllables: 18 vowels incl. diphthongs bear at least some stress, while only 2 ([ə], [ɐ]) are fully unstressed.

    4) Phonetic and phonological processes were syllable-based in OHG (the given example is the HG consonant shift), but are word-based today. The phonological word has been regularized towards two syllables, of which the first is stressed.

    5) Long consonants “optimize syllable boundaries” and existed in OHG, but have changed into short ambisyllabic consonants, which “form bad syllable contacts”.

    6) OHG syllables apparently generally went up the sonority hierarchy; towards Early NHG, syncope and other things (like the consonant addition described in 1)) created extrasyllabic consonants, creating “word-positional information”.

    7) Right now, in unstressed syllables, syncope is creating syllabic nasals and liquids en masse; these, too, “stand beyond optimal syllable nuclei”.

    8) Other consonant phenomena are increasingly word-based, like aspiration of plosives at the beginnings of words and neutralization at their ends.

    9) The glottal stop didn’t exist in OHG, but marks the beginning of a word today. It’s unclear how old it is; “some syllable-language dialects like Swiss German” still lack it.

    My comments:

    1) One might object that the -n+t+lich phenomenon looks like one of those globally common insertions (sr > str, mr > mbr, ns > nts…) that make things easier for the speaker. But then we’d expect /d/, not /t/. This additional “strengthening” has parallels in my dialect: Hemd, niemand and in die come out as /hemt/, /nɛɐ̯mt/ and /int/, going against the Central Bavarian lenition of medial and final /t/ to /d/; and the diminutive suffix /l̩ ~ ɐl/ not only generates plosives (Stern gets /d/), but also strengthens existing plosives (Pferd has its /d/ changed into /t/) and formerly existing ones (Lamm “lamb” gets /p/, which is otherwise almost a loanword phoneme!) where possible (“little lamb” and “little lamp” are homophones).

    2) Umlaut is still active in Standard German, though; Programm – Progrämmchen.

    4) The HG consonant shift “improved” many syllables, but it overshot, making others worse.

    5) Consonant length is lost in the Low and (not sure if all) Middle German dialects and the Standard German phonologies used in the corresponding regions, but it’s alive and well in the Upper German dialects* and in the Standard German phonologies used south of the White-Sausage Equator! Austrian Standard German comes close to the Standard Swedish ideal of “compensatory length” (each stressed syllable has a long vowel or a long consonant/cluster, everything unstressed is short), except that long fricatives generally remain behind long vowels/diphthongs and that there are loans with short stressed syllables (Panama, Kanada, Ebbe – long /b/ doesn’t exist) which always sound overly hasty to me.

    7) It’s interesting that the authors consider this to be occurring now. Syllabic [m̩ n̩ l̩] (and [r̩] in Alemannic dialects) seem to be fully established all over the place to me – but there’s a twist. Large parts of Germany have turned the indefinite article ein into [n̩], and this seems to be spreading; but within that region, many people can’t actually cope with this and are now turning it into [nən ~ nɛn]. Fascinating (and bewildering) blog thread here; one comment even says “‘n’ [as a word] is hard to say, doesn’t have a vowel after all 😉 “, while others say they only know nen from the Internet, in writing.

    8) Aspiration has hardly made it out of (formerly) Low-German-speaking areas. The whole point of the HG consonant shift was to get rid of it. I was explicitly taught to aspirate in one of my first English lessons. Final fortition is a northwestern* phenomenon (for a broad definition of “Northwest”), and it isn’t word-final at all, it’s syllable-final: I’ve heard Sydney [ˈzɪtni] and Simbabwe [zɨmˈbapvɛ] (with a remarkable [pv]).

    9) Except after a pause (where it’s optional even in French), the glottal stop doesn’t exist in the whole south. Farther north, it’s a syllable-based feature: it is inserted in front of stressed syllables that would otherwise begin with a vowel – Na[ˈʔ]omi, Jo[ˈʔ]achim, [ˌʔ]Astero[ˈʔ]iden und Kometen (note lack of [ʔ] before unstressed und). Incidentally, it’s really weird to call Swiss German a single dialect.

    * The exceptions, as usual, are spoken in Carinthia, where the whole sound system was reinterpreted in Slovene terms.

  106. David Marjanović says:


    2) Umlaut is still active in Standard German, though; Programm – Progrämmchen.

    But that’s morphological generalization, not a phonologically predictable process anymore.

  107. David Marjanović says:

    On point 4 above: p. 69 of this paper by Nübling gives examples of OHG and earlier sound shifts that supposedly brought syllables closer to CV (the West Germanic consonant lengthening, i-umlaut, vowel harmony of epenthetic vowels, occurrence of epenthetic vowels and consonants, assimilations, Notker’s law of initials) or “at least operate in a syllable-oriented way” (“the entire” HG consonant shift). I haven’t yet figured out what she means by the HG consonant shift happening in a syllable-oriented way.

  108. David Eddyshaw says:

    Thanks, David M. That’s thought-provoking …

    I suppose a syllable-centric language is one in which there is as it were a democracy of syllables, rather than a particularly privileged syllable or syllables within each word (or phrase … it occurs to me that syllable-based is *not* synonymous with phrase-based.) The obvious way for a syllable to be privileged is by a suprasegmental like tone or stress; but it could be segmental. For example if there is a different set of possibilities for word-final consonants than for syllable-final consonants in general, or if clusters are permitted across word boundaries that may not occur word-internally (or vice versa.) Or if certain syllables show a greater or lesser range of permitted vowels than others.

    Features like these are likely to be associated with stress etc but there is no *necessary* correlation. In fact I can think of a case where there was imperfect synchronic correlation, in Menomini, where there is a system of vowel lengthening in open stressed syllables, but in which the pattern no longer works in a way which fully correlates with stress, partly because of loanwords but also because, for example, pronouns and vocative forms do not participate. Bloomfield calls these types “static” words in his grammar.

    Interesting thought about long consonants vs ambisyllabic short consonants.

    Umlaut strikes me as almost by definition a word-level phenomenon (like its big brother vowel harmony.) I suppose you could characterise it in the OHG context not so much as a feature of a syllable-based language but as a symptom of a change from syllable to word centric. Welsh, too, shows extensive umlaut phenomena, eg

    caraf “I love” ceri “you love” …
    castell “castle” pl cestyll
    bardd “poet” pl beirdd

    (The equivalent in Irish would be more the development of the plain/palatal consonant distinction.)

    Come to think of it, an initial consonant mutation system also must be, *not* a feature of a language with blurry word boundaries, but the result of a *change* in character of a language from syllable- or phrase-centric to word-centric. (A language with word-initial mutations is practically by that very token word-centric – the beginning of words is highly marked, at least the beginning of full open-class words like nouns and adjectives and verbs.)

    The synchronic word-blurriness of French, therefore, is *not* a reason to expect it to have mutations: though a more word-centric language descended from French might!

    The *acquisition* of a strong word-stress system in a language in which it was previously *absent* (or weak) might well be important, and this could well be part of what has been going on in Insular Celtic, of course.

    For a mutation system to develop, there is obviously an ordering constraint: the change to a language in which words are more discrete phonologically must happen after the development of the widespread allophonic consonant changes, and the loss of the conditioning word-final elements must follow the change to discrete-word type (or how do they “know” they’re word-final?)

    *Change* is a major consideration which I hadn’t thought about properly.

    This also could shed light on the paradox that many languages are phrase- or syllable- centric yet few have grammatical initial mutations: the latter are a feature of systems which have *changed* from syllable- or -phrase- centric to word-centric, or at least *more* word-centric, *after* developing a system of extensive allophony of consonants depending on whether they are preceded by vowels or nasals or whatever.

    For the Insular Celtic languages this makes sense. I’ll have to think about SW Mande and Nias (and Tuareg) a bit.

  109. David Eddyshaw says:

    Modern Welsh is actually an excellent example of a language with an imperfect correlation between stress and “segmentally privileged” syllables (don’t know why I thought of Menomini when Welsh was under my nose.)

    Since Middle Welsh the stress has shifted from the ultima to the penult; but a range of sound changes still operate which privilege the final syllable, not the penult. One result which confronts every foreign learner at the outset of his studies is the pronunciation of the letter “y”, which is like Russian ы in final syllables but schwa-like everywhere else (and in proclitics like “yn” “in” and the definite article “y”.)

    This is changing further, though, with widespread loss of older vowel distinctions (still represented in the orthography) in unstressed final syllables.

  110. David Marjanović says:

    Welsh, too, shows extensive umlaut phenomena

    …I had no idea.

  111. David Eddyshaw says:

    With Nias, the development of the approximant version of “b” from postvocalic allophone to independent phoneme, partly as a result of interword sandhi (as I mentioned) is on reflexion evidence for a change *from* a less word-centric protolanguage; if there were no change, why would there have been a reanalysis of the initial consonant of the second word?

    But for Nias in general, I probably want to retreat to my original position that the mutations just represent the effects of a proclitic particle which lost syllabicity and then all independent segmental representation, as with the pronouns in the Australian example and presumably with Fulani plurals; this need not reflect any more widespread unusual fuzziness of word boundaries in the protolanguage.

    Looking again at the Zialo grammar, it describes the fundamental element of prosody as a metric foot of one or two syllables; in the disyllabic types the inventory of medial consonants is far poorer than foot-initially; indigenous word roots are usually of one metric foot. So the segmental evidence is in fact much against contemporary Zialo being syllable-centric, and I think this applies to Mande in general. So variation along this axis could not explain why SW Mande has mutations and Mandinka (say) does not, [although it is true that the suprasegmentals are different, with Mandinka having a word-based tone system rather than syllable based like Zialo.]

    It’s a lot simpler than my earlier fanciful contrast: Mandinka has *not* lost syllable-final nasals in general, and hasn’t developed any lenition of post-vocalic consonants either (between feet), so any supposed contrast with SW Mande is irrelevant. The ingredients for a mutation system just aren’t there in Mandinka.

    The lack of a comparator means that it’s just speculation that proto-SW Mande was not word-centric, though I suppose you could say that this is probable if the metric foot is the basic prosodic unit in Mande generally; it should mean that any Mande language that developed different reflexes of consonants after vowels and after the nasal and then lost syllable-final nasals might be expected to develop mutations.

    In fact this is potentially falsifiable: if there are Mande languages that have developed different reflexes of consonants after vowels and nasals, and have lost syllable final nasals, they should show some evidence of a mutation system unless there is independent evidence that they have also lost the metrical foot as the basic prosodic unit.

  112. David Eddyshaw says:

    The Insular Celtic languages have word marking stress which is basically predictable from the segmental forms. This is what you might expect if a language has comparatively recently *developed* word marking stress.

    On the other hand, an unpredictable stress system must be inherited (sometimes from a protolanguage in which it was predictable, cf Spanish and Latin.) And if the protolanguage had word stress, then a mutation system is unlikely to arise, because a change of type from phrase-marking to word marking hasn’t happened. This can probably be generalised to pitch-accent or any suprasegmental word-marking feature.


    If a language has a suprasegmental word marking feature which is not (in general) predictable from the segmental structure by rule, the language cannot have an initial mutation system.

    This is not a terribly bold prediction given the rarity of mutation systems, but, again, it’s potentially falsifiable.

  113. David Eddyshaw says:

    Tuareg is a counterexample if the vowel changes of noun initial syllables are regarded as “mutations.” The suprasegmental “accent” feature that Heath describes is predictable in finite verbs but not nouns. On the other hand, the rules I described above show that Tuareg has fuzzy word boundaries (as far as accent is concerned) *currently* rather than implying that it has developed from an earlier Berber language with even fuzzier word boundaries. Moreover although the initial-syllable changes may quite possibly have arisen historically as sandhi phenomena of some sort they have no obvious word-internal analogue as far as I can see, unlike the consonant mutations in the other languages. If it arose as (say) a shortening of vowels in the syllable following a stress, that would necessitate a former distribution of the accents unlike the current one.

  114. David Eddyshaw says:

    “an unpredictable stress system must be inherited”

    … as an English speaker I should perhaps be more careful of sweeping generalisations. “Must be inherited unless wholesale borrowing of foreign vocabulary has messed up all the patterns …”

  115. David Eddyshaw says:

    Nivkh seems to have unpredictable stress and/or tone (very little about it in the Gruzdeva sketch) but I think we decided between us that Nivkh doesn’t have a mutation system in the sense we’ve been discussing.

  116. David Marjanović says:

    I keep forgetting: here is the predicted Germanic word language with a huge vowel phoneme inventory (22 monophthongs, 13 diphthongs – not counting the 10 combinations of a vowel followed by /j/ or /β̞/), lots of monosyllabic words, and a pitch-accent system with two tones (which often differentiate the singular of nouns from the plural). It’s a Limburgish dialect spoken right next to the northeastern border of Belgium.

  117. David Marjanović says:

    And something else to ponder about syllable vs. word languages:

    Non-rhotic German, generally speaking, has short and long diphthongs that end in [ɐ̯]; they form from /r/ preceded by a short or a long vowel. In the north, but not in the south, long diphthongs only form when the /r/ belongs to the same syllable as the preceding vowel and therefore disappears in the process; this prevents overlong syllables from forming – a syllable-language feature –, but also marks the end of a phonological word (or anyway its last syllable) – a word-language feature.

    Example: Tor – Tore, singular and plural of “gate” and “goal in football, handball and hockey”.
    North: [tʰoːɐ̯] – [ˈtʰoːʁɵ]
    South: [toːɐ̯] – [ˈtoːɐ̯ʀɛ]

    (The southern version sort of looks as if its /r/ in the plural were intrusive; and indeed, the cases of intrusive /r/ I know of are all southern. But I know way too little comparative German to calculate a confidence interval on this finding. 🙂 )

  118. Here is the promised section on Gaulish stress from Pierre-Yves Lambert, “La Langue Gauloise”, Éditions Errance, Paris, 2003. It’s a bit different from what I remembered – I read that book about 10 years ago. I start with the French text (p. 48), then give an English translation (did it myself, so please point out any mistakes), and a few comments.

    L’accent en gaulois

    Nous n’avons pas beaucoup d’indices sur l’accent gaulois; quelques formes du latin de Gaule ont un comportement spécial. On a depuis longtemps relevé les deux traitements que présentent les noms des cités gauloises, un accent antépénultième donne Rennes, Bourges, et l’accent pénultième donne Redon, Berry.
    Bitúriges > Bourges
    Bituríges > Berry
    Ainsi Nemausus donne (accent pénultième) Nemours, mais avec accent sur l’antépénultième, Nîmes; Condate donne Condes ou Condé, Arelate donne Arles ou Arlet.
    Autres exemples de formes accentuées sur l’antépénultième: Caturiges > Chorges, Cambo-ritum (« le gué courbé ») Chambord, Eburovices Evreux, Durocasses Dreux, Bodiocasses Bayeux…
    En fait, les formes avec accent antépénultième ne sont pas celles qui posent problème : on en avait aussi en latin et même en latin tardif (ex. : hóminem > homme). Le problème est de savoir pourquoi certains de ces mots sont devenus accentués sur la pénultième, avec allongement de la voyelle pénultième (Cóndate > Condáte > Condāte > Condé). Il n’est pas sûr que le phénomène remonte vraiment au gaulois : cela peut être dû à des disparités socio-linguistiques dans la société gallo-romaine.

    Stress in Gaulish

    We don’t have many clues about Gaulish stress; some forms of the Latin of Gaul have a special behavior. One has long noted the two treatments that the names of Gaulish cities present, an antepenultimate stress gives Rennes, Bourges, and the penultimate stress gives Redon, Berry.
    Bitúriges > Bourges
    Bituríges > Berry
    Thus Nemausus gives (penultimate stress) Nemours, but with stress on the antepenultimate, Nîmes; Condate gives Condes or Condé, Arelate gives Arles or Arlet.
    Other examples of forms stressed on the antepenultimate: Caturiges > Chorges, Cambo-ritum (“the curved ford”) Chambord, Eburovices Evreux, Durocasses Dreux, Bodiocasses Bayeux…
    Actually, the forms with antepenultimate accent are not those that pose a problem: those were there also in Latin and even in Late Latin (e.g.: hóminem > homme). The problem is to know why some of these words have become accented on the penultimate, with lengthening of the penultimate vowel (Cóndate > Condáte > Condāte > Condé). It is not certain that the phenomenon really goes back to Gaulish: this may be due to socio-linguistic disparities in Gallo-Roman society.

    While it is certainly true that Latin knew stress on the antepenultimate, this was only true for words where the penultimate was short. As length was not normally indicated in Latin writing, we partially have to rely on the stress indicated by the modern forms of the names or on etymology to establish Gaulish vowel length; in this case, relying on the modern stress can become a circular argument. But there ought to be no doubt that names like Nemausus or the names in –casses ought to have penultimate stress in accordance with Latin rules, so the antepenultimate stress indicated by some of the modern French names needs to be explained. I’ve also generally seen the “i” in –riges described as long; again, that would demand penultimate accent according to Latin rules. Of course, as the author states, the indicated stress may not be the Gaulish stress but due to some differences between Gallo-Roman and Standard Latin; that we’ll probably never know.
    As for the point that started the discussion: forms like Némausus and Cóndate could also indicate word-initial stress, but most forms with more than three syllables rule that out. The only exception is Arelate, where forms like Arles actually don’t indicate antepenultimate stress, but initial (or ante-antepenultimate) stress, something that Lambert seems not to notice. So perhaps Gaulish didn’t have a fixed stress, or prefixed nouns (prefix are-) behaved differently?
    I’ll also post this at my blog, for future reference.

  119. The Greek equivalent of Nemausus was Νέμαυσος, with antepenultimate accent (as one would expect), and I’m guessing that’s the source for that name, since the city is in the Greek-colonized part of France.

  120. There were at least two places called Nemausus in Gaul, Nîmes in the area of Greek colonisation and Nemours (showing the penultimate accent variant) in the Île de France, far from any Greek colonies. The name is supposed to be Celtic, and the question would be if the Greek stress in Νέμαυσος renders the Gaulish stress or the Greek just put the stress where it pleased them best. In accordance with Greek rules, the stress could have been on any of the three syllables, so it’s at least possible that Νέμαυσος reflects what the Greeks heard from their Gaulish neighbours.

  121. In accordance with Greek rules, the stress could have been on any of the three syllables

    True, of course, but there are patterns, and I think anyone familiar with Ancient Greek would be surprised if that particular word-shape showed up with penult accent. I agree it’s entirely possible that Νέμαυσος reflects what the Greeks heard from their Gaulish neighbors.

  122. The Romans resolutely ignored Greek stress: consider Αλέξανδρος > Alexánder. So it doesn’t surprise me that the Greeks would ignore Gaulish stress either.

  123. Apples and oranges. The Romans had no choice but to ignore Greek accent (not stress); the stress on Latin words is fixed by an iron law. The Greeks, having free accent (on any of the final three syllables), could reproduce a foreign pattern if they noticed it and felt like reproducing it.

  124. But as you say, Greek stress isn’t entirely free, though what the rules or tendencies might be, I have no idea.

  125. @ John Cowan – it’s relatively easy – Greek accent can go on any of the last three syllables if the last syllable is short, and on the last two syllables if the last syllable is long. Verbs retract the accent as far back as is possible under these rules, but nouns are free to have the accent on any of the admissible syllables, so in theory it could be Νέμαυσος, Νεμαῦσος, or Νεμαυσός.
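    The limitation just described can be sketched in a few lines. This is only an illustration of the rule as stated in this comment; the function names are made up, positions are counted from the end of the word (1 = last syllable), and the length of the final syllable is supplied by hand.

```python
def admissible_accent_positions(num_syllables, ultima_long):
    """Syllables that may carry the accent, counted from the end
    (1 = ultima, 2 = penult, 3 = antepenult).

    The window is the last three syllables when the ultima is short
    and the last two when it is long.
    """
    window = 2 if ultima_long else 3
    return list(range(1, min(window, num_syllables) + 1))


def recessive_position(num_syllables, ultima_long):
    """Position chosen by forms that retract the accent as far as allowed."""
    return max(admissible_accent_positions(num_syllables, ultima_long))


# A three-syllable noun with a short final syllable, like Ne-mau-sos:
print(admissible_accent_positions(3, ultima_long=False))
# The same shape with a long final syllable loses the antepenult option:
print(admissible_accent_positions(3, ultima_long=True))
```

    For a noun shaped like Νέμαυσος the first call leaves all three positions open, so the rule alone cannot decide between the three candidate forms.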

  126. The Classical Greek accent was pitch, not stress. It changed sometime in the Roman period I think.

    And mustn’t the Romans have rendered */bou’dika:/ as Boudicca with double c in order to keep the stress in the same place? We know it wasn’t geminated in Old British because the Welsh is Buddug, not *Budduch.

  127. And mustn’t the Romans have rendered */bou’dika:/ as Boudicca with double c in order to keep the stress in the same place?

    An excellent point, and I don’t remember seeing it before.

  128. Hans: Sure, I understand that much. But Hat said “there are patterns, and I think anyone familiar with Ancient Greek would be surprised if that particular word-shape showed up with penult accent.” I wanted to know what those patterns were, and why it’s surprising.

  129. I can’t tell you what the patterns are; all I can tell you is that after a great deal of exposure to the language I have a strong sense that of the three potential forms, Νέμαυσος, Νεμαῦσος, or Νεμαυσός, the second is by far the least likely. Similarly, after long immersion in Russian I usually have a good sense of where the stress is likely to fall in a surname, but I couldn’t tell you why.

  130. Is Νέμαυσος even possible? It appears to violate the Dreimorengesetz.

  131. … so in theory it could be Νέμαυσος, Νεμαῦσος, or Νεμαυσός.

    All these three are indeed perfectly possible. The only restriction here is that if the penult is accented, it must carry a circumflex, not an acute, because the last syllable has a short vowel. This rules out *Νεμαύσος. Conversely, a long vowel in the final syllable would require an acute on the penult (if accented).

    Despite the general tendency to retract the accent in Greek proper names, those -(σ)σος are often oxytone: Παρνασσός is a good example.
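    The restriction on the penult described above can be sketched as follows. The function name is hypothetical, vowel lengths are passed in by hand, and it covers only the choice of mark on an accented penult, not where the accent falls.

```python
def penult_accent_mark(penult_long, ultima_long):
    """Mark required on an accented penult.

    A long penult before a short final vowel takes the circumflex;
    in every other case the accented penult takes the acute.
    """
    if penult_long and not ultima_long:
        return "circumflex"
    return "acute"


# Ne-mau-sos: long penult (diphthong), short final vowel.
print(penult_accent_mark(penult_long=True, ultima_long=False))
# Same penult before a long final vowel.
print(penult_accent_mark(penult_long=True, ultima_long=True))
```

    The first call is the case that allows Νεμαῦσος and excludes an acute on the same syllable.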

  132. John: The Dreimorengesetz only applies to penult accent. Proparoxytones can’t have a circumflex accent, but a long penult doesn’t prevent accent retraction. There are plenty of nouns accented like ἄνθρωπος.

  133. Come to think of it, the least likely accent pattern (at least in Attic) is Νεμαῦσος, since original circumflexed paroxytones with a light antepenult underwent an accent retraction known as Vendryes’ Law:

    Common Greek ἐρῆμος > Younger Attic ἔρημος

  134. Hats off for Hat’s Sprachgefühl!

  135. Oh, no, it’s Hat’s γλωσσονόησις (a Vendryes effect intended).

  136. David Marjanović says:

    The Classical Greek accent was pitch, not stress.

    Of course it was stress; and, like in most languages, the stressed syllable got a higher pitch by default. The only complication is that stressed long vowels and diphthongs could have the expected high pitch, but also a falling pitch (as if interpreted as a sequence of a stressed and an unstressed vowel, rather than as a single phoneme); this was phonemic (there were words differing only in this feature), and that made Ancient Greek a (somewhat marginal) pitch-accent language not unlike modern Norwegian, Swedish, Japanese and Shanghainese.

  137. “Of course”? Mighty strong words, considering that every source I’ve read says the opposite. A trawl through Google Books provides this representative quote: “The predominant accent of classical Greek was one of pitch rather than one of stress (until about the fourth century C.E., it had probably become a stress accent like that of Modern Greek).” I haven’t found a single source so far that says it had a stress accent. How exactly would that work, since the meter of poetry depends on an entirely different ictus pattern? Do you envision two different kinds of stress, one reserved for poetry? Inquiring minds want to know.

  138. David Marjanović says:

    I haven’t found a single source so far that says it had a stress accent.

    They all assume a false dichotomy: is there any language where, by default (intonation can override everything), stressed syllables are really just louder than unstressed ones and don’t receive any different pitch? I’m not aware of any. Even in French, where stress isn’t even a word-level feature but predictably goes at the end of whole… phrases or utterances or something, the stressed syllables receive a higher pitch by default. Modern Greek, too, makes stressed syllables higher than unstressed ones by default; Swiss German is one of the few exceptions that use a lower one instead. In other words, all languages I’ve heard are pitch-accent languages in this sense.

    Ancient Greek had two kinds of stressed long syllables that differed in unpredictable, phonemic pitch contour. This makes it a pitch-accent language in this sense. It does not mean it didn’t have stressed syllables.

    (The pitch of short syllables was entirely predictable: high by default, spelled out with the acute accent, and low when the word as a whole was relatively unstressed within a phrase, spelled out with the grave accent.)

    the meter of poetry depends on an entirely different ictus pattern

    I thought poetry works like in Latin, where stress is summarily ignored and only length counts?

    In Latin, of course, stress wasn’t phonemic, so no information was lost by this poetic convention; the same holds for Hungarian (where hexameter has caught on) and French (where the work of Georges Brassens is not considered inherently ridiculous in spite of sometimes stressing otherwise silent letters!). But Serbocroatian is a full-blown pitch-accent language with four different kinds of stressed (or, sometimes, formerly stressed) syllables, and yet at least one traditional nursery rhyme and at least one almost traditional kolo happily stress the same word two different ways in the same stanza – unthinkable in German or English or AFAIK Spanish.

  139. is there any language where, by default (intonation can override everything), stressed syllables are really just louder than unstressed ones and don’t receive any different pitch?

    American English, certainly.

  140. David Marjanović says:

    I don’t think so. But I’ll be in the US in about a month and a half…

  141. I thought poetry works like in Latin, where stress is summarily ignored and only length counts?

    You mean: pitch is summarily ignored and only length counts. I continue to find your rejection of pitch as what is denoted by the Greek accent system bizarre and unconvincing.

  142. David Marjanović says:

    I continue to find your rejection of pitch as what is denoted by the Greek accent system bizarre and unconvincing.

    …I said no such thing. I’m saying that pitch was predictable from stress for short vowels. I’m well aware that, when the accent system was invented, an accent was placed on every vowel letter (or the last of a diphthong), and the removal of – always grave – accents from unstressed syllables came later.

  143. My apologies for misinterpreting your position, but when you start off responding to “The Classical Greek accent was pitch, not stress” with “Of course it was stress,” you’re sort of asking for misinterpretation. “Well, actually it was a combination” might have been clearer.

  144. marie-lucie says:

    David: French (where the work of Georges Brassens is not considered inherently ridiculous in spite of sometimes stressing otherwise silent letters!)

    A few relevant points:
    – Georges Brassens was from the Midi (Southern France), on the Mediterranean, where the local French reveals its Occitan substrate and the “silent letters” (namely the schwa) of Standard French are usually pronounced. Even though his Southern accent was not very strong (and had probably weakened from the years he spent in Paris), it was still recognizable, and part of it was his habit of pronouncing more schwas than most Northerners.
    – Even in “Parisian” French, speakers have all attended school where teachers (especially in the early grades) spoke slowly and emphasized the schwa under conditions when it could be pronounced in formal, especially poetic, register.
    – The rules for song words are less strict than the ones for classical verse, since they must adapt to the music (and the opposite is also true).

    Brassens, a poet as much as a songwriter, was very skillful in playing with the expanded possibilities that his non-Parisian pronunciation afforded him.

  145. David Marjanović says:

    My apologies for misinterpreting your position, but when you start off responding to “The Classical Greek accent was pitch, not stress” with “Of course it was stress,” you’re sort of asking for misinterpretation. “Well, actually it was a combination” might have been clearer.

    Sorry. I was trying to respond to the position that “not stress” is even a thing.


