FRAMEWORK-FREE LINGUISTICS.

A reader sent me a link to Martin Haspelmath’s 2008 paper “Framework-free grammatical theory” (pdf), which appeared last year in The Oxford Handbook of Linguistic Analysis (a steal at under $300!). Having been out of the field for decades, I have no idea how his ideas fit into what professional linguists are now doing and thinking, but I like them very much; they echo what I’ve been saying since my grad school days:

Most linguists seem to agree that we should approach any language without prejudice and describe it in its own terms, non-aprioristically, overcoming possible biases from our native language, from the model of a prestige language (such as Latin or English), or from an influential research tradition (such as that of Donatus’s Latin grammar, or Chomsky’s generative grammar). I argue that this is absolutely essential if we want to come even close to doing justice to our research object, and that moreover any grammatical framework is precisely such a “prejudice” that we want to avoid. Frameworks set up expectations about what phenomena languages should, can and cannot have, and once a framework has been adopted, it is hard to free oneself from the perspective and the constraints imposed by it. What we need instead is the researcher’s ability to discover completely new, unexpected phenomena, to detect previously unsuspected connections between phenomena, and to be guided solely by the data and one’s own thinking.

I would be most interested in the reactions of any linguists in the crowd (and, of course, in those of others as well).

Comments

  1. I had Matthew Dryer for Typology and Syntax in grad school. He used the term “basic theory” and it’s essentially the same as Haspelmath’s “framework-free” idea. I definitely preferred it pedagogically. I would not have been happy to be constrained by a theory when learning linguistics.
    It’s also worth noting that computational linguists have always been essentially framework-free too. The same could be said of phonetics, pragmatics, most of linguistics really.

  2. I’m delighted to hear it!

  3. The same could be said of phonetics, pragmatics, most of linguistics really.
    Just say what we all think, Chris: it’s the syntacticians, they messed it all up 🙂
    I read this paper a while ago and my first reaction was relief: my training in linguistics (such as it was) did make a point of avoiding any particular theoretical framework, but I constantly found myself the odd man out, a heathen among fervent believers.
    That quote by Givón in the conclusion is right on the money.

  4. I don’t know, papers like this always irk me a bit. The author seems to act like every theoretical linguist is doomed because they have adopted a theory to study. But look at how much syntactic theory (even sticking to the Chomskyan tradition) has changed in the past 40 years. And why has it changed? Because of new evidence from new languages. I think that theory-centric work, with an eye on comparative and typological data, is very enlightening. If the field as a whole goes to a “framework-free” mode, then soon enough, people will start to see patterns cross-linguistically, and new theories will emerge to explain these patterns. The author hints at this–he says after reading 20 “framework-free” grammars, you will begin to see patterns. Well, shouldn’t something be done to explain these patterns? As a linguist I say yes, and theoretical models are a fine choice for that purpose.
    I do agree that some people do get married to a theory and are too close-minded to properly analyze new and interesting data, but most aren’t, and the theory will change based on this new evidence.

  5. Also, to add some more information, I think the “framework-free” model is very good, if not essential, for describing new languages and writing grammars. But that doesn’t mean theoretical frameworks can’t then be applied to this data. (Or that this data can’t then mold theoretical frameworks.)

  6. I (a linguist and specifically a syntactician) find these arguments unpersuasive. The article proceeds by pointing out flaws of various “framework” analyses, but the proposed replacements are no better. In fact, they recapitulate the deficiencies that led to generative grammar theories in the first place. (Why does the German prefield have one constituent in most circumstances and zero in polar questions, and not 3 usually but sometimes 2? Why does Tagalog show signs of successive-cyclic wh-movement, if embedded clauses are just determiner + verb + arguments? Etc.)
    I cannot imagine a physicist publishing an article saying, “We have no chance of discovering the Higgs Boson, and even if we did it would not solve the problem of where gravity comes from. So we ought to close CERN and go back to tabulating meticulous measurements of planetary motion.” And it seems that this kind of retrenchment is what is proposed for linguistics.
    That said, I think it is interesting to compare syntax and semantics in this regard. Chomsky is still with us and has (for better or for worse) been the guiding light of the field for over 50 years. Semantics’ analogue of Chomsky (Montague) died before he could revise his theories once, let alone 3 or 4 times. And, while semantics has its few malcontents, the central tenets of Montagovian thought are accepted to a large degree. I am less familiar with the phonology literature, but my impression is that they behaved mostly like semanticists until the introduction of OT, and now behave relatively more like syntacticians.

  7. 1. It’s easy to see how Haspelmath’s work leads into the “framework-free” approach. One of his main projects is being one of the editors of the World Atlas of Language Structures (also featured earlier on languagehat). It’s easy to see limitations in aprioristic theories when you’re searching precisely for the most “weird” features of a large number of languages.
    2. For a more specific example see this paper where he challenges the notion of “word” (therefore the morphology/syntax distinction) as a linguistic universal. Of course, if you’re disputing words you might as well deny word classes.
    3. I’m only just starting out in linguistic historiography, but if we’re wont to give -isms to everyone, could we call Haspelmath (to his chagrin) a neo-descriptivist?
    4. What I’m most curious about this paper is the claim that “there are many linguists who carry out theoretical research on grammar but do not work within a theoretical framework”. This is something that’s simply not mentioned in my education; from what we learn one gets the impression that either you spend your life polishing the minutiæ of Chomskian deep structures and transformations, or you aren’t a real linguist. (And likewise, any utterance that doesn’t fit the current generative model of a given language is “performance” and should be ignored—often the very utterances I find most interesting.)
    For example, I was interested in a better linguistic understanding of Japanese than that of gakkō bunpō, and Jim Breen kindly pointed me to Martin’s reference grammar. My generativist friend quickly dismissed it because it’s framework-free. I’ve been browsing the thing on occasion (it’s huge) and it’s easily the most interesting & illuminating thing I’ve ever read on Japanese. Why didn’t anyone tell me people can do work like that?
    I don’t mind framework-bound research; I do mind the attitude that our particular framework is the correct one and therefore all problems outside its area of interest are nonproblems.
    5. Haspelmath: «If there are no frameworks, then what should I teach my students in syntax classes? My answer is: The best syntax class is a field methods course, and the second best syntax class is a typology course. If we want to understand the nature of syntax, we have to study the syntactic patterns of concrete languages, preferably unfamiliar languages, to broaden our horizons.»
    As a student, all I can say is I’d love if the local faculty shared this view.

  8. For a non-Chomskian example of what I’m saying about restricting oneself to the problems that interest a framework, consider Berlin-Kay’s hypothesis on color naming. Notice how eagerly Lakoff relies on it to support cognitivist principles, then see the way the hypothesis was widely contested by people working with particular languages; like Lyons citing an anthropologist specializing in Hanuno’o, McNeill citing native Japanese sources on color terminology, and so on (just search around for Berlin-Kay). My point is that it’s too easy to not see or dismiss “quirky” anomalous facts when you’re interested in validating a paradigm; therefore, having fact-oriented, framework-defying researchers around is healthy and should be encouraged—particularly so in a science like linguistics, where (unlike physics or medicine) a dozen paradigms coexist without clear empirical evidence of which is the best one.

  9. This is something often discussed in relation to linguistic fieldwork and documentation. One of my favourite papers on the topic is David Gil’s ‘Escaping Eurocentrism: fieldwork as a process of unlearning’, in Newman and Ratliff’s excellent ‘Linguistic Fieldwork.’ (Daniel Hieber has a nice summary of the topic here http://danielhieber.com/2011/08/17/escaping-eurocentrism-in-language/)
    Regardless of one’s theoretical position, if you start working with a new language and expect to find similar patterns as in other languages you’re going to throw yourself off at some point!

  10. I’ve been out of the field for about as long as Hat, but I did find this a little odd (from Nick’s comment):
    And why has it changed? Because of new evidence from new languages.
    My feeling about early Generative Grammar is that it changed because people kept getting more theoretically ambitious in describing English, not in describing other languages. The one that sticks in my mind to this day was the proposal to derive ‘kill’ from ’cause to die’, which seems to have been from Lakoff. It was a completely goofy suggestion and was based solely on English. Even the most basic evidence from foreign languages should have been enough to scotch it before it was published.
    Perhaps I’m thrashing a very long-dead horse, but this is the sort of thing that put me off generative grammar. I wanted to learn about language, not about linguists with Lego sets.

  11. Is a ‘framework’ what I would call a ‘model’? If it is, I’m strongly on the side of having one, or, at least, seeking one. Here are a few reasons:
    1) The real choice is not between having a model and not having a model–it’s between having a model you’re aware of and having a model you’re not aware of. I’ll add the obvious disclaimer– one should hold a critical attitude towards any model. And, by the way, pure empiricism is a model and it doesn’t work.
    2) There’s a distinction between models and heuristics, and one needs to be aware of both. If you don’t have a model, then heuristics aren’t even in your field-of-view. An example of a heuristic is “Make the strongest possible assumption that doesn’t lead to a contradiction.”
    3) To me, having a model means being aware of the history and happenstance in your field of study. Now, I know that for some people having a model means imagining that people who disagree are consigned to an auto-da-fe. That’s -not- what I’m talking about.

  12. Aaron: It seems to me rather that The Physics Analogy in this case would be something like: Our best model of reality can’t explain things like dark matter or the accelerating universe, and we have trouble reconciling quantum models with gravity, so we should encourage a healthy skepticism of the current paradigm, probe its limitations experimentally, and actively search for alternative theories. Which are all things physicists do.
    What’s more, they are exceedingly cautious about deep abstractions that get too far from falsifiability (so that string theory is widely distrusted). Physicists can be noncommittal to the extreme; while Feynman once brilliantly explained electromagnetic fields as “made of” photons, current physicists prefer to say something vague about how electric fields “can be described by” “virtual” photons (despite the empirical fact that photons are the carrier particle of electromagnetic energy). Their point (I suppose) is keeping in mind Korzybski’s “the map is not the territory”.

  13. It sounds to me like the main gripe here is not “frameworks”, but “bad science”. It’s important to keep an open mind when looking at data, and to be willing to admit that you’re wrong if that’s what the data implies. I would guess that most sciences have had periods where they weren’t so good at this (certainly the behavioral sciences have), and I know Linguistics has had its share.
    But as Nick says, don’t we want to explain the patterns we find? What is that if not a framework or a theory? (Can you tell that I’m still an idealistic young grad student, yet? If not, you are hereby notified. 🙂 )
    That said, I’m not studying syntactic theory, but the acquisition of (first language) syntax. One of the reasons for that is that the tweaking of particular transformations to account for particular sentences started to seem misguided to me – and definitely didn’t inspire me as a possible career path. To me, what’s amazing about syntax is that you can get so far with reasonably simple, mathematical structures: say, binary trees, a handful of grammatical categories, limited transformations* and some assumed lexical knowledge about how particular words behave. Sure it doesn’t explain every quirk of language, but why should language be so structured in the first place? Why don’t we just string together content words and hope context will indicate their relationship? Context is a powerful cue. Is it something about how we learn language? Something about how our brains process it when we use it as adults? (This is why I get excited about Psycholinguistics)
    *Ok, transformations are extremely powerful, but still, you can limit them pretty severely, and still describe a lot.

  14. @Nick,
    after reading 20 “framework-free” grammars, you will begin to see patterns. Well, shouldn’t something be done to explain these patterns?
    AFTER is the operative word here and the whole point. You first study the subject and then make theories.
    @Aaron,
    The article proceeds by pointing out flaws of various “framework” analyses, but the proposed replacements are no better.
    Really? This is what Haspelmath suggests:
    “What we need instead is the researcher’s ability to discover completely new, unexpected
    phenomena, to detect previously unsuspected connections between phenomena, and to be
    guided solely by the data and one’s own thinking.”
    How’s that not better than a blind adherence to whatever theoretical model du jour one subscribes to?
    And if you really want a comparison with physics, how about “Say, fellas, that phlogiston thing really doesn’t work when you think about…”?

  15. J. W. Brewer says:

    It seems wrong a priori (as it were) to condemn using a “prestigious research tradition” as a starting point for the sorting/interpretation of data from a hitherto unanalyzed or underanalyzed language. If Sprachwissenschaft were doing well as an empirical science, a research tradition would be viewed as prestigious precisely because it was a demonstrably useful tool in such situations. Now, perhaps the real problem is that the actual “frameworks” currently on offer aren’t actually very good, and have acquired their prestige for other reasons. But otherwise this seems like an argument that a botanist should describe a newly-discovered flower “on its own terms” without reference to any preexisting knowledge base of accumulated understandings of how previously-studied flowers typically work and what sorts of variations over what ranges have previously been observed. That seems like some sort of weird ascetic desideratum that would entail a lot of wheel-reinvention. And of course it’s difficult to do useful typological/comparative work if each language has been described by researchers who deliberately placed themselves in some sort of pristine isolation chamber.

  16. My problem with this view is that it’s essentially meaningless. *Everything* in linguistic analysis is a theoretical construct, from the way we write the examples (IPA anyone? Phonemes?) to words to paradigms to discourse. I have no problem with expectations about what languages can and cannot have, though I would prefer the term ‘hypotheses’ — hypotheses are testable, and recognising good hypotheses and testing them is important. But you can’t do that unless you know a lot about how we think language works.

  17. I know nothing of linguistics but I do know a bit about mathematical modelling. Several times in the past I have cut off someone who was saying something to the effect of “Models are all very well, but in the real world…” by remarking that “the real world” was one of the most unsatisfactory models around. I think this places me with Claire, and with Mattf’s point (1).
    Mind you, I can imagine that Linguistics might be burdened with doctrines that might obscure as much as they reveal. The great thing with math models is to be acutely aware of their limitations – it is very common among mathematical modellers to make risibly high-falutin’ claims for their work. Scoundrels, all too many of them.

  18. If Sprachwissenschaft were doing well as an empirical science, a research tradition would be viewed as prestigious precisely because it was a demonstrably useful tool in such situations.
    But it’s not, except in the sense that Marxism-Leninism was “doing well as an empirical science” in the USSR. As long as you can get away with endless tweaking of your epicycles, I mean transformational rules, to make everything fit the sacred framework, everything’s hunky-dory. Since planes don’t fall out of the sky as a result, there’s no perceived need to change the situation, which after all brings tenure and professional respect to all concerned.
    Now, perhaps the real problem is that the actual “frameworks” currently on offer aren’t actually very good, and have acquired their prestige for other reasons.
    Bingo!

  19. Claire,
    *Everything* in linguistic analysis is a theoretical construct
    I’m not sure I understand what you’re saying correctly, but if I do, then let me just point out that we’re talking about two different things: first, we have a metalanguage that gives names to observable phenomena and relationships between them, even if the boundaries between the phenomena are not always that clear and one term may not be universally applicable. And then there is a metalanguage (or metalanguages) that invents concepts based on its limited understanding of one part of the field and then seeks to impose them universally. In other words, there is a world of difference between talking about nouns, tenses or consonants and talking about transformations, optimality or movements.
    I have no problem with expectations about what languages can and cannot have
    I do, because it goes against what science is all about. Also, of what use would this be?

  20. J. W. Brewer says:

    Speaking of perhaps-now-superseded epicycles, I opened a box of old books today that had for some reason been in my office rather than home for the last umpty-ump years and came across the texts I was compelled to purchase for an introductory syntax class in the fall of 1985: C.L. Baker’s Introduction to Generative-Transformational Syntax and Andrew Radford’s Transformational Syntax: A Student’s Guide to Chomsky’s Extended Standard Theory. I would say “the horror, the horror,” except I think my grade in the class reflects that I probably didn’t really read either book.

  21. I’m sympathetic to the idea of approaching new languages — and for that matter familiar ones — without (clinging to) preconceptions.
    But there’s another danger here, which anyone who has looked seriously at more than a few grammars will recognize. It’s common to find that the same phenomenon is described in superficially quite different ways, to the point that it takes quite a lot of work to see the relationship. This is just as likely to happen because different preconceptionless authors develop their ideas in random directions, as because different ideologically-committed authors bow to different theoretical idols.

  22. michael farris says:

    “I have no problem with expectations about what languages can and cannot have
    I do, because it goes against what science is all about. Also, of what use would this be?”
    In fieldwork kinds of situations the fewer preconceptions the fieldworker has the more likely they are to find and understand the really typologically interesting parts of the language.
    I can’t imagine any real linguist believing in theoretically unlimited diversity but it’s a good heuristic guide when faced with a new language.
    In the field methods classes I’ve taken (and led) I found a basic pattern: Something I thought I knew about the language ahead of time would lead me astray and make things more difficult to understand (and make me feel like a big idiot).
    The field methods language that went most easily was the one I knew least about (there were the usual kinds of difficulties and unresolved issues, but I never had the feeling that I’d wasted a lot of time chasing after the wrong goat, as it were).
    Of course lots of kinds of linguistics need to work within a framework but no single framework is going to work for everything (or every language).

  23. To second bulbul’s point, I think one has to distinguish between frameworks as notational conventions, which often bury their durable data beneath a shadowy canopy of notational ephemera, and frameworks of widely accepted analytical types and terms, which make novel data decipherable and comparable to what others already know.
    I remember attending a fellow PNG fieldworker’s dissertation defense before I had finished my own. He had produced a 600-page grammar of a previously undescribed language but was taking flak from a theory-peddling professor for being too eclectic in choosing which theoretical devices to employ when trying to elucidate various phenomena. The professor (who refused to sign the dissertation) felt it more important to tweak some passing ostensibly universal framework toward perfection for his tiny cadre of disciples (and to one-up his theoretical opponents) than to make the description more useful for a more diverse audience in later generations. I was incensed, and made a lifelong enemy by speaking up and defending eclecticism in grammatical descriptions. He was especially pissed that I characterized his theory as “fly by night” when compared to a living language.

  24. Healthy empiricism is always a good thing. I’d had tremendous misgivings about the current state of linguistic theory until I was drawn to corpus linguistics, with its goal of explaining actual language in actual use.
    Then again, data by itself has no more scientific insight than words by themselves have meaning. Science is a matter of active interpretation, and therefore involves building a model (although, as MattF rightly says, it can be a model you are perfectly unaware of).
    I find it surprising that this particular horse is being beaten back to life after the tremendous empirical work science and technology scholars have done to bury it. It is true that frameworks set up expectations, but without those expectations we would have no idea of where to look for meaningful patterns. If the weight of amassed evidence forces us to review some aspect of the framework (or even the whole of it) every so often, I don’t see any problems.
    Haspelmath’s arguments make me think of a person who, arguing that sight yields only a partial image of the material world, thinks it’s just as useful to walk about with eyes closed.
    (Nota bene: none of this should be construed as an endorsement of Chomskyan syntactical theory, which I believe has outlived its usefulness and become a dangerous and bothersome zombie. But I think the problem lies in how the theory has been taught and used, not least by the old man himself, than in its merely being a theory. In the words of Robin Lakoff, generativism means “accepting the impossibility of saying almost everything that might be interesting, anything normal people might want or need to know about language”. But that’s not true of every theory, and in fact people like Michael Stubbs go to great pains to articulate the speaker’s experience of the language with theoretical systematisations.)

  25. (Cross-posted at Jabal al-Lughat.)
    I’ve been reading Martin Haspelmath’s other papers, and I found one of them on the European Sprachbund [PDF scanned sideways] very interesting. This one’s on equatives and similatives, but it’s the Sprachbund itself, which he calls “Standard Average European” in a hat tip to Whorf, that really gets my attention. There are a couple of other papers, not online, that apparently discuss the idea.
    The core languages of the Sprachbund are Romance, Balto-Slavic, West Germanic, and the languages of the Balkan Sprachbund. The periphery includes North Germanic, Hungarian, Finnic, Armenian, and Georgian (perhaps because of Greek influence on the last two?). English and French are on the boundary between core and periphery. The weird languages of Europe — Celtic, Basque, Maltese, Turkish, the other Uralic languages, and the languages of the Caucasus — are definitely excluded.

  26. John,
    have you read Heine and Kuteva’s “The Changing Languages of Europe”?

  27. Huh. Here’s the description of the book from that publisher page:

    This book shows that the languages and dialects of Europe are becoming increasingly alike and furthermore that this unifying process goes back to Roman times, is accelerating, and affects every European language including those of different families such as Basque and Finnish. The unifying process involves every grammatical aspect of the languages and operates through changes so minute that native speakers fail to notice them. The authors reveal when, how, and why common grammatical structures have evolved and continue to evolve in processes of change that will transform the linguistic landscape of Europe.

    Sounds… interesting, but I’m always dubious of these groundbreaking, earthshattering new theories. Are people taking it seriously?

  28. Oy vey. Ignore that description: besides being way too marketingy, it is even slightly misleading. The “every grammatical aspect” bit, for example, is almost certainly not what the authors are saying. In fact, they focus on a small number of structures (e.g. articles, possessive perfect, comitative vs. instrumental) and analyze their history and spread in quite some detail. And as for the earthshatteringness, they are quite reserved and skeptical/critical of real grandiose theories like Euroversals (features unique to SAE languages) and Europemes. I do have my reservations/criticisms*, but all in all, it is a rigorous work on typology and historical linguistics and the ideas in it should be taken seriously.
    *For example, there’s this map on page 119. Anybody venture a guess why it made my blood boil?

  29. Perhaps because the authors chose a map of such small scale that it didn’t allow them to fit horizontal stripes on to the Faroe Islands? I’m with you on this, graphically it’s a disaster.
    I don’t know if you noticed, but they claim to have no info on Slovakia & Czechia. What a thing to have to admit in a book about European languages. I’d want some money back.

  30. I don’t know if you noticed
    I did, hence the reference to the temperatures inside my vascular system.
    What a thing to have to admit in a book about European languages.
    Right? Right??? To be fair, they had some Czech and Slovak examples in the chapter on possessive perfectives (one for each), but the Czech was horribly misspelled.
    Still, a great book. Just needs some additional work.

  31. marie-lucie says:

    Bulbul, I was shocked to see those gaping holes in the map, in the middle of Europe.
    On the other hand, even though I consider myself a linguist, I must confess to total ignorance about “possessive perfectives”. Bulbul, can you enlighten me?
    Graphic design: is the map all black and white, or is there some colour in the published version? the way it shows on my screen, with the exception of “stage 1” I cannot tell the key squares from each other.

  32. Bulbul: the fact that they have “no information” on Moldavia either, despite the language there being (basically) Romanian, which they seemingly do have information on, is even odder (unless they lacked information on the impact upon the non-standard Romanian of Moldavia of Russian, which indeed might have an effect on the syntax of articles). I read the book some time ago and had the impression that, interesting though it was, it had been researched/written in a bit of a hurry: the map confirms that impression.

  33. J. W. Brewer says:

    “Little is known of the grammar of the natives of the mysterious Hermit Kingdom of Czecho-Slovakia. Ever since the powerful shogun Tomasu Masaryoku expelled all foreign missionaries from its shores, restricted international trade to the Dutch East India Co.’s meager toehold in Bratislava harbor, and ordered summary capital punishment for anyone found engaged in linguistic fieldwork, typological scholars with an instinct for self-preservation have found other topics for their research.”

  34. m-l,
    To quote the book:
    “What we call the possessive perfect has been referred to by a wide range of labels, such as possessive construction, ‘have’-perfect, Romance perfect, stative perfect, resultative past, resultative perfect, perfect II, and the like.”
    While Slavic languages use the be-perfect, they also display a type of possessive perfect structure, so:
    Napísal som.
    COMPL-write-PAST.PART I.am
    ‘I have written’
    vs.
    Mám napísané.
    I.have COMPL-write-PASS.PART
    ‘I have written’
    There are pragmatic and possibly stylistic differences that need a good looking into, but it’s definitely an areal feature. I blame the Germans.
    M

  35. Etienne,
    That is a very charitable explanation. My suspicion, based on their choice of examples and references, is that this is one of those broad comparative studies where the authors are pretty good with one group of languages (both practically and theoretically), but not that good with others. Happens a lot, unfortunately.
    This shouldn’t distract from the overall value of the work, it’s just something that should be remedied in further research.

  36. David Marjanović says:
    after reading 20 “framework-free” grammars, you will begin to see patterns. Well, shouldn’t something be done to explain these patterns?

    AFTER is the operative word here and the whole point. You first study the subject and then make theories.

    And then you test your theories. How? By studying more of the subject – in many cases that’s going to mean more languages – and looking if your theories apply to them.
    Indeed, for this it doesn’t even matter how you formed your theories in the first place, whether by looking at the data and waiting for a pattern to emerge (induction) or by wishful thinking or whatever. The critical part is the test.
    (Basic science theory. Almost never taught. Somehow, scientists are expected to absorb it by osmosis.)

    *Everything* in linguistic analysis is a theoretical construct, from the way we write the examples (IPA anyone? Phonemes?) to words to paradigms to discourse.

    Hey, it’s only science.
    🙂

    But there’s another danger here, which anyone who has looked seriously at more than a few grammars will recognize. It’s common to find that the same phenomenon is described in superficially quite different ways, to the point that it takes quite a lot of work to see the relationship. This is just as likely to happen because different preconceptionless authors develop their ideas in random directions, as because different ideologically-committed authors bow to different theoretical idols.

    QFT.

    For example, there’s this map on page 119. Anybody venture a guess why it made my blood boil?

    *Picard & Riker double facepalm*
    Horror. They took a political map, added about 10 extra lines to it, and then added the hatching within 10 seconds. And I do hope it’s in color somewhere.

    “Little is known of the grammar of the natives of the mysterious Hermit Kingdom of Czecho-Slovakia. Ever since the powerful shogun Tomasu Masaryoku expelled all foreign missionaries from its shores, restricted international trade to the Dutch East India Co.’s meager toehold in Bratislava harbor, and ordered summary capital punishment for anyone found engaged in linguistic fieldwork, typological scholars with an instinct for self-preservation have found other topics for their research.”

    Day saved!

    There are pragmatic and possibly stylistic differences that need a good looking into, but it’s definitely an areal feature. I blame the Germans.

    That would explain the existence of have-pasts in Czech and Slovak (do they exist in any other Slavic languages?), but not any difference in usage. German isn’t English, where “I wrote” and “I have written” don’t mean the same thing; and while (as in older English) both “have” and “be” are used, which one is used depends on the verb (again as in older English) – it’s not possible to use both with the same verb as in your examples.
    While I am at it, who invented the have-past, and how often? It’s present today in English and German, but Wikipedia says it only appeared near the end of Old High German (so let’s say around the year 1000); was it imported from Romance twice separately, or what? – I recently found “[…] habent instituta supplicia” in De Bello Gallico, in a context where it’s impossible to translate as anything but “they have instituted punishments”, it’s not ambiguous like “Caesar urbem occupatam habet” (which could be “Caesar holds the occupied city”).

    Why does the German prefield have one constituent in most circumstances and zero in polar questions, and not 3 usually but sometimes 2?

    What does this mean? Could you give me examples?

  37. michael farris says:

    “have-pasts in Czech and Slovak (do they exist in any other Slavic languages?)”
    I think they do in Polish. Some examples from Google (note: the passive participle agrees with the object of have, unlike the old active participle ending in -ł that agrees with the subject and which forms the most common/only past tense in most modern Slavic languages):
    Mam napisaną pewną piosenkę.
    I have written certain song. I’ve written a song.
    (or, more awkwardly, I have a song written by me).
    Mam kupiony telefon w erze.
    I have bought phone in Era(company) – I’ve bought an Era telephone.
    Film mam obejrzany do ostatniego odcinka
    Film I have seen to last episode – I’ve seen up to the last/most recent episode of the series.
    Masz odrobione lekcje?
    you have done lessons – Have you done your lessons/homework?

  38. michael farris says:

    I’ll just add that all the sentences I quoted could just as easily be expressed with the regular past tense (alternating the gender of the subject randomly here):
    Napisałem pewną piosenkę
    Kupiłam telefon w erze
    Obejrzałem film do ostatniego odcinka
    Odrobiłaś lekcje?
    To me (non-native speaker) there’s a definite stylistic/pragmatic difference between the two forms but I have little idea what it might be. I think one partial difference is the forms with the past tense simply describe actions while the first three forms with ‘have’ point toward the future. The first three clearly have an element of ‘what now?’ about them. The last also suggests that finishing the homework is a condition to doing something afterward. But I could be wrong.

  39. David Marjanović says:

    Interesting.

    the first three forms with ‘have’ point toward the future. The first three clearly have an element of ‘what now?’ about them. The last also suggests that finishing the homework is a condition to doing something afterward.

    Curiouser and curiouser! Are they still a bit like “so, now I hold that finished letter in my hands, and what should I do with it now”?
    Because that would explain Caesar urbem occupatam habet quite nicely. And it makes habent instituta supplicia imaginable as “they have well-established punishments at their disposal to deal with such situations”.

    note: the passive participle agrees with the object of have

    From Italian, this was apparently reintroduced into French, but only when the object is a pronoun/clitic: j’ai écrit les lettres vs. je les ai écrites.
    In German, the participle agrees with “have” itself: lacking any gender/number/case ending, it looks like an adverb. Even English, where the same lack of an ending is precisely what makes it look like an adjective, doesn’t go that far.
    =========
    Do Polish, Czech or Slovak ever form pasts with “be” and the (passive) past participle?

  40. Bulbul: *sigh* Two linguists discussing linguistics…obviously understanding one another will be difficult, as linguists can’t communicate…what was my charitable explanation? The one explaining why Moldova is as blank as the Czech Republic and Slovakia, or my claim that the flaws of the book are due to its having been written in a hurry?
    David: There is an Indo-Europeanist, Bridget Drinka, who has argued that the “have”-perfect first arose in Greek and thence spread to Latin, and from there to (inter alia) Germanic, and presumably from German to Czech, Slovak and Polish (and I believe Sorbian also has a “have”-perfect). I heard the argument at a talk of hers, nearly a decade ago, and thus am unsure where she has presented this in print.
    David, Michael: the Polish contrast sounds very similar to the contrast in French between J’AI FAIT QUELQUE CHOSE “I did something” versus J’AI QUELQUE CHOSE DE FAIT “I have a thing which has been done” (So now what happens?).

  41. michael farris says:

    David: I’ve heard of something like that (including ‘passive’ past participles for intransitive verbs) but it was about either some dialect of Kashubian and/or a local Polish variety in/near Kaszubia. But details are hazy (it was a long time ago).
    In mainstream Polish those would be present tense adjectives and/or present passives.
    Etienne: interesting, though bear in mind I’m not a native speaker and I could easily be wrong.
    Also, as far as I know there’s no conscious feeling of have + passive participle being a tense in any formal sense. I don’t remember seeing them in teaching grammars (or reference grammars for that matter though it’s been a long time since I looked at a Polish reference grammar).
    Since I work with a bunch of Polish linguists I guess I could ask but they don’t specialize in Polish (or are very, very prescriptive about Polish).

  42. Etienne,
    due to its having been written in a hurry
    That’s the charitable explanation, yes.
    The chapter on possessive perfects cites one of Drinka’s papers.
    David,
    have-pasts in Czech and Slovak (do they exist in any other Slavic languages?)
    Oh yes. You got michael’s examples from Polish, and there are virtually identical structures in Ukrainian, Sorbian, Slovenian… Virtually all Slavic languages have a variety of the possessive perfect. Heine and Kuteva speak of development stages 0–3, where 2 is what you’ll find in most Western European languages (“The construction expresses an event that occurred prior to the point of reference and has current relevance.”). Stage 1 – resultative perfects – is what most Slavic languages have; only North Russian, Southern Thracian Bulgarian and Southwestern Macedonian have stage 2 structures.
    Look, anybody wants, um, full access, you know where to find me.
    not any difference in usage
    Like many things, the Germans introduced it, what we did with it is a completely different thing.
    who invented the have-past, and how often?
    Ha, that’s the question, innit? Those arguing for monogenesis usually go back as far as Ancient Greek (see Drinka) > Latin > Romance > Germanic. Others argue for an independent innovation in Germanic and possibly Northern Russian.
    Do Polish, Czech or Slovak ever form pasts with “be” and the (passive) past participle?
    If by “passive past participle” you mean -tý/-ný participles (as opposed to -l participles), then no, at least for Czech and Slovak.

  43. David Marjanović says:

    There is an Indo-Europeanist, Bridget Drinka, who has argued that the “have”-perfect first arose in Greek and thence spread to Latin, and from there to (inter alia) Germanic

    Huh. Fascinating. bulbul, thanks for the Google Books link; 5 pages are “not part of this preview”, but the rest is fairly convincing on its own! Footnote 12, which explains what an “actional perfect” is and uses the English “present perfect” as its example, once again drives home how unusual German is in lacking such aspectual considerations. So, if the English usage is original, I can easily imagine the borrowing of the whole category of “actional perfect”, together with the way to form it, along a chain of language families. 🙂

    Oh yes. You got michael’s examples from Polish, virtually identical structures in Ukrainian, Sorbian, Slovenian… Virtually all Slavic languages have a variety of the possessive perfect.

    Huh. Textbook Russian and a superficial glance at Serbocroatian have apparently given me a completely wrong impression. 🙂

    Others argue for an independent innovation in Germanic and possibly Northern Russian.

    The “Russian (NW dialect)” example sentence in footnote 23 of the Google Books link looks like a really desperate calque to me…

    If by “passive past participle” you mean -tý/-ný participles (as opposed to -l participles), then no, at least for Czech and Slovak.

    Thanks.

  44. while (as in older English) both “have” and “be” are used, which one is used depends on the verb (again as in older English)
    It’s true that in older English be was heavily preferred for intransitives and that by the beginning of the 19th century, have had mostly taken its place. But during the intervening time, the two were sufficiently prevalent, and with the same verb, that it doesn’t work to say that it’s entirely lexical. There’s something of context or meaning.
    In Modern English, participles used adjectivally lead to “they’re gone” vs. “they’ve gone.” Not analyzing the former as a perfect is somewhat the extreme of the semantic distinction.

  45. One of the core problems involved in determining whether the “have”-perfect is a case of diffusion or of parallel innovations lies in the fact that there aren’t that many languages with a separate verb “to have”, and even fewer of these are diachronically well-attested. Hence it is difficult to establish how “expected”/”normal” the rise of a “have”-perfect is.
    In the specific case of Latin-to-Germanic the strongest argument in favor of diffusion, to my mind, is the timing (I think Meillet was the first to make this point): the earliest known Germanic language, Gothic, has no “have”-perfect, whereas all later Germanic languages do (Runic Germanic does not, but the corpus is so limited that its absence there might be accidental): the fact that other Germanic languages have far more Latin loanwords than Gothic is much more consistent with a case of grammatical diffusion from Latin to Germanic than with a coincidental/accidental rise of a “have”-perfect in both groups.
    My hunch –for whatever that’s worth– is that the North Russian construction, in Europe, is the likeliest instance of a “have”-perfect whose rise is a language-internal matter.

  46. David Marjanović says:

    But during the intervening time, the two were sufficiently prevalent, and with the same verb, that it doesn’t work to say that it’s entirely lexical.

    Oh. I had no idea.

    In Modern English, participles used adjectivally lead to “they’re gone” vs. “they’ve gone.” Not analyzing the former as a perfect is somewhat the extreme of the semantic distinction.

    German not only makes this same difference, it has lexicalized it: sie sind weg vs. sie sind weggegangen.

    My hunch –for whatever that’s worth– is that the North Russian construction, in Europe, is the likeliest instance of a “have”-perfect whose rise is a language-internal matter.

    I said it looks like a desperate calque because it doesn’t even involve a verb “to have”. Russian has such a thing (иметь), but practically never uses it; the example sentence uses the normal workaround, “at … is”.

  47. Haspelmath’s latest paper on framework-free linguistics. He points out that the term Basic Linguistic Theory sometimes means ‘framework-free’ and sometimes does not, not uncommonly in the self-same paragraph.

  48. marie-lucie says:

    Thanks, JC!

    I went back to the beginning and thought that for sure I would have written a comment as I quite agree with Haspelmath. I am one of those linguists who wrote a grammar of an underdescribed language. I had thought that if I needed to use a theoretical framework, “functional grammar” (which had some minor currency at the time) would be a suitable, not too constraining framework, but ended up not using it. Instead my syntactic description was based on a “predicate-argument” structure which works very well for this language and related ones, rather than on the S = NP VP one most popular at the time, even though the latter was still very simple compared to what was to follow.

    I agree that it is not only frustrating but misleading to try to describe a language (or even a feature of this language) according to an existing theoretical framework. Inevitably the linguist has to leave aside, or even fails to notice, some facts which would contradict the framework. Another linguist wrote about one important feature in a language related to “mine”, choosing the language because it was reputed to have this feature, and meanwhile she neglected or misunderstood some facts, including some which would have strengthened her argumentation but that she did not suspect existed. Fortunately I had no such axe to grind.

    Here I did find a comment of mine, about the phrase “possessive perfective” which was unfamiliar to me but referred to the “have” perfect, with examples from Caesar including the one that was discussed here a short while ago, about the Gaulish sacrifices. Following my comment there is a longer discussion including the origin of the construction (probably in Greek) and its diffusion in several European languages.

    Etienne: j’ai FAIT quelque chose vs j’ai quelque chose de FAIT

    It seems to me that although the first example is definitely of a verbal construction (since only a past participle can be used in it), the second one is different since the construction with de can apply to adjectives as well as participles, for instance:
    – J’ai quelque chose de NOUVEAU. ‘I have something new’
    – J’ai quelque chose de JOLI. ‘I have something pretty’
    – Je n’ai rien de PRÊT. ‘I have nothing ready’
    – Je n’ai rien de PROPRE. ‘I have nothing clean’
    – Je n’ai rien de CONVENABLE. ‘I have nothing suitable’
    and many similar ones including with participles:
    – Je n’ai rien d’ÉCRIT sur ce sujet. ‘I have nothing written on the topic’ (by me or others).

    The adjectives and participles here are in prepositional phrases headed by de, not directly in verb phrases.

  49. His example using Tagalog is wonderful. When I made a very tentative attempt to start learning Tagalog many, many years ago, I could not for the life of me figure out how ang was supposed to be used. The analysis he gives explains why! A good reason not to go with old-style grammatical analyses based on Latin.

  50. David Marjanović says:

    Long, long ago…

    I (a linguist and specifically a syntactician) find these arguments unpersuasive. The article proceeds by pointing out flaws of various “framework” analyses, but the proposed replacements are no better. In fact, they recapitulate the deficiencies that led to generative grammar theories in the first place. (Why does the German prefield have one constituent in most circumstances and zero in polar questions, and not 3 usually but sometimes 2?)

    As a paleontologist, I think it’s completely hopeless to try to explain all such features from synchronic constraints alone. “Everything is the way it is because it got that way.” You can’t go from anywhere to anywhere within 1500 years.

    In this particular case, though, there could well be a pragmatic constraint: most sentences have fewer than 5 constituents, so even if you put all but the 2 you need for the bracket into the prefield, you won’t reach 3.

    Now I’ll go back to reading all the papers linked to in this thread…

  51. David Marjanović says:

    Oh, turns out Haspelmath agrees in the first paper (p. 16): “Since all languages have a huge amount of properties that are due to historical accidents and cannot be explained except with reference to these accidents, true explanation in linguistics is restricted to explanation of language universals.”

  52. marie-lucie says:

    I think it’s completely hopeless to try to explain all such features from synchronic constraints alone.

    An example is generative phonology, as for instance in Chomsky & Halle’s “Sound Pattern of English”, where all morphophonemic alternations (eg wife/wives) had to be explained by synchronic rules. It meant that separate classes of words had to be set up, notably for the various methods of noun plural formation. Of course most of those classes are also explainable by the history of the language, including rules that are no longer productive (as in this example) or that are morphological rules belonging to other languages from which specific words have been borrowed (eg bacterium/bacteria). Eventually some linguists dared to suggest that historically-based explanations might be acceptable. I think that SPE is no longer the state of the art. (As I have mentioned a number of times, I stopped worrying about the twists and turns of Chomsky-inspired theories quite a while ago, so I don’t claim to be up-to-date about the topic).

  53. David Marjanović says:

    Also from high above:

    2. For a more specific example see this paper where he challenges the notion of “word” (therefore the morphology/syntax distinction) as a linguistic universal. Of course, if you’re disputing words you might as well deny word classes.

    The paper on denying word classes has moved to here. This trick does not work for the paper on disputing words; I can’t find it.

  54. David Marjanović says:

    One of my favourite papers on the topic is David Gil’s ‘Escaping Eurocentrism: fieldwork as a process of unlearning’, in Newman and Ratliff’s excellent ‘Linguistic Fieldwork.’ (Daniel Hieber has a nice summary of the topic here http://danielhieber.com/2011/08/17/escaping-eurocentrism-in-language/)

    That page no longer exists. Hieber’s chapter is on Google Books, but interestingly it’s not in his online CV, where the very string eur does not occur…

    Two other interesting things are in his CV, though: apparently Chitimacha is being successfully revived, and Hieber doesn’t think it’s related to Toto-Zoquean after all.

  55. David Marjanović says:

    Oh. Oops. Addendum to my comment in moderation: of course it isn’t on Hieber’s CV – what I’ve found is the original by Gil.

  56. @marie-lucie

    Ideally, an SPE-type model should contain Verner’s Law as a synchronic rule to account for was/were (after all, the SPE does have rules that emulate Trisyllabic Shortening and the Great Vowel Shift, velar fricatives in underlying phonological representations, and other medieval stuff).

    I have followed Martin Haspelmath’s work for years and am very much in sympathy with his approach. But then, as a historical and evolutionary linguist, I tend to view language as primarily a populational phenomenon, not the formal model of an information structure inhabiting the Ideal Speaker’s mind.

    @michael farris

    (1) Mam napisaną pewną piosenkę.
    (2) Mam kupiony telefon w erze.
    (3) Film mam obejrzany do ostatniego odcinka
    (4) Masz odrobione lekcje?

    I, personally, might use (3) and especially (4), but not really (1) and (2), perhaps because my native variety of Polish has not been strongly influenced by German. To me, the construction mieć + passive participle expresses the completion of a task. (1) and (2) would be OK if rephrased in a way that makes ‘song’ and ‘phone’ definite:

    (1a) Piosenkę mam już napisaną ‘I have the song already written’.
    (2a) Telefon mam już kupiony w Erze ‘I’ve already bought the Era phone (you know, the one I planned to buy).’

  57. ə de vivre says:

    marie-lucie:
    Surely an English speaker’s knowledge that the plural of ‘wife’ is ‘wives’ and not ‘wifs’ is some kind of synchronic ‘thing’ (regardless of how you formalize it), insofar as any given speaker’s competence is by definition synchronic, isn’t it?

    It sounds like you’re referencing an idea proposed in some later versions of SPE where the order that phonological rules are applied in a given language is a reflection of historical changes in that language. But the learnability problem that presents isn’t unique to universal grammar approaches. If all a baby has access to are the surface forms [waɪf] and [waɪvz], they’re not going to be able to figure out that a historically present vowel is responsible for the form [waɪvz]. Likewise bacterium-bacteria plurals would have to be learned as an irregular pattern just like native ox-oxen or goose-geese. The generativist point (which doesn’t seem to rest on any specifically generativist assumptions) is not that specific linguistic forms can’t be explained in terms of historical development, but rather that they also have to be learnable for each new generation of speakers.

    Or is the claim that evidence from historical language change and borrowing could falsify the existence of some or all universal linguistic primitives? If that’s the case I’m afraid I don’t follow.

    Re: the Gil chapter on Eurocentrism
    I’d be curious to see a quantitative comparison of proposed null elements in generativist grammars of European versus non-European languages. Based solely on my own impressionistic experience, I’m not convinced you’d wind up with more ∅s in non-European languages than in European ones. Regardless of your feelings about null elements, I’m not sure you can blame them on European provincialism.

  58. marie-lucie says:

    Piotr, ə de vivre: I know that SPE and similar treatments are meant to account for a speaker’s knowledge, regardless of historical developments, but in my experience many speakers would like to know why (for instance) English plurals are so varied and subject to so many exceptions, and they are glad to learn the historical reasons for those exceptions.

    I too am a historical linguist, though not an Indo-Europeanist.

    Absolutely. The only explanation why the plural of wife is wives is historical: in the prehistory of Old English voiced fricatives were devoiced word-finally, and voiceless fricatives were voiced medially if flanked by vowels or sonorant consonants. Note that even if we know this rule, we still don’t know if wife originally ended in a voiced or voiceless sound, since in both cases we would get the same distribution of OE allophones. As a matter of fact, the final fricative was originally voiced (as in the modern plural), but synchronically in present-day English we reverse the historical development and say that voiceless fricatives get voiced before the plural ending in some nouns (but not in others, so language learners have to memorise them).

    By the way, the voiced fricatives in loaves, paths and houses reflect the corresponding voiced realisations of medial /f, θ, s/ in OE hlāfas, paþas, hūsas (all of them masculine), but OE wīf was a neuter with an unchanged plural, so the OE word for ‘wives’ was also wīf with a final /f/. The voicing in the s-plural is analogical (transferred either from the plural of masculines like wulf, or from the genitive and dative sg. and pl. of wīf itself) — a Middle English innovation. Of course native speakers today use wives because that’s what they have learnt. They neither know nor care why it’s wives rather than wifes. The voicing has no modern function and no synchronic motivation. It’s just a caprice of history.

  60. Surely an English speaker’s knowledge that the plural of ‘wife’ is ‘wives’ and not ‘wifs’ is some kind of synchronic ‘thing’ (regardless of how you formalize it), insofar as any given speaker’s competence is by definition synchronic, isn’t it?

    Yes, of course the behavior is synchronic. What makes SPE so fundamentally lunatic is the claim that this behavior is dictated by a synchronic rule, that every quirk of English usage is somehow rule-governed. Natural languages simply aren’t as regular as that, and the claim that they are involves manufacturing increasingly specialized and gerrymandered “rules”.

    This irregular regularity doesn’t just apply to phonology, but to syntax too. Here’s a table of the syntactic uses of the English wh-words from Geoff Pullum’s paper “Rarely Pure and Never Simple”:

    Usable in open-ended interrogatives: who, whom, whose [+human only], what, which, where, when, how, why.

    Usable in integrated (mostly restrictive) relative clauses: who, whom, whose, which, where, when, why.

    Usable in supplementary (mostly non-restrictive) relative clauses: who, whom, whose [almost always +human], which, where, when.

    Usable in fused or headless relative clauses (where the relative pronoun is also semantically the head of the relative clause): what, where, when, how, while.

    There is simply no hope of categorizing all these words such that they follow a regular pattern, and yet there are synchronic consistencies: Standard English simply doesn’t allow sentences like “*There was a secret plan what he had not told me about”, though some other varieties do. Historically, of course, this comes from the fusion of true-interrogative and true-relative words still kept distinct, e.g., in the Indic languages, followed by millennia of mixing and muddling.
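    Pullum’s table can be restated as bare data, which makes the point vivid: membership in each set is a lexical fact, not something derivable from a smaller set of features. A toy sketch (the dictionary name and function are mine, purely illustrative; the word lists are transcribed from the table above):

    ```python
    # Pullum's wh-word distribution, transcribed from the table above.
    # The point: no rule predicts these sets; each is a brute lexical fact.
    WH_USES = {
        "interrogative": {"who", "whom", "whose", "what", "which",
                          "where", "when", "how", "why"},
        "integrated_relative": {"who", "whom", "whose", "which",
                                "where", "when", "why"},
        "supplementary_relative": {"who", "whom", "whose", "which",
                                   "where", "when"},
        "fused_relative": {"what", "where", "when", "how", "while"},
    }

    def allowed(word, construction):
        """True if the wh-word can introduce the given construction type."""
        return word in WH_USES[construction]

    # "which" and "what" have overlapping but distinct distributions:
    print(allowed("which", "integrated_relative"))  # True
    print(allowed("what", "integrated_relative"))   # False in Standard English
    print(allowed("what", "fused_relative"))        # True
    ```

    Nothing about the words themselves predicts the asymmetry; the lookup table simply is the grammar here.
    
    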

  61. But there probably is a synchronic rule in English that nouns ending in a long syllable with /f/ as a coda voice the /f/ in the plural, wouldn’t you say? (I’m sure there are exceptions, of course.) I mean trying to account for was/were synchronically does strike me as ridiculous but I don’t think wife/wives is anywhere near as arbitrary to contemporary speakers.

  62. (Not disputing of course that there is a clear diachronic reason for that behavior, and knowing that reason as well is far more useful than knowing the rule alone, whether your goal is to understand English syn- or diachronically.)

  63. ə de vivre says:

    but in my experience many speakers would like to know why (for instance) English plurals are so varied and subject to so many exceptions, and they are glad to learn the historical reasons for those exceptions.
    You’ll get no argument from me about that. Seems to me ideally theories of historical change and individual learnability should be able to reinforce each other. A language can’t change if it can’t be learned.

    What makes SPE so fundamentally lunatic is the claim that this behavior is dictated by a synchronic rule, that every quirk of English usage is somehow rule-governed.
    What would a linguistic approach that doesn’t assume English is defined by a system of synchronic rules (or whatever formalism you want; probabilistically ranked constraints seem to do the best job in phonology these days) look like? Even functionalist and non-UG approaches claim that language is rule governed, UG just makes the extra claim that there’s a universal set of primitives.

    The Pullum paper appears to be addressing the failures of “grammar” as it’s taught in primary and secondary school. From what I can see it stays pretty neutral with regards to the (non)universality of syntactic primitives. In any case the fact that the distribution of English complementizers isn’t symmetric or pretty doesn’t mean that their distribution has no regularity. How do you account for the fact that some dialects of English accept “There was a secret plan what he had not told me about” and some don’t without some variation on the statement that the grammars of the two dialects have a slightly different set of rules?

  64. nouns ending in a long syllable with /f/ as a coda voice the /f/ in the plural, wouldn’t you say?

    I wouldn’t say that, because the data just don’t look like that. The plurals in question are calves, elves, selves, shelves, halves, hooves, knives, life/lives, wives, leaves, sheaves, loaves, thieves, wharves, wolves, and dwarves and scarves (which are analogical, not inherited). This is a random-looking mixture of (historically) long and short vowels, and almost every one has a regular counterpart: next to loaves there is oafs; next to thieves there is reliefs; next to lives there is lowlifes; next to dwarves there is dwarfs, where both are analogical; next to wolves there is gulfs, where the o/u distinction is a mere spelling convention to avoid too many consecutive minims. Diachronically they can be accounted for, but (I contend) not synchronically.

    The rule you propose is a typical example of what I am calling a gerrymandered rule: it accounts for maybe half the cases, and it has about as many exceptions as instances. It’s ontologically simpler and theoretically neater to say that the 30-odd irregular nouns in English (excepting Latin, Greek, and Hebrew borrowings that came in along with their plurals) are just lexical exceptions synchronically, brute facts about the language, as much so as the fact that snow means ‘snow’.
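    The gerrymandering is easy to demonstrate mechanically. Here is a toy check (the word lists are transcribed from the comment above; the function name and the use of spelling as a crude proxy for the phonology are my own illustrative choices) of what happens when the proposed blanket rule is applied to the regular counterparts:

    ```python
    # Toy check of the proposed rule "nouns in final /f/ voice it in the
    # plural", using spelling as a crude stand-in for the phonology.
    # Word lists transcribed from the comment above.
    VOICING = {"calf", "elf", "self", "shelf", "half", "hoof", "knife",
               "life", "wife", "leaf", "sheaf", "loaf", "thief", "wharf",
               "wolf", "dwarf", "scarf"}
    REGULAR = {"oaf", "relief", "lowlife", "gulf", "chief", "dwarf"}

    def naive_rule(noun):
        """Apply the proposed blanket rule: -f(e) -> -ves, else add -s."""
        if noun.endswith("fe"):
            return noun[:-2] + "ves"
        if noun.endswith("f"):
            return noun[:-1] + "ves"
        return noun + "s"

    # The rule gets the irregular set right only by stipulation, and it
    # misfires on every regular noun ending in -f:
    misfires = sorted(n for n in REGULAR if naive_rule(n) != n + "s")
    print(misfires)
    ```

    Every member of REGULAR comes out wrongly voiced (oaves, reliefes-style forms, and so on), so the “rule” ends up needing a lexical exception list as long as the list it was meant to explain.
    
    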

    What would a linguistic approach that doesn’t assume English is defined by a system of synchronic rules look like?

    Well, an obvious alternative, though not the only one, is Pinker’s words-and-rules concept, in which there are rules, but there is also a parallel set of facts (not necessarily about words as such) which are simply not reducible to the rules. Linguists have always talked about lexical exceptions, but the connotation of exception is ‘something we could get rid of if only we were smarter’. I am suggesting that this is one of Bacon’s idols, and that some evolved systems are simply not reducible to rules in a sensible fashion.

    What rules, for example, dictate the names and shapes of the letters of the modern Greek alphabet? (While the alphabet is not ‘natural’ in the way that Greek itself is, it has definitely evolved to its present form; it is not an intellectual construct like Plato’s philosophy.) We can account for the names and shapes diachronically, but synchronically there is nothing at all about the shapes ΑΒΓ or the words alpha beta gamma that connects to the Greek phonemes /a/, /v/, /ɣ/ in a rule-governed way. Everything is the way it is because it got that way, and sometimes there just isn’t any better reason at all.

    How do you account for the fact that some dialects of English accept “There was a secret plan what he had not told me about” and some don’t without some variation on the statement that the grammars of the two dialects have a slightly different set of rules?

    You misunderstand me, and what I think Pullum is saying, more to the point. There is absolutely a rule that says what can’t be used to introduce an integrated relative in Standard English. What doesn’t exist is a rule to say which wh-words can and cannot introduce integrated relatives in Standard English. In other words, the fact that which can and what cannot serve this function is not explicable by any reasonable, non-gerrymandered theory. It is another brute fact, not about the phonology or the lexicon this time, but about the syntax, which is supposed to be so rule-governed and all.

    In short, regularity is nice work if you can get it, but (unlike Ira Gershwin’s view of falling in love) you can’t always get it no matter how hard you try.

  65. ə de vivre says:

    synchronically there is nothing at all about the shapes ΑΒΓ or the words alpha beta gamma that connects to the Greek phonemes /a/, /v/, /ɣ/ in a rule-governed way
    The arbitrary relation between signifier and signified is affirmed (barring a few limit cases) by all contemporary linguistic theories.

    In other words, the fact that which can and what cannot serve this function is not explicable by any reasonable, non-gerrymandered theory.
    Why do we think “which” and “what” should have the same distribution? Whether you get your units of analysis from UG or from a language-internal analysis without any claims to universality, the line of inquiry is largely the same. If two different words have overlapping but not identical distributions, then we would want to find out a) if the determining factor is syntactic and, if so, b) what syntactic environments determine the distribution. No theory of syntax claims that every member of a category based on traditional pedagogical methods must have the same distribution.

    What is the specific prediction of UG you think is falsified by this data? All that data shows is that {who, whom, whose, which, where, when, why} are not synonyms, which again, no one is claiming.

  66. marie-lucie says:

    learnability

    When children raised in an English-speaking environment become functional in the language, they go through a stage usually called “overgeneralization”, in which they apply productive methods of word-formation, most notably in the formation of past forms of verbs, so for instance hitted, holded, gived. It is obvious that they have now mastered a major rule of Modern English and are able to apply it consistently. Even though they hear and understand the “irregular” forms used by adults (which follow several distinct patterns, some of them only attested in one or two forms), such forms do not “sound right” to them and it can take quite a long time until they start using them themselves. Local or social varieties of the language, as well as individual adults, do not always agree about which “rules” to apply to some verbs, for instance using brang or brung on the model of sang, sung instead of standard brought, which is so different from the basic bring that it is very difficult to justify it by a synchronic rule.

  67. It’s ontologically simpler and theoretically neater to say that the 30-odd irregular nouns in English (excepting Latin, Greek, and Hebrew borrowings that came in along with their plurals) are just lexical exceptions synchronically, brute facts about the language, as much so as the fact that snow means ‘snow’.

    I can understand that argument, but I don’t agree that it necessarily produces the most useful results. The issue that interests me isn’t what’s more numerous in the lexicon, it’s what’s more natural. I’m pretty sure that on an adult wug test, I would give “heaves” as the plural of “heaf,” and not as a joke, either. (I would never say “dwarfs”, either, by the way, and “lowlives” feels much less ungrammatical than “shelfs” to me.)

    Maybe I’m atypical but I don’t think I’m unique, and I think it better describes the inner workings of such an idiolect to posit a voicing rule with exceptions than to brush aside the voiced plurals themselves as exceptions.

  68. Wait, I would say “dwarfs” – as a verb. “Look over there! Snow White positively dwarfs those dwarves.”

  69. The arbitrary relation between signifier and signified is affirmed (barring a few limit cases) by all contemporary linguistic theories.

    Ah, but they aren’t arbitrary, they are only arbitrary in Greek. In Semitic they make total sense: they apply the acrophonic principle to what were originally pictographs. They were designed for Semitic, where they are an intellectual construct: they are the result of historical evolution in Greek, where they are not.

    What is the specific prediction of UG you think is falsified by this data?

    I make no such claim (though Pullum may, I don’t know; see below). I only say that it can’t be rules, any more than turtles, all the way down. Eventually we reach the inherently irregular and inexplicable.

    In particular, Pullum says in section 2 that in “recent theoretical linguistics” a description of a language is supposed to be “a set of elegant general principles mapping a lexicon of words to a set containing all and only the grammatical sentences” and in section 2.2 he speaks similarly of “simple and regular formal systems defined by a universal system of grammar that makes everything simply explainable and represents the language as a system of optimal or near-optimal design for human communication”. Is this characterization or caricature? I don’t know, and I doubt I can find anyone with a neutral point of view on the subject. There’s no doubt that some theoretical linguists sing songs that sound very like this, but whether they are in the mainstream or on the fringe is a question.

  70. I would give “heaves” as the plural of “heaf”

    Well, so might I. But you and I are (with all respect) language geeks. It was precisely because Tolkien was an ubergeek that he coined the word dwarves (there are only a very few earlier instances), though he knew as a philologist and lexicographer that dwarfs was the only modern plural (it was “Snow White and the Seven Dwarfs” long before Disney), and that the etymologically correct plural, had it survived, would be dwarrows. Of course the verb is regular, because denominal verbs normally are: fly has the preterite flew and so do fly up and fly down, but fly out (in baseball, not in anger) is from the noun fly (ball), and its preterite is flied out.

    and not as a joke, either

    I agree that it’s not a joke, but I do think it’s a language game. The appearance of Vaxen, boxen, Macintoshen, Emacsen in modern times does not actually constitute a revival of the n-declension, which was a relic even in Old English times, as pretty as it might be to think so.

  71. I was quite baffled by that appendix about the plural of “dwarf” as a child, as it happens. What sort of bizarre ironic humor could lie behind an apology for using “dwarves,” the actual correct plural? Are these dictionaries that allegedly only allow “dwarfs” in Westron or something? (If I had been a better scholar then, I would have checked an actual English dictionary, and my life might have been quite different.) This was my father’s copy of LOTR, though, so perhaps I had learned “dwarves” from him earlier.

    Anyway, this is getting into self-analysis and I suppose of little relevance to English as a whole, but the hypothetical “heaves” doesn’t feel to me like a language game the way “boxen” or “octopodes” or “meese” or “C. K. Scott Moncrievves” do. It’s possible that too much Aelfric has warped me permanently — although, to be honest, intervocalic voicing is something I still get wrong when reading OE aloud — but a language game that is internalized to the point that it becomes automatic and doesn’t even feel ludic any more is hard to distinguish from… a rule.

  72. La Horde Listener says:

    Maybe gorillas could help.

  73. David Marjanović says:

    and dwarves, scarves which are analogical, not inherited

    In addition, while I haven’t encountered the spelling *rooves, plenty of people do pronounce the plural of roof that way, are aware that they’re doing this, and count this discrepancy as just another quirk of the English spelling system.

    All that data shows is that {who, whom, whose, which, where, when, why} are not synonyms, which again, no one is claiming.

    Consider German denn and weil. Both mean “because”; they are synonyms, and so it’s no surprise that my dialect lacks denn and makes do with weil alone. Yet, their syntactic behavior is quite different. Weil triggers finite-verb-last word order; denn triggers finite-verb-second word order. You can of course interpret this as weil introducing subordinate clauses (which have Vf-last order) and denn linking independent clauses (which have Vf-second order), but that’s obviously circular.

    (My dialect has free variation between the two word orders with weil and with no other word.)

    “meese”

    Die größten Kritiker der Elche
    waren früher selber welche.

    (“The greatest critics of the moose were formerly moose themselves.”)

  74. @Matt

    Re: “heaves”

    My question would be, freely admitting one might naturally arrive at “heaves” from a prompt of “heaf”, do you imagine a native speaker would reject a prompt of “heafs” as morphosyntactically ill-formed? My intuition is that they would not. Somewhat channeling marie-lucie’s point above, I would say the criterion for a rule ought to be what it allows and rejects (as “sounding wrong”) rather than what it predicts. If we can accept “heafs” or “heaves”, but not “boxs”, doesn’t it seem more parsimonious to derive one rule and a handful of exceptions/exceptional *patterns*, rather than to derive one rule that reliably applies everywhere except the exceptions, and another that unreliably applies even to words where it conceivably could?

  75. marie-lucie says:

    Heaves: Is heaf an actual word (a noun), or a back-formation created for the purpose of this discussion? It seems to me that the only plausible description of heaves is as a verbal form, from “to heave”.

    Regarding the similar word eaves, I thought for a long time that it must be the plural of an unattested eaf, but apparently I was wrong. I encountered eave as the singular, but another source listed it as a back-formation from eaves. Does the potential eaf have a cognate in another dialect or related language? is there a reconstructed form?

  76. do you imagine a native speaker would reject a prompt of “heafs” as morphosyntactically ill-formed?

    They’d better not: after all, cloverleafs in the sense of ‘intersections’ is the only standard form, though ‘leaves of clover’ is clover leaves. (Plurals of headless forms are always regular.)

    Is heaf an actual word

    No. It is not even a back-formation: there is no noun heaves either, except the deverbal ones meaning ‘acts of lifting’, ‘acts of hurling’, ‘acts of vomiting’, etc. It is purely a wug-word, one offered to naive speakers to see what morphological patterns they will apply to unknown words. (In English, of course, the plural of wug has to be wugs, but matters are not so simple in other languages.)

    eaves

    The /s/ here is part of the root: OE efes ‘edge of a roof’ < ‘edge of the forest’. Eave is a back-formation, but not a stable part of the language: I do not have it, for instance.

  77. Elessorn, I don’t think that is a good test in this case. If a speaker accepts the proposal “heafs” they are just agreeing to hypothesize a word “heaf” that patterns with “grief” rather than “leaf”. It doesn’t tell us anything about the representation of those patterns in their mental grammar.

    Disregarding for clarity other exceptions like “boxes” and “sheep,” I think we can all agree on these two points:
    1. The basic pluralization strategy is “add /S/”
    2. For some words with a certain phonemic profile, the correct pluralization strategy is “voice the final consonant, then add /S/”

    Where we disagree is how 2 is applied:

    3a. The pluralization strategy in 2 IS NOT applied by default. It is, however, applied to the following arbitrary list of words: wife, elf …
    3b. The pluralization strategy in 2 IS applied by default. It is not, however, applied to the following arbitrary list of words: oaf, gulf … (I suppose you could supplement this with general patterns to the extent that you could define them synchronically: “proper nouns”, “words that sound kind of Latiny or foreign,” “words that have abstract or technical meanings,” etc.)

    My feeling is that if your genuine wug-reaction to “heaf” is “heaves”, it makes more sense to go with 3b, because you are applying the rule by default. If you would also accept “heafs”, that could just mean that you’re OK with adding “heaf” to the list of exceptions.
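    For the programmers in the crowd, the difference between 3a and 3b can be made concrete as two toy functions. This is only a sketch of the competing analyses as stated above; the word lists are illustrative stand-ins, nowhere near complete inventories:

    ```python
    # Toy model of the two competing analyses of /f/-final plurals.
    # The word lists below are hypothetical fragments, not full inventories.

    VOICERS = {"wife", "elf", "leaf", "knife"}        # 3a: voicing only for listed words
    NON_VOICERS = {"oaf", "gulf", "chief", "belief"}  # 3b: voicing default, these excepted

    def voice_f(noun: str) -> str:
        """Voice the final fricative and pluralize: wife -> wives, leaf -> leaves."""
        stem = noun[:-2] if noun.endswith("fe") else noun[:-1]
        return stem + "ves"

    def pluralize_3a(noun: str) -> str:
        # Voicing is NOT the default; it applies only to an arbitrary list.
        return voice_f(noun) if noun in VOICERS else noun + "s"

    def pluralize_3b(noun: str) -> str:
        # Voicing IS the default for /f/-final nouns; the exceptions are listed.
        if (noun.endswith("f") or noun.endswith("fe")) and noun not in NON_VOICERS:
            return voice_f(noun)
        return noun + "s"

    # The wug-word "heaf" is exactly where the two analyses come apart:
    pluralize_3a("heaf")  # "heafs"  -- the default rule applies
    pluralize_3b("heaf")  # "heaves" -- voicing applies by default
    ```

    Both functions agree on every listed word; only a novel word like “heaf”, which appears on neither list, forces the grammar to show which strategy is the default.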

    Perhaps where we are going wrong is that what I see as the goal of a (hypothetical) project like this is not “write the simplest possible description of the language” but rather “write a description of the language that corresponds most closely to how the brain works with it.” It seems fair to assume that in most cases these two goals would correspond quite well, Ockham’s-razor style, but if you could replicate the spontaneous production of “heaves” in a proper controlled experiment, you would have evidence that at least on that particular they did not.

  78. marie-lucie says:

    JC, thank you for confirming what I thought for “heaf” (a back-formation even though it is a nonce form rather than a new word), and for clarifying the origin of “eaves”.

    leaf, leaves, leafs

    We have a perfect example in anglophone Canada: we have a maple leaf on our flag because of all the maple trees and their maple leaves, and we also have an almost-national hockey team called the Toronto Maple Leafs.

  79. @Matt

    The point about goals is well taken, though my goal, at least as I conceived it, was precisely yours: to best describe what is actually going on. Here my intuition, weigh it as you will for n=1, was that “heaf” probably wouldn’t *reliably* get “heaves” out of native speaker informants. I think it would in considerable numbers, especially with the verb sitting there to make it seem familiar, but not always. (I’d wager that “goaf” would get you a lot fewer “goaves” for the same reason). My point with the acceptability of “heafs” was simply that IF such a form is acceptable, we’re dealing not with a *rule* but with a grandfathered-in *pattern* that analogy can unpredictably (but never inexplicably) extend to other words that fit the profile.

    This is what I imagine is actually at work: a rule interfered with by an analogy, one supported both by (1) history, and (2) non-disconformity with the “PLURALS end in /unvoiced consonant + s/ OR /voiced consonant + z/ OR /vowel + z/” distribution produced by the rule. I can see how “parsimonious” gave the impression I was going primarily for simplicity– my fault. I understand that it processes like a rule where it processes at all, but given the data, with subsets of words that f→v always applies to, words it never applies to, and words it might apply to, based on dialect or idiolect, “exceptional pattern unpredictably extendable by analogy” seemed the best description of reality.

    Maybe this is another case where dear old Latin is screwing us up, making declension seem an eminently rule-bound phenomenon with its well-behaved noun classes. Hittite, Greek, Sanskrit, hell, even Germanic I think give a lot better sense of how easily words shake it up with analogy.

    That said, the analogy-vs-rule argument ain’t gonna end in one dag.

  80. My point with the acceptability of “heafs” was simply that IF such a form is acceptable, we’re dealing not with a *rule* but with a grandfathered-in *pattern* that analogy can unpredictably (but never inexplicably) extend to other words that fit the profile.

    I agree that this is the situation, except that I don’t know what you mean by “never inexplicably”. It seems to me that exactly which words, or senses of words, analogy will apply to is precisely beyond explanation. Why should dwarf ‘member of a mythical race’ mostly accept the analogy of elf, but dwarf ‘small human being’ entirely reject it? (Such are the facts on the ground in our post-Tolkien era.) Why should strive and thrive be in process of regularization on the analogy of arrive and friends, while drive resists it absolutely? I know of no non-circular explanations of these facts.

  81. A better example perhaps is the Spanish weekday names: the -s in lunes, martes is by analogy with miércoles, jueves, viernes, but there is no such analogical suffix on sábado, domingo.

  82. You’re right, John, a poor choice of words there. At the time I was contrasting it in my mind with “predict”: we can’t say when we’ll get “-ves”, but when we do get it, it’s never somewhere where we’re surprised to have it. I.e., we can’t guess whether it will happen (predict), but when it does, we immediately grok the pattern behind it (explain).

    Still, not the best or clearest phrasing. Perhaps “unpredictably but never unexpectedly”?

  83. ə de vivre says:

    I only say that it can’t be rules, any more than turtles, all the way down. Eventually we reach the inherently irregular and inexplicable.
    I’m honestly confused by the problem you think irregular morphology presents. No one claims that it’s rules all the way down. From an abstract point of view I don’t know what such a theory would look like. Eventually any linguistic theory gets down to its primitives. What makes generativism different from a purely functionalist account is not the presence of rules, it’s that its primitives come from an inherent universal grammar rather than the functional uses of language.

    The standard generativist model has syntax organize a phrase, producing a series of terminal nodes that contain a bundle of features. These features are then sent to the lexicon where they receive language-specific morphological forms (which are the result of historical development) that are then sent to phonology for pronunciation. In English the fact that certain plurals are irregular would be stored in the lexicon. The point isn’t that English has irregular plurals; the generativist claim would be that cross-linguistically plural nouns can be analyzed with a single set of linguistic primitives.

    If you don’t think irregular morphology can enter into regular syntactic relations, that’s one thing, but in that case you’re arguing against more than just generativist linguistics.

    Is this characterization or caricature?
    It certainly doesn’t resemble the generativist linguistic theories I encountered as an undergraduate. I’m most familiar with phonology in the optimality theory tradition. Here optimality specifically means relative optimality, and the entire reason it was developed was to account for the fact that language output often seems to be the “least bad” option, and that weighted (dis)preferences capture the kinds of attested variation in utterances that categorical rules can’t. The foundation of the theory is an attempt to capture the intuition that there are competing tendencies in what languages “like to do” that make it so any actual utterance can’t please them all.

    Specifically, the claim that generative linguistics “represents the language as a system of optimal or near-optimal design for human communication” is exactly the opposite of what generativists claim. Generativism is explicitly formalist. If anything it can be criticized for separating linguistic competence from its functional role as medium of communication and expression of social relationships. The primitives proposed by generativist linguistics are supposed to account for the formal distribution of linguistic units (segments, morphemes, syntactic constituents etc). Nothing in the theory would even allow you to ask the question whether it’s “optimal” for communication or not. Chompsky’s even gone further recently and made the claim that UG is built from cognitive functions that were originally selected for non-linguistic behaviour that have been jury-rigged for language in a situation analogous to Stephen Jay Gould’s “panda’s thumb”.

    Consider German denn and weil. Both mean “because”; they are synonyms, and so it’s no surprise that my dialect lacks denn and makes do with weil alone
    I think there are a couple different conversations going on in this thread, so I’m not sure if you’re directly responding to me or just adding information. Are you saying that this fact is something generativist syntax can’t account for?

  84. ə de vivre says:

    And by Chompsky I of course mean the dog not the homophonous but not homographic linguist Noam Chomsky.

    What makes generativism different from a purely functionalist account is not the presence of rules, it’s that its primitives come from an inherent universal grammar rather than the functional uses of language.

    No doubt, which is why I am “Ni oui, ni non, bien au contraire!” on the subject. While the idea of language-specific rules is more congenial to me than the idea of universal rules, my serious claim here is that the notion of language as rule-governed behavior at all is inevitably a strong oversimplification of something far more complex. Pullum’s article would probably be much clearer if he disentangled these two points, as would some of my remarks above.

    (This is a great illustration of how you don’t have a position until you engage in debate, only various more or less confused opinions.)

  86. ə de vivre says:

    (This is a great illustration of how you don’t have a position until you engage in debate, only various more or less confused opinions.)
    Indeed. I’m agnostic myself when it comes to the strong-UG claim that there is a set of universal linguistic primitives that are both necessary and sufficient to explain all linguistic variation. In any case I was always more interested in phonology and semantics where the shadow of the history of linguistics as an institution during the 70s never seemed to loom as large in practical debates as it did in syntax.

    I think where we disagree is that I don’t see something like Pinker’s (who, at least as of ‘The Language Instinct’ believes in UG, though not of a strictly chomskyan flavour) ‘words-and-rules’ as qualitatively different from generativist models that allow for irregularity in the lexicon and analogically extended paradigms. Pinker may turn out to be right, but I don’t see any a priori reason that his theories (which I am admittedly not very familiar with) are structurally more desirable than their generativist competitors.

    Would you be more comfortable if we banned the word ‘rule’ and replaced it with ‘compositionally predictable derivation’? At least in phonology, the ability to move away from categorically applicable rules has had major consequences for the discipline.

  87. what I see as the goal of a (hypothetical) project like this is not “write the simplest possible description of the language” but rather “write a description of the language that corresponds most closely to how the brain works with it.” It seems fair to assume that in most cases these two goals would correspond quite well, Ockham’s-razor style

    Has anything we’ve learned about the brain turned out to represent the simplest possible way of doing things? In my layman’s impression, the governing principle of neuroscience isn’t Occam’s razor but “It’s Always More Complicated”.

    Why should dwarf ‘member of a mythical race’ mostly accept the analogy of elf, but dwarf ‘small human being’ entirely reject it? (Such are the facts on the ground in our post-Tolkien era.) Why should strive and thrive be in process of regularization on the analogy of arrive and friends, while drive resists it absolutely? I know of no non-circular explanations of these facts.

    I think the answers in these particular cases are actually pretty simple, namely (a) because of Tolkien and (b) because of frequency.

  88. David Marjanović says:
    Consider German denn and weil. Both mean “because”; they are synonyms, and so it’s no surprise that my dialect lacks denn and makes do with weil alone

    I think there are a couple different conversations going on in this thread, so I’m not sure if you’re directly responding to me or just adding information. Are you saying that this fact is something generativist syntax can’t account for?

    Sorry. I’m not at all familiar with generativist syntax, I just jumped into one of the discussions. Someone said that the words {who, whom, whose, which, where, when, why} behave differently (and thought that was a problem for a theory); you said that’s not a problem because they aren’t synonyms; I brought up an example of synonyms that do behave differently.

  89. Elessorn, thanks for clarifying. The responses in this thread have led me to believe too that “heafs” and “goafs” would probably be more common, against my own intuition as expressed in my first comment. A good example of why linguists should work with data, not their own navels, I guess.

    The way you put the issue makes sense to me, but it then seems to come down to the question of what the difference is between a “rule” and an “analogy”. To be honest I don’t believe in a strict separation between “lexicon” and “rules” (I’m basically a construction grammar guy), so in a sense, a speaker’s pluralization of “goaf” pretty much has to be by analogy; the question is which sort of noun they choose to group “goaf” with, not which abstract rule to apply to it.

  90. George Gibbard says:

    Here is a manifesto for a non-Chomskyian approach to linguistics (by my undergrad historical linguistics professor, no less):
    http://www.sean-crist.com/professional/human_language/index.html
    On this view, when an English speaker produces ‘wives’, they are not generating the form anew from ‘wife’, and a person saying ‘people’ does not go by way of ‘person’. Instead one selects the largest stored chunks of language in one’s mental lexicon that semantically correspond to parts of what one wants to say. The size of the mental lexicon is not to be minimized in our theorizing as it is in classical Generative Grammar.

    Meanwhile supporting Crist’s claims, I have seen Harald Baayen present psycholinguistic data indicating that regularly formed morphologically complex forms, if they are common, are stored and not re-generated every time they are invoked. So ‘books’ works psychologically more like ‘people’ than like ‘daxes’.

  91. J.W. Brewer says:

    Daxes? My native speaker intuition prefers “daxen” as the plural.

  92. This looks very much in line with where my experience and intuition have taken me, especially the sentence “Grammaticality is a gradient property rather than a discrete one.” I usually express this by saying there is no grammaticality except in the papers of grammarians: in the Real World there is only acceptability, which is obviously gradient. Otherwise there would be no ?-utterances, only acceptable utterances and *-utterances.

    But in line with what ə says, this is not merely non-Chomskyan; it is far more radical, transcending the differences between UG and non-UG theories of grammar.

  93. On this view, when an English speaker produces ‘wives’, they are not generating the form anew from ‘wife’, and a person saying ‘people’ does not go by way of ‘person’. Instead one selects the largest stored chunks of language in one’s mental lexicon that semantically correspond to parts of what one wants to say.

    That makes a lot of sense to me.

  94. marie-lucie says:

    To me too! Thanks GG and JC.

  95. Yeah, thanks, GG, that’s a fantastic manifesto. For a great and seminal argument along the same lines, check out Fillmore et al 1988: Regularity and Idiomaticity in Grammatical Constructions: The Case of Let Alone.

  96. there is no grammaticality except in the papers of grammarians

    In particular, native speakers cannot give grammaticality judgments (unless they happen to be grammarians), whereas they can and do give acceptability judgments, and this is the true source of data for linguists.

  97. See our brief discussion of let alone.

    Well, we are dealing with human beings, so of course there is no yes or no, only degrees of maybe. That said, some of the maybes have very large probabilities of yes or no. And it is not an argument against creating categories and thinking in terms of categories that you can scare up a couple of questionable examples; the true argument should be how many (almost) sure cases there are compared to (real) maybes.

    Which reminds me of this post: What Does Probability Mean in Your Profession?. So, where do linguists place their question marks?

  99. Even such basically framework-free grammarians as Huddleston & Pullum can be seduced into thinking that the more the merrier is an idiomatic and fixed expression, whereas it is plainly an instantiation of an unusual, but quite productive, sentence type (which in no way resembles NP + VP): the more experienced the grammarian, the less likely they are to think in terms of rigid rules and exceptions to the rules, one might say. I pointed this out to Pullum some years back, but I don’t think he got my point for whatever reason.

  100. I note that Fillmore et al. do discuss the … the sentences, and point out that the bigger they come, the harder they fall is in their terms a substantive idiom inside a formal idiom: that is, it is interpreted as an uncarved block normally, but is also an instance of this unusual sentence type.

  101. David Marjanović says:

    Meanwhile supporting Crist’s claims, I have seen Harald Baayen present psycholinguistic data indicating that regularly formed morphologically complex forms, if they are common, are stored and not re-generated every time they are invoked. So ‘books’ works psychologically more like ‘people’ than like ‘daxes’.

    I remember reading about an Iroquoian language. A linguist wanted to write a dictionary. The speakers liked the idea, but wanted him to put not the abstract verb roots, but the actual forms into the dictionary, with prefixes and everything. It’s a polysynthetic language with regular morphology, so it’s no problem to correctly generate millions upon millions of verb forms. However, it turns out the speakers are quite reluctant to actually do that; they first make sure they don’t remember having heard the form and can’t avoid it. That made immediate sense to me.

  102. George Gibbard says:

    David Marjanović,

    However, it turns out the speakers are quite reluctant to actually do that; they first make sure they don’t remember having heard the form and can’t avoid it.

    do you mean “they first make sure they remember having heard the form and can’t avoid it”?

  103. No, I think he means “Before creating a potentially novel form, they make sure it doesn’t already exist but is nonetheless unavoidable in this context”.

  104. George Gibbard says:

    So then “they make sure they don’t remember hearing a different form with the desired meaning”?

  105. David Marjanović says:

    They make sure they need to create it and can’t just (with a bit more effort) retrieve that same form in their memory.

  106. This post now has more than 100 comments. We should give LH a gold record or something for that.

  107. marie-lucie says:

    DO, You are relatively new to LH! That record has been more than broken earlier.

  108. However, it turns out the speakers are quite reluctant to actually do that; they first make sure they don’t remember having heard the form and can’t avoid it.

    So if even regular morphological forms are learned as wholes and stored in the lexicon, that would seem to make it less necessary to posit underlying forms that are very far from the surface form.

    If taken to its extreme even something like the voicing of plural -s in English could be stored word for word, but associated with a pattern that lets the speaker construct the ‘right’ form with a bit more effort, if not already known — and then store it.

    And once it becomes too hard for learners to extract a given pattern from stimulus, they will start to produce ‘analogical’ shapes (following other patterns) or ‘regularized’ shapes — but still keep the learned shapes for words they’ve encountered often enough.

  109. Just so, which leads to a deep question: why do speakers of languages like English even bother with morphological regularity? As the example of German noun plurals shows, it’s not really any harder to hold the unpredictable plural of each noun in memory than it is to memorize the unpredictable phonetic shape of the noun in the first place. The answer is and can only be historical: that the dominance of the regular masculine a-declension was substantial even in Old English, reducing many other declensions to exceptional status (not a single member of the u-declension is irregular today), and when a huge bolus of French words in -s was added to it, the fate of all but 30-odd frequent nouns of other declensions was sealed.

  110. People bother with linguistic norms to prove that they are capable of learning them and can be accepted in the in-group. You could pose the opposite question: why don’t unpredictable forms accrete boundlessly over language history? (My answer would be, enough is enough, you’re let off the hook once you’ve mastered sufficiently many weirdnesses — with ‘sufficient’ being more for higher status in-groups, of course).

    But I’m not really talking about unpredictable forms — more about the standard explanation that ‘dwarf’ and ‘dog’ are both stored in the lexicon with the ‘default plural’ which is underlying ‘-s’, and /dog-s/ is then run through a ruleset that produces /dogz/. And if you have ‘dwarves,’ it would be stored as ‘voice final consonant and add -s’.

    What if the plurals are simply stored as /dwarfs/ and /dogz/, and if your father liked Tolkien, you have /dwarvz/ stored instead — none of them more of an exception than any other? And if you’re asked to pronounce {wugs} you have to retrieve the generalization you made when you were four and conclude that it’s /wugz/ (not just underlying /wug-s/).
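    To make the idea concrete, here is a toy sketch of the storage-first hypothesis (my own illustration, not a claim about actual psycholinguistic architecture; the lexicon entries, the quasi-phonemic spellings, and the segment classes are all drastically simplified inventions):

    ```python
    # Plurals are stored whole in the lexicon; the regular voicing
    # generalization is consulted only for unseen words like "wug" --
    # and the derived form is then stored too, per the hypothesis above.

    LEXICON = {
        "dog": "dogz",      # rough quasi-phonemic spellings, hypothetical
        "hat": "hats",
        "dwarf": "dwarvz",  # or "dwarfs", if your father didn't like Tolkien
    }

    VOICELESS = set("ptkf")  # grossly simplified segment class

    def plural(noun: str) -> str:
        """Retrieve a stored plural; fall back on the childhood generalization."""
        if noun in LEXICON:
            return LEXICON[noun]      # cheap retrieval, no rule applied
        suffix = "s" if noun[-1] in VOICELESS else "z"
        form = noun + suffix          # re-derive, with 'a bit more effort'...
        LEXICON[noun] = form          # ...and then store it for next time
        return form

    print(plural("dog"))   # stored: dogz
    print(plural("wug"))   # derived, then stored: wugz
    ```

    On this sketch, /dogz/ and /dwarvz/ are equally just entries, and the rule is a fallback rather than the production pathway.
    
    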

  111. J. W. Brewer says:

    The approach discussed in recent comments seems to tacitly assume something about the structure of the human brain in which it is “cheap” to memorize individual lexical facts ad infinitum and equally “cheap” to search through a vast and perhaps disorganized mental lexicon, but “expensive” to either decompose a larger unit into constituent morphemes in a rule-driven way or compose a larger unit out of morphemes in a rule-driven way. I’m not saying that’s wrong and I’m not even saying it’s implausible because I just don’t know enough about the subject. But it’s not intuitively obvious and I would be interested in hearing about what sort of neurological evidence (or other evidence derived from the study of non-linguistic forms of human cognition) there is for the assumption.

  112. @John Cowan: After reading your last comment, I had to imagine the English language choking and spluttering as it had to swallow a huge bolus of French words.

  113. @JWB, I probably know less about the subject than you do — I’m throwing up the idea in the hopes that somebody will have evidence either way.

    From what I see, it just seems that ‘the lexicon does not contain predictable forms; production of surface forms is rule-driven’ has been an assumption rather than a conclusion. So David’s Iroquoian example made me go hmm.

  114. “cheap” to memorize individual lexical facts ad infinitum

    It’s probably very expensive, but it seems clear that we have to do it. The 20% of oxygen, or is it glucose, that goes to the brain has to be used for something besides keeping us standing upright.

  115. marie-lucie says:

    JC: [in German] it’s not really any harder to hold the unpredictable plural of each noun in memory than it is to memorize the unpredictable phonetic shape of the noun in the first place.

    Irregular plural and other forms are not just memorized, the memory of them is reinforced by being heard from other speakers or seen in writing. Learners of English (especially those taught in formal classes) have to consciously learn child/children and the plural form might take a while to “stick”, but native speakers have heard and used these forms all their lives in myriad circumstances, so they are maintaining them in working memory rather than rememorizing them every time. On the other hand, they might encounter (or need to use) the verb thrive only once in a great while, but have little occasion to hear or use its past tense form, so little in fact that hearing or seeing throve might not be recognized at once (I am not sure if I am consciously creating this form or remembering it! thrived sounds better to me – let me know which is which).

  116. ə de vivre says:

    Throve doesn’t sound right to me, but now I kinda want to try and introduce it into the lexicon. I’m still disappointed that the past tense of jive isn’t jove. Real missed opportunity there, English language.

    However, it turns out the speakers are quite reluctant to actually do that; they first make sure they don’t remember having heard the form and can’t avoid it.
    Could be due to minimal word requirements? If some bare roots were too small and there weren’t any regular ways of augmenting them in their ‘citation form’ because they only ever show up in speech with a bunch of obligatory morphology, that might explain why native speakers don’t ‘like’ bare verb roots. That said, I know almost nothing about Iroquoian languages.

    And if you’re asked to pronounce {wugs} you have to retrieve the generalization you made when you were four and conclude that it’s /wugz/ (not just underlying /wug-s/).
    There’s a functionalist theory of phonology that defines its primitives in a very similar way, all analogies between surface forms. IIRC it starts to run into trouble with things like reduplication and less obvious types of feature spreading though, and a lot of the predictions it makes about frequency effects don’t necessarily hold up, but it’s been a good while since I read anything about it. I’ve probably got some papers or at least a citation lying around somewhere…

  117. marie-lucie says:

    And if you’re asked to pronounce {wugs} you have to retrieve the generalization you made when you were four and conclude that it’s /wugz/ (not just underlying /wug-s/).

    The generalization a child makes at the age of four or so is to add a sibilant to a noun to make it plural. What you are referring to here is not “overgeneralization”; it is a regular morphophonemic adaptation of the sibilant to the end of the word you attach it to. This is done automatically; it is not a choice between methods of plural formation. Thus, saying gooses instead of geese would be a true case of overgeneralization on the part of the child, but adding [iz] rather than [s] is part of the phonetically based morphophonemics.
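    In pseudo-code terms, the automatic adaptation amounts to no more than this (a toy sketch; the segment sets are my own drastic simplifications, and only the suffix choice is modeled):

    ```python
    # The regular plural allomorphy: [iz] after sibilants (buses, watches),
    # [s] after other voiceless consonants (hats), [z] elsewhere (dogs, wugs).
    SIBILANTS = {"s", "z", "sh", "zh", "ch", "j"}   # rough stand-ins for IPA
    VOICELESS = {"p", "t", "k", "f", "th"}

    def plural_suffix(final_segment: str) -> str:
        """Pick the plural allomorph from the noun's final segment alone."""
        if final_segment in SIBILANTS:
            return "iz"
        if final_segment in VOICELESS:
            return "s"
        return "z"
    ```

    The point of writing it out is how little information the rule needs: the final segment of the stem fully determines the suffix, which is what makes it automatic rather than a stored choice.
    
    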

  118. @marie-lucie, that’s exactly what I’m trying to question here. How do we _know_ that we aren’t storing the result of the morphophonemic process, even though it’s quite predictable?

  119. It’s certainly testable: it should be slower to re-derive a form than to remember it. I have no idea if psycholinguists have done the work for this, though.

    The OED2 (1912) lists throve and thriven as the historic strong preterite and past participle of thrive, though of course it lists the regular forms too. Indeed, the oldest citation, to the Ormulum at the beginning of the 13C, is for the form thriven. Thrived is by no means a recent innovation, however, being first cited in 1614.

    Personally, I have no problem with recognizing throve as still part of the language, but I probably wouldn’t say it. Thriven I consider totally obsolete. YMMV.

  120. it should be slower to re-derive a form than to remember it

    But the usual story, as explained to laymen, is that the “[automatic] regular morphophonemic adaptation” happens in a custom-built part of the brain beyond the reach of introspection — but flexible enough that it can somehow be parametrized with a different ruleset for each language. Presumably so that fewer forms have to be stored in the lexicon, and that to me implies that it should be faster.

    It would be very interesting to see brain activity patterns for native speakers producing ‘hats’ vs ‘dogs’ vs ‘wugs’ vs ‘leaves’ vs ‘dwarves’.

  121. marie-lucie says:

    ‘hats’ vs ‘dogs’ vs ‘wugs’ vs ‘leaves’ vs ‘dwarves’.

    Lars: Try saying hat[z] or dog[s]. It is very difficult to avoid voicing assimilation of the suffix to the word-final consonant. This is a natural phonetic process, not a morphological one.

    Leave[z] and dwarve[z] are different since there is a change in the stem from /f/ to /v/ (the result of an ancient allophonic process at a time when the suffix did include a vowel) before adding the suffix, pronounced [z] to agree in voicing with stem-final /v/. This is true even if dwarves is an analogical reformation. The alternative is leafs and dwarfs, which do exist, with voiceless [s] after voiceless /f/, as in chiefs or staffs.

  122. Note that we now have three separate nouns: staff pl. staffs ‘people working for an organization’, staff pl. staves ‘lines on which music is written’, and stave pl. staves ‘fighting stick’, though some people use the second noun for this meaning also.

  123. @M-L, that still leaves the question why some languages allow certain types of feature spread (for instance) and others don’t.

    Sweden Swedish has progressive spreading of +retroflex from /r/ through any number of coronal segments, even over word boundaries. Finland Swedish, which has the same /r/, does not. If it’s natural and automatic, an unavoidable consequence of the way our vocal tracts work — what then do Swedish-speaking Finns do to keep it from happening? If it isn’t, what do Swedish-speaking Swedes do to make it happen?

    (It’s a bugger to learn as an adult, by the way. I’ve been sorely tempted to assume a fake Finland-Swedish accent).

    So yes, some phonological sequences are harder to produce than others — and I would not be surprised if it turned out that the shortcuts taken in allegro speech are more or less universal — but some language standards insist on the ‘easier’ choice, and some insist on the harder, and it seems that people learn to produce the expected standard forms without much trouble.

    Where do they store the information that allows them to do that?
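    For concreteness, the Sweden-Swedish pattern as I understand it, written as a toy rule (segment classes invented and simplified; in actual speech the /r/ itself is typically absorbed, but it is kept here to keep the sketch readable):

    ```python
    # Progressive spreading of [+retroflex] from /r/ through a run of coronals,
    # as in Sweden Swedish; Finland Swedish leaves the same input untouched.
    CORONALS = {"t", "d", "n", "s", "l"}
    RETROFLEX = {"t": "ʈ", "d": "ɖ", "n": "ɳ", "s": "ʂ", "l": "ɭ"}

    def apply_retroflexion(segments, spreads=True):
        """Retroflex every coronal in a run following /r/ (if spreading is on)."""
        out = []
        trigger = False
        for seg in segments:
            if seg == "r":
                trigger = True     # /r/ arms the rule...
                out.append(seg)
            elif trigger and seg in CORONALS and spreads:
                out.append(RETROFLEX[seg])   # ...and it keeps spreading
            else:
                out.append(seg)
                trigger = False    # a non-coronal breaks the run
        return "".join(out)

    print(apply_retroflexion("barn"))                 # barɳ
    print(apply_retroflexion("barn", spreads=False))  # barn (Finland Swedish)
    ```

    The question above is then: where is the `spreads` flag stored, and how do speakers of the two standards set it so reliably?
    
    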

  124. Presumably Finnish-speaking Finns couldn’t do it, and Swedish-speaking Finns inherited the inability from them. Similarly, Finland-Swedish is tone group 0 because Finnish is toneless.

  125. Stefan Holm says:

    How far we have drifted apart semantically, John! None of your three meanings of ‘staff’ applies to the Swedish cognate ‘stav’. For us it means pole, as in pole vault (stavhopp) – the noble art in which I believe Sergey Bubka and Yelena Isinbayeva are still the world record holders. Close enough to ‘fighting stick’, though. ‘People working for an organization’ is in Swedish ‘stab’ – an obvious borrowing from German. (During my military service an eon ago I was a ‘stabsunderofficer’.)

    And Lars – what’s the big deal? That ‘rd’, ‘rl’, ‘rn’, ‘rs’ and ‘rt’ in dialects with a trilled ‘r’ develop into supradental/retroflex varieties is basically the common phenomenon of assimilation. It has occurred in most dialects of English, with Scottish (noRth of the boRder) as a clear exception. It follows a general rule: peripheral and linguistically isolated areas (like Finland and Scotland) tend to be more conservative, while central ones (exposed to influence from foreigners) tend to be more changeable in terms of phonetic simplification. Remember that in southern Sweden, where the guttural r is used, no assimilation is heard.

    So, to this meta-thread – there are universal Chomskyan linguistic rules – phonetic, morphological, grammatical and semantic. But they are all the time under attack by people who don’t know them.

  126. OK, we have two statements here:

    “Forward assimilation of +retroflex is a natural, general, universal phenomenon, and doesn’t need an explanation.”

    “Finns can’t do it, and therefore it isn’t in Finland Swedish.”

    How can both be true? That’s what I’ve been trying to ask. Are Finns genetically different from Swedes?

    Language standards differ in the types of assimilation present, and there must be some mechanism that allows most people to produce forms with exactly the assimilations that are accepted by the standard they are aspiring to, and no others.

    Can we agree on that? Or exactly where am I wrong?

  127. Stefan Holm says:

    You’re not wrong. Speakers of Swedish in Stockholm and other central parts of the kingdom were constantly exposed to foreigners, from the Hanseatic League onwards. Nobody, however, ever cared about the colony (it was a colony!) in the northeastern Baltic region. So they were never exposed to foreign pronunciation habits.

    Nothing could have reinforced this more than the separation of Finland from Sweden in 1809. A famous statement from a member of the (Swedish-speaking) upper class in the 19th c. is: Russians we don’t want to be, Swedes we can’t be – so let us be Finns! It took another hundred years or so until the real Finns (the Finno-Ugric-speaking majority of the population) came to rule their own country. I think the conservatism among my countrymen in Finland has something to do with these historic events.

  128. David Marjanović says:

    Lars: Try saying hat[z] or dog[s].

    …But perhaps not in isolation. While English doesn’t have word-final devoicing (…anymore), it does have utterance-final devoicing; in dogs before a pause, the voice usually gives out at some point during the g.

  129. Trond Engen says:

    Retroflex assimilation in Norwegian. This distribution has very little to do with foreign influence (note the southern tip with its long-lasting ties to Holland, and Hanseatic Bergen on the western coast). If anything, it correlates with the realization of word stress as a high (western/northern dialects) or low (eastern/central dialects) pitch.

  130. George Gibbard says:

    DM, I’m pretty sure the voicing gives out, if at all, during the /z/ for me. In fact I think I’m more likely to devoice initial than final /g/, etc. (e.g. “good dog”) which may be counter to a universal tendency. (long genuinely voiced d, by the way.)

  131. George Gibbard says:

    The US TV show Saturday Night Live used to have sketches about people from Chicago who were fans of the football team “Da Bear[s]”, with a voiceless plural marker (before a pause). The effect to me was that their pronunciation was decidedly not normal. My stepmother’s friends from Chicago certainly don’t talk that way (including her best friend whose father was from Vienna). However, it’s possible that this occurs in Chicago varieties with more German/Eastern European influence.

  132. George Gibbard says:

    It used to be said that Chicago was the second-largest city in Poland.

  133. George Gibbard says:

    Sort of relatedly, I once knew a girl from Omsk in Siberia, who I would try to practice my Russian with. Once I said a sentence which ended with the word ‘nail’ (gvozdʲ). I had been taught about Russian final devoicing, so I said [gvɔstʲ]. But she didn’t understand me, so I said “nail”, to which she said, “oh, [gvɔztʲ]!”

  134. David Marjanović says:

    I think I’m more likely to devoice initial than final /g/, etc. (e.g. “good dog”) which may be counter to a universal tendency.

    I think that’s a universal tendency in languages with aspirated consonants: voice is necessary for the contrast at the ends of words, where aspiration tends to be reduced or lost, but not necessary at the beginnings. Initial voicing of plosives is generally shaky in English, perhaps a bit less shaky in northern German (where aspiration seems to be a bit weaker), and completely lost in Icelandic (where aspiration is hardcore).

    (long genuinely voiced d, by the way.)

    A very salient feature of English for me – those kinds of German that have voiced plosives have syllable-final, not word-final, devoicing: assimilatory devoicing goes forward, not backward (Hausbau “house construction” has [sb̥], never [zb]), and I’ve heard Sydney [ˈzɪtni] and Simbabwe [zɪmˈbap͡vɛ].

    Polish, for what that’s worth, has word-final devoicing, so that might in principle explain Da Bear[s]. But what do I know about sportsball or about Chicago (which is also the Burgenland’s largest city, among no doubt many other such titles).

  135. Trond Engen says:

    To be precise, my linked map shows retroflex l, not assimilation, but regressive assimilation is a feature in most or all of that area.

  136. NYC must be the largest city in Israel by a factor of more than two, then. (The NYC metropolitan area, however, is second to Tel Aviv’s in the number of Jews.)

    I note that I never did state a counterexample to the proposed rule of English that nouns ending in a long syllable with [f] as a coda pluralize with [vz]. However, one does exist. Strife is a mass noun and so has no plural form. But the derivative loosestrife is the name of two species of flowers (Lysimachia vulgaris, yellow loosestrife, and Lythrum salicaria, purple loosestrife). As such, it is a count noun and has the regular plural loosestrifes, though there’s clearly some pressure to say loosestrife plants/flowers instead. The English form is a mistranslation of lysi-machia, says the OED.

  138. How about goofs?

  139. @John Cowan: I think the plural “strifes” is merely odd, not nonexistent. Google Ngrams has it declining more than tenfold since 1850.

  140. @JC: Didn’t you mention oafs and reliefs, though? There’s also fifes.

  141. Rodger C says:

    Tolkien would refer to oaves, but he was playing on its etymological identity with elves.

  142. Per the OED3, oaves appears as late as 1858, but oafs (spelled “ophs”) as early as 1640, so there’s no telling if the 1858 instance is an archaism or an analogy. The variant ouphs, with the MOUTH vowel, is used by Shakespeare in The Merry Wives of Windsor; the OED does not record the entirely obsolete variant auf(e) in the plural. All these words were first specialized to mean ‘elf’s child, changeling’ before shifting to their modern meanings ‘fool, clumsy person, rude and boorish person’.
