MCWHORTER ON PROTO-WORLD.

John McWhorter is a favorite here at LH and has come up repeatedly in my posts (most recently here); I was happy just now to run across an online course guide (pdf) of his lectures on “The Story of Human Language” for the Teaching Company (you can access the three parts separately here). The most interesting aspect to me was his take on the idea that we can use surviving languages, and the proto-languages we can reconstruct from them, to see back 100,000 or more years to find bits and pieces of the very first human language, conventionally called Proto-World. I personally consider this notion (associated with the names of Joseph Greenberg and Merritt Ruhlen) ludicrous on the face of it, appealing to those who are so enthusiastic about piercing the veil of time that they are willing to overlook the glaring problems (the prevalence of coincidence and the inevitability of sound change rendering forms unrecognizable after thousands of years, for two), but then I’m one of those stodgy Indo-Europeanists the partisans of the theory love to mock. McWhorter has a more nuanced take on it; while rejecting the theory in its strong form, he emphasizes the likelihood that there are regional groupings that can’t be strictly reconstructed but are nevertheless real:

IV. Final verdict.
A. Ruhlen’s point that comparative reconstruction is not the only way to show that languages have a common ancestor is valid in itself. He observes that linguists posited the Indo-European group long before Proto-Indo-European itself had been worked out by working backward from the languages. The similarities between language families are close enough that his point is likely valid for mega-groups, such as Amerind and Eurasiatic.
B. A question still remains, however, as to how realistic even this approach is for Proto-World. The issues could be resolved as more proto-languages are reconstructed, although work of this kind is done increasingly less by modern linguists, and for reasons we will see in later lectures, it may be entirely impossible to reconstruct protolanguages for many families.

He gives some great examples; to illustrate the point about sound change, for instance, he says: “Proto-Algonquian words have been recovered through comparative reconstruction; the word for winter, for example, was peponwi. But the word in Cheyenne that has developed from this root is aa’—because of gradual changes over just 1,500 years.” (He gives all the intermediate stages as well.) And he has very useful bibliographies after each section, with brief descriptions of each item, for instance:

Chafe, Wallace, and Jane Danielewicz. “Properties of Spoken and Written Language,” in Comprehending Oral and Written Language, ed. by Rosalind Horowitz and S. Jay Samuels, pp. 83–112. New York: Academic Press, 1987. This article illuminates in clear language the differences—often shocking—between how we actually talk and how language is artificially spruced up in even casual writing, showing that spoken language, despite its raggedness, has structure of its own.

His otherwise admirable populism leads him to give too much credence to people like Bill Bryson, but that’s a minor problem. This is a good resource to have.

Comments

  1. Steve Reilly says:

    “Greenberg has a more nuanced take on it;”
    Do you mean McWhorter?

  2. marie-lucie says:

    I think that McWhorter’s course must be based on his book The Power of Babel, which I greatly enjoyed in spite of some errors, and have recommended to non-linguist friends, because his enthusiasm and drive could spur a lot of interest in the structures, origins and development of languages. But McW is not himself a historical or comparative linguist (a person working in that area rather than seeing it from afar). For instance, when he brings up reconstructed peponwi and actual aa, even if he gives the reconstructed intermediate stages, he does not make it obvious that each of these intermediate stages is justified by sound correspondences occurring in large numbers of words across the Algonquian family.
    About the conclusions in A:
    Ruhlen’s point that comparative reconstruction is not the only way to show that languages have a common ancestor is valid in itself.
    Agreed, but comparative reconstruction is a shortcut phrase misunderstood by many people including some historical linguists (especially those working in languages with a long tradition of literacy, like Romance specialists who can draw on centuries of Latin writing as well as of writing in their language(s) of interest, showing how the languages changed through the centuries). It should be “reconstruction of the common ancestor using the ‘comparative method’ of proto-language reconstruction”: obviously, before reconstructing what must have been the common ancestor of two or more related languages, one should first have good reason to believe that those languages are indeed related. What the attempted reconstruction of the common ancestor, using the comparative techniques first perfected for Indo-European, does, is to confirm that the languages are indeed related, and more specifically, to allow linguists to determine which words of those languages are genetically related to each other and which ones are not. For instance, it can be proven not only that Latin, English, French and Armenian are related, and that the words pater, father, père, hayr are related to each other (and descend from the ancestral Proto-Indo-European), but that coffee, café, kahvi are not related to each other and their similarity comes from the fact that most of them are borrowed from each other and ultimately from a non-Indo-European word. It should be remembered that it took dozens of scholars more than a hundred years to work out most of the structure and vocabulary of Proto-Indo-European (and there are still points of contention on details), and that thus far there has been no other reconstruction of comparable scope (Finno-Ugric is a much smaller and more compact family, and although Malayo-Polynesian consists of hundreds of languages, most of them are very close to each other, more like dialects than separate languages). 
    If one starts comparison of two or more languages by trying to reconstruct a common ancestor without having a firm basis for considering those languages related, the result is unlikely to be conclusive. This has been a problem in trying to determine relationships between, for instance, the native languages of the Americas.
    He observes that linguists posited the Indo-European group long before Proto-Indo-European itself had been worked out by working backward from the languages.
    The languages which were originally presumed to be related (= descended from a common ancestor) were Latin, Greek and Sanskrit, because of the obvious and striking similarities in their noun and verb paradigms (complex declensions and conjugations) in addition to the similarities in their vocabularies which had been noticed quite a bit earlier but could not be satisfactorily explained. Languages which were more divergent in grammatical structure, vocabulary and/or sound systems were not immediately recognized as belonging to the same group, but added later after more detailed comparison of morphology and sound correspondences had been done (such were Armenian and the Celtic languages).
    The similarities between language families are close enough that his point is likely valid for mega-groups, such as Amerind and Eurasiatic.
    How to define “similarities between language families” is a sticky point. Ruhlen and his master Greenberg used techniques which most other linguists (myself included) consider extremely sloppy to define those “similarities”. The proposed mega-group “Amerind” was put together by Greenberg not through detailed, painstaking comparison (his vocabulary lists are riddled with errors of all kinds) but by accepting as valid every proposal that had been made for classifying Amerindian languages outside of the Eskimo and Athabaskan languages (which belong to families with very distinct structures, about which there has never been any controversy). Ruhlen’s books (one for linguists, one for the general public, which is led to believe that untrained persons can do better than the specialists) give an Amerind “family tree” which includes subgroupings which had already been thoroughly discredited. The classification of Native American languages (none of them with a long written tradition) is far from being established. The currently accepted classification differs minimally from that established by about 1865. Several reclassifications have been proposed, notably by Sapir and Swadesh, although none of them have been accepted and some of them contradict each other. It is virtually certain that there must be groups intermediate in size and complexity between the still official atomistic classification and the all-encompassing Amerind, but they will require much more painstaking work, as well as better training in comparative and reconstructive methods for those willing to undertake it (but the field has not been much encouraged in the past few decades).

  3. Do you mean McWhorter?
    Thanks, I’ve fixed it. Well spotted!

  4. And I agree, naturally, with everything m-l says.

  5. John Emerson says:

    Distinguishing the Inuit (etc.) and Athabaskan languages from the rest of them sounds like a real accomplishment, but dumping the rest in a residual class seems like the easy way out.
    My take on the proto-language going back 10,000+ years is that, with a couple of cycles of pidginization, creolization and then elaboration, plus a few migrations and the development of areal effects, the data would end up so thoroughly churned that too much information would be lost. The IE migrations are very recent in prehistoric terms, and as it is, it seems to me that if they were all unwritten languages with no ancient records, and if most of the intermediate languages had been lost, we might not guess that, e.g., Persian and English are related.

  6. This sentence:

    The most interesting aspect to me was his take on the idea that we can use surviving languages, and the proto-languages we can reconstruct from them, to see back 100,000 or more years to find bits and pieces of the very first human language, conventionally called Proto-World.

    seems to presuppose that all surviving languages descend from the first human language. Which seems plausible enough. But then doesn’t that make this sentence:

    Ruhlen’s point that comparative reconstruction is not the only way to show that languages have a common ancestor is valid in itself.

    a bit meaningless? I mean, if we’re already presupposing that all languages have a common ancestor, then it’s not meaningful to show that any two languages do. And it seems that you do need comparative reconstruction if you want to say anything more meaningful (such as determining, between languages A, B, and C, which two have the more recent common ancestor, or such as — as Marie-Lucie mentions — determining which words are inherited cognates).

    Or am I missing the point?

  7. marie-lucie says:

    JE: Distinguishing the Inuit (etc.) and Athabaskan languages from the rest of them sounds like a real accomplishment
    It might have been if those languages had been scattered all over the map of the Americas, and if the scattering had happened so long ago that the languages would have gotten mixed up with their neighbours, but Inuktitut and its relatives (such as Inupiaq and Yupik, all members of the “Eskimo(id)” family with the same structure and much common vocab) are pretty much strung all around the same circumpolar area, not mixed in with other languages, and the Athapaskan languages, although not as homogeneous geographically, are also very distinctive in structure and have much vocabulary in common. This is why their relatedness was recognized very early, in spite of some internal diversity within the family.
    but dumping the rest in a residual class seems like the easy way out.
    Exactly. This dumping has been hailed as a breakthrough only by people unfamiliar with the languages in question, or who thought that “since Greenberg did a good job classifying the languages of Africa [decades before], he must have done an equally good job in the Americas”, something that cannot be taken for granted. I don’t know about Africa, but he did a terrible job in the Americas. His latest offering was “Eurasiatic”, in which he included not only the languages of Europe but the Eskimo ones. Beautiful in the abstract, but not when you look at the details.
    Ran: I agree with you, and you are right about the matter of subclassification (which of A, B, C have the more recent common ancestor). I simplified matters for my comment above.
    About “Proto-World”, one of the problems of its proponents is that they are trying to “reconstruct words”, apparently not realizing that words must belong to a language which has a given structure. For instance, most Indo-European words are based on a CVC root, with optional prefixes and/or suffixes, and one would think that words ancestral to IE would not have root words larger than IE roots, yet some words are “reconstructed” (apparently by an eyeballing technique) which are much longer. Those words or roots must have been pronounced, and there have been studies of “universals” of sound patterns, which have discovered for instance that speech sounds are not distributed equally, and languages which use a certain sound must also have another more common sound (e.g. t or k are much more common than th or q, so a language which has th must also have t), or conversely, a language which does NOT include a certain sound, for instance k, is most unlikely to have the less common sound q. Proto-World fans don’t seem to take such things into consideration.
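The kind of sanity check described above can be sketched in a few lines of Python. This is only an illustration: the two implications come from the examples in the comment (th implies t, q implies k), the inventories are made up, and real sound-pattern universals are statistical tendencies rather than hard rules.

```python
# Implications taken from the examples above: having the rarer sound
# implies having the more common one. The names here are illustrative.
IMPLICATIONS = {"th": "t", "q": "k"}

def violations(inventory):
    """Return (rare, common) pairs where the rare sound occurs without the common one."""
    sounds = set(inventory)
    return [(rare, common) for rare, common in IMPLICATIONS.items()
            if rare in sounds and common not in sounds]

# Hypothetical sound inventories, not taken from any real reconstruction.
plausible = ["p", "t", "k", "th", "a", "i"]
suspect = ["p", "q", "th", "a", "i"]

print(violations(plausible))  # []
print(violations(suspect))    # [('th', 't'), ('q', 'k')]
```

A proposed proto-form whose sounds fail checks like these is not thereby disproved, but it would call for some explanation.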

  8. John Emerson says:

    As I understand it, the “Proto” method is to choose the most common items of vocabulary (water, man, hand, etc.; I don’t know the list) and see if a common proto-form can be found between two groups. It doesn’t sound impossible to me, except for the reason I gave above.

  9. Bill Walderman says:

    “Ruhlen’s point that comparative reconstruction is not the only way to show that languages have a common ancestor is valid in itself.”
    What are the other ways? It doesn’t strike me that anyone really “showed” that the IE languages are related before Bopp, although people may have speculated or voiced suspicions that the relationships existed.
    By the way, does anyone know what is going on with Edward Vajda’s hypothesis that the Yeniseic languages of central Siberia are genetically related to the Na-Dene languages of North America? When he presented this hypothesis about two years ago, it seemed to be taken seriously by at least some credible specialists in North American languages–not that it was taken as proven, but there seemed to be an acknowledgment that he had presented some plausible evidence. But I haven’t seen any follow-up. Is it simply that he’s now engaged in trying to work out the correspondences in more detail and nothing has been published yet?

  10. Ruhlen’s right that IE was posited long before it was proved, but then so were Ural-Altaic and Sino-Tai and lots of other groupings that were later invalidated. You can posit anything you want, and there’s no evidence that Greenberg-style mass comparison is any more reliable than plain old intuition as a source of fruitful hypotheses.
    In addition, even granting that all extant languages have a common ancestor (which is a separate question from whether it can be proved), it would not follow that the common ancestor was the first language ever spoken, any more than Mitochondrial Eve was the first female Homo sapiens who ever existed, merely because she is our nearest common ancestor (through female lines exclusively, to be sure, but languages have only one parent, not two).
    John Emerson: If we didn’t have records of their earlier stages, it would be obvious that the West Germanic languages (English, Scots, Frisian, Dutch, Low German, High German, Yiddish) are related, but we’d never be able to figure out exactly how because of all the cross-borrowing. The experiment has been tried by Don Ringe & Co., with hopelessly inconsistent results.
    Bill Walderman: Vajda’s paper isn’t due to be formally published until this March (as part of a single-topic journal article), and until then we can’t expect much in the way of a response. Furthermore, the number of scholars who are expert on both Yeniseian and Na-Dene can probably be counted on my fingers, so what’s probably going to happen is that historical linguists will accept the result because the result looks plausible and the methodology is convincing.
    Vajda has a lot more time for the utility of mass comparison as a discovery technique (as opposed to a proof technique) than the average historical linguist, but even he points out that although Ruhlen’s 1998 paper did get 8 cognates right in his version of Dene-Yeniseian, he also trawled up 28 chance resemblances. Not a very impressive hit rate, I’d say; and so much for Ruhlen’s remark “Two language families might share one or two accidental resemblances, but they would not share 36, so the only plausible explanation for these resemblances is common origin.” Ha. Ha ha.
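For anyone who wants the hit rate spelled out, here is the arithmetic as a minimal Python sketch. It uses only the two figures quoted above (8 cognates, 28 chance resemblances); the variable names are just for illustration.

```python
# Figures quoted above from Vajda's assessment of Ruhlen's 1998 paper.
true_cognates = 8
chance_resemblances = 28

total_proposed = true_cognates + chance_resemblances  # Ruhlen's "36"
hit_rate = true_cognates / total_proposed

print(f"{true_cognates} of {total_proposed} proposed matches held up ({hit_rate:.0%})")
# prints: 8 of 36 proposed matches held up (22%)
```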

  11. John Cowan: You can *posit* anything you want, and there’s no evidence that Greenberg-style mass comparison is any more reliable than plain old intuition as a source of fruitful hypotheses.
    That’s what I would have thought. marie-lucie writes: “obviously, before reconstructing what must have been the common ancestor of two or more related languages, one should first have good reason to believe that those languages are indeed related.”, but people will differ about what counts as a good reason.
    Proto-World
    There’s something dubious about this slide from proto-languages to proto-world. What is the point of introducing a realist theoretical framework here ? What does “world” have to do with the internal processing patterns (syntactic, phonemic) of specific languages or groups that are usually the object of linguistic studies ? Are we being asked once again to believe that words “refer” to “reality” ? Which “reality” ? Sub-atomic particle physics, or Morris dancing ? And even if we were being asked to believe such a thing, what could it contribute to speculation about the existence of a single Ur-Proto-Language ?
    but languages have only one parent, not two.
    I’m not sure what that means. I thought that creoles can have many “parents”. Is your claim an empirical fact, or a theoretical principle ? Of course these two things do not exist independently of each other. An empirical fact is a fact only within the observational and theoretical framework which makes it observable and expressible. And such a framework is usually “grounded” in what one takes to be empirical facts.
    To put it more generally, what is the theoretical background for a claim that all instances (languages, insects) of a certain type (language, insect) must be derived from a single instance (proto-language, proto-insect) of that type ? Is it absolutely inconceivable that – to put it rather wildly – sense and sound are separate things that sometimes have melded in certain ways to form languages ? Then languages would have two parents: sense and sound, semantics and phonemics.
    Otherwise, I don’t see how to account for the fact that one can make blog comments that can be read out loud, but seem to make no sense.

  12. The conceptual maneuver in “proto-world” can be made more explicit. The concept assumes we believe that there is “one reality”, so we must believe that there once was “one language” to deal with it. Talk about cheap tricks !

  13. Why is it that linguists are so “increasingly less” concerned with (orthodox) historical reconstructions of language families anyway?
    Surely it is one of the few demonstrably worthwhile activities the discipline has encompassed?

  14. I mean, if we’re already presupposing that all languages have a common ancestor
    The Proto-World people presuppose it; I certainly don’t. It seems to me a question, like that of the presence or otherwise of life in a galaxy beyond the cosmological horizon, that is impossible to decide.
    I’m not sure what that means.
    With respect, you need to study some historical linguistics; you can’t really expect to understand the principles and methods of a complex field based on general knowledge and a few comments here. Short answer: it’s complicated.
    Surely it is one of the few demonstrably worthwhile activities the discipline has encompassed?
    Yes, along with documentation of threatened languages. One of the reasons for my visceral dislike of Chomsky and his acolytes is their demolition of both those pillars of the field in favor of navel-gazing (excuse me, “theory”).

  15. Hat, what I said I didn’t understand was John Cowan’s remark

    but languages have only one parent, not two

    Now that doesn’t sound like a difficult statement. In fact it very much resembles

    presupposing that all languages have a common ancestor

    which you say you don’t hold with.
    John’s remark doesn’t sound difficult, only ambiguous. But whatever it means, I certainly don’t expect to be able to understand, without further study, the reasons why it holds.
    I see at least three ways to interpret John’s remark:
    1] By definition, each language is something that has only one parent
    2] As empirically determined, each language is something that has only one parent
    3] As empirically determined, the single parent that each language has (by 1] or 2]) is the same parent for all
    I don’t know what might be meant by “parent” in each case. Is this where deep study is required to acquire understanding ?

  16. linguist.in.hiding says:

    > Why is it that linguists are so “increasingly less” concerned with (orthodox) historical reconstructions of language families anyway?
    Well, there are many answers to that. First of all, I think your use of orthodox here seems to indicate that you consider this some kind of dogma. I also guess that you deny that. Anyway, just look at the discussion here. If there were generally valid alternatives, we would already be using them. Now, some supposed reasons:
    - There is much more to linguistics than that.
    - The rise of synchronic linguistics has left diachronic linguistics in its shadow.
    - There are as many studies in diachronic linguistics as before, or even more; you just don’t notice it. Are you an expert in the field? Do you read Journal of Indo-European Studies or Indogermanische Forschungen? Have you made comparisons of students and staff and their number of publications at the relevant institutions today vs. before?
    - Even within the relevant fields there are few spectacular show cases. The Yeniseic and Na-Dene thingie is an exception. If it turns out to be quite valid, great. The non-spectacular work “just doesn’t matter”.
    - Last, but not least, money. There are budget cuts and discontinuations of departments and institutions everywhere. This has gotta have an impact.
    > Surely it is one of the few demonstrably worthwhile activities the discipline has encompassed?
    Compared to what? But yes, I think it is a worthwhile activity. To the outside world, though, these activities are nothing like a cure for cancer. And there is nothing you can say when people are poisoned by a utility fetishism. Anyway, it also does not help the field that a whole lot of cranks present their “theories” continuously. Why, just a week ago I noticed someone presenting (once again) “evidence” that the Hungarians were the Huns…
    >> Surely it is one of the few demonstrably worthwhile activities the discipline has encompassed?
    > Yes, along with documentation of threatened languages. One of the reasons for my visceral dislike of Chomsky and his acolytes is their demolition of both those pillars of the field in favor of navel-gazing (excuse me, “theory”).
    I should not say this but other things have also led many linguists to lose their focus. Some have gone on to make linguistics that has no linguistics in it. The evaluation of language skills and self-image through language or what have you have become issues of linguistics. When there is no phonology, morphology, syntax or semantics (you might add something more, like pragmatics) involved, shouldn’t this line of work be done under sociology or anthropology (I just have never quite understood the predominantly American idea of anthropological linguistics) or whatever? That said, I wholeheartedly agree. Do we need yet another theory of the French liaison or more studies of VOT in English? But no, I already hear the voices telling me that your study of the use of comma in French is “just as important” as his studies in Ket morphology or her studies of Navajo verbs. Not that this hasn’t been discussed here or elsewhere before…

  17. “I mean, if we’re already presupposing that all languages have a common ancestor”
    “The Proto-World people presuppose it; I certainly don’t. It seems to me a question, like that of the presence or otherwise of life in a galaxy beyond the cosmological horizon, that is impossible to decide.”
    It seems to me to be reasonable to try to work out if language was invented/developed just once, or more than once; and reasonable to suggest that if we found a number of words in a sufficiently large number of language families that could all be shown to be from the same ultimate reconstructable root words, that would be evidence towards all languages having a single origin. But given the way languages change, that is likely to be a task not unlike trying to work out the original order of a deck of cards after the deck has been shuffled for 24 hours.

  18. John Emerson says:

    You can *posit* anything you want, and there’s no evidence that Greenberg-style mass comparison is any more reliable than plain old intuition as a source of fruitful hypotheses.
    This doesn’t strike me as a killing argument at all. Hypotheses come from somewhere, and they’re always unproven until they’re proven, by definition, and there’s usually a long period when they’re in limbo. A lot of the oppositional rhetoric seems to be offended by the very idea of what Ruhlen and Greenberg are doing. What little I’ve read seems to be working up the ladder pretty reasonably, finding commonalities between Finno-Ugric or Caucasian languages and IE languages. I’m more a skeptic than otherwise, but I’m glad that someone’s doing that stuff, and it still seems possible that it will come up with something interesting, though in such cases the original hypothesis is hardly ever confirmed unchanged.

  19. In fact it very much resembles
      presupposing that all languages have a common ancestor
    Actually, it doesn’t at all. The statement “languages have only one parent, not two” means that, say, English is descended from Old English (which is itself descended from Proto-Germanic, and that from PIE). It has a lot of input from French, but it does not make sense (in the terms commonly accepted by historical linguists) to say that both Old English and French are its parents. (In other words, your 1 is correct, although there are fringe cases where it might make sense to say that a language had two parents; I think I’ve written about them somewhere, but I can’t think how to use the search box to find them.)
    It seems to me to be reasonable to try to work out if language was invented/developed just once, or more than once
    Not sure what you mean. It’s an interesting subject to speculate about, but there’s no way to “work it out,” any more than there is to, using your own example, “work out the original order of a deck of cards after the deck has been shuffled for 24 hours.” In other words, you seem to have worked your way around to agreeing with me.
    What little I’ve read seems to be working up the ladder pretty reasonably, finding commonalities between Finno-Ugric or Caucasian languages and IE languages.
    Yes, and as McWhorter says (and I agree), that’s all very well, but it doesn’t make the idea of Proto-World any less crackpot, which is why all the hating on Ruhlen and Greenberg.

  20. I should not say this
    Oh, but I’m glad you did! Feel free to hide out here any time you want; you’re a linguist after my own heart.

  21. David Marjanović says:

    I’m not sure what that means. I thought that creoles can have many “parents”.

    Most don’t. A creole arises when a pidgin is passed on to children as their native language and they fill in whatever gaps there were in the vocabulary or grammar; and even most pidgins have a single parent.
    Most creoles are of less mixed origins than English.

    To put it more generally, what is the theoretical background for a claim that all instances (languages, insects) of a certain type (language, insect) must be derived from a single instance (proto-language, proto-insect) of that type ?

    If you think of insects or languages in terms of “type” and “instance”, you’re doing it hopelessly wrong. Typology is so 18th century. :-)
    Insecta is these days defined in some way like “the most recent common ancestor of bristletails, silverfish, dragonflies and flies, plus all its descendants”.
    There is no type. The actually observed diversity is the thing. Darwin, Mendel, Lotka & Volterra, Fisher, Gould – not Plato.

    Is it absolutely inconceivable that – to put it rather wildly – sense and sound are separate things that sometimes have melded in certain ways to form languages ? Then languages would have two parents: sense and sound, semantics and phonemics.

    It is dead obvious that all known life shares a single common ancestor (which, as is also obvious, was not the first living being – just like how Proto-World can hardly have been the first language) because all known life shares lots and lots of features, from the use of a cell membrane consisting of certain kinds of lipids through the use of DNA to detailed similarities in all manner of genes; in short, all known living beings share lots of features that they don’t need to share (for any functional reason), so common descent is by far the most parsimonious explanation.
    With languages it’s not that obvious – very few “language universals” have been proposed that don’t follow straight from human anatomy or similar constraints, and very few actually are universal, so it’s not known whether a language fundamentally unlike all known ones can even exist for any extended period of time. (For instance, the Klingon sound system is absolutely unlike any on Earth in terms of which sounds it lacks despite the presence of others, but, if left alone among humans for a few generations, this sound system would shift to an entirely unremarkable one.)
    Now let me repeat my standard lamentation about the method used by Greenberg and Ruhlen (mass comparison = multilateral comparison = probably a few more synonyms): it is not a phylogenetic method, it is a phenetic method. It quantifies overall similarity instead of counting shared innovations. This leads to the same results only if there is little enough homoplasy (convergence, reversals, borrowing…) in the data set, and that is part of what a phylogenetic analysis is meant to test. (It’s also, of course, a rather unreasonable assumption for most languages.) So, it’s probably good for generating hypotheses, but not at all for testing them.
    Of course, nowhere near enough testing is being done, basically for lack of money.
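The difference between scoring overall similarity and counting shared innovations can be shown with a toy Python sketch. Everything in it is an assumption made for illustration: three invented languages A, B and C, eight binary traits where 1 marks an innovation and 0 a retained ancestral state, and a "closest pair" rule standing in for a real analysis, which would be far more involved.

```python
from itertools import combinations

# Hypothetical data: 1 = innovated trait, 0 = retained ancestral trait.
# A and B share two innovations; A has also changed a great deal on its own.
traits = {
    "A": [1, 1, 1, 1, 1, 1, 1, 1],
    "B": [1, 1, 0, 0, 0, 0, 0, 0],
    "C": [0, 0, 0, 0, 0, 0, 0, 0],
}

def overall_similarity(x, y):
    """Phenetic score: count every matching trait, retained or innovated."""
    return sum(a == b for a, b in zip(x, y))

def shared_innovations(x, y):
    """Count only traits where both languages show the innovated state."""
    return sum(a == 1 and b == 1 for a, b in zip(x, y))

def closest_pair(score):
    """Return the pair of languages with the highest score."""
    return max(combinations(sorted(traits), 2),
               key=lambda pair: score(traits[pair[0]], traits[pair[1]]))

print("overall similarity groups:", closest_pair(overall_similarity))  # ('B', 'C')
print("shared innovations group: ", closest_pair(shared_innovations))  # ('A', 'B')
```

In this made-up case B and C look closest overall only because both are conservative, while the two innovations that A and B share point to a different grouping. The two scores agree only when such misleading matches are rare.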

  22. m-l: “increasingly less” was McWhorter’s phrase, not mine, which is one of the reasons it was in quotes. If he is mistaken so much the better.
    Meanwhile, “worthwhile” does not (at least to me) mean the same as useful.
    What beliefs you might have attributed to me on the basis of the term “orthodox” I can’t actually figure out, so I won’t bother denying having them, but I admit it wasn’t a particularly well-chosen adjective. “Good old-fashioned (except where modern techniques offer clear advantages)” would have been better.

  23. Charles Perry says:

    I intend to start a rumor that the Proto-World language was Na’vi.

  24. Charles Perry says:

    Oops. Did I say “start a rumor”? I meant “host a special on a cable science channel.”

  25. Grumbly, “Proto-World” simply means “proto-language for the (putative) family consisting of all the languages of the world”, and has no such implications as you give it. As for “languages have only one parent”, it is an empirical generalization over the languages we know, and as such has a handful of exceptions (less than 1% of currently spoken languages) of fairly well-known types.
    John Emerson: The problem with Greenberg et al. is not with what they are trying to do, but with how they are trying to do it. First of all, they are notoriously using incorrect data and acting like that doesn’t matter. You can’t just grab words out of dictionaries with no clue how they got into the dictionaries, or whether they really have the meanings you think they have. Second, they are using naive comparison, which is fine if you are trying to figure out which languages might be related, but is entirely unsuitable for (as Marie-Lucie says) confirming that relationship. The first idea of fish is that all the animals that swim are fish: this is later refined by removing whales, seals, etc., and finally confirmed by morphological studies and eventually DNA. Greenberg stops at the first step and announces that the work is done.
    Hat, I think you are overstating the case: descriptive linguistics of even well-known languages like English is a worthwhile activity. Consider all the work that went into the Cambridge Grammar of the English Language. Though not theory-driven, it is certainly theory-informed. Also, I don’t agree that Proto-World is crackpot in principle, although certainly Greenberg’s approach to it is wrong in principle.
    David M: Mass comparison is not even a phenetic method, because it gets the characters wrong. For proper linguistic cladistics based on vocabulary, only lexical innovations count as separate characters: mere sound-change does not create a new character. Greenberg’s method groups fruit with wax fruit. (On the phonological side, a single sound-change is too homoplasic to be counted as a character, though a sequence of sound-changes like Grimm’s Law can safely be treated as a character.)

  26. John Emerson says:

    Well, we all know what proto-world was. Dravidian. It’s just a matter of working out the deviations of the other language families.

  27. marie-lucie says:

    I agree with David M:
    Now let me repeat my standard lamentation about the method used by Greenberg and Ruhlen (mass comparison = multilateral comparison = probably a few more synonyms): it is not a phylogenetic method, it is a phenetic method. It quantifies overall similarity instead of counting shared innovations. This leads to the same results only if there is little enough homoplasy (convergence, reversals, borrowing…) in the data set, and that is part of what a phylogenetic analysis is meant to test. (It’s also, of course, a rather unreasonable assumption for most languages.) So, it’s probably good for generating hypotheses, but not at all for testing them.
    But Greenberg and Ruhlen seemed to think that “generating a hypothesis” is the same as “confirming its validity”, a major scientific error.
    In more linguistic terms: comparison of vocabulary (of course, admitting that it is well and carefully done, which was not the case with Greenberg in the Americas) will work well only if the languages are closely related. But in that case, there will also be close resemblances of structure: nobody can deny that English, Dutch and German are related, not just from the resemblances – including regular sound correspondences – in their basic vocabulary, but more to the point, in how they make and adapt words: their processes of verb formation, for instance. This is the type of comparison on which shallow classification (eg, as Germanic languages) and proto-language reconstruction (eg of Proto-Germanic) can be done. But for classification and proto-language reconstruction at a deeper level (eg between the Germanic and Indo-Iranian languages, where one has to work much harder to find resemblances), simple comparison of vocabulary is not enough, and if there are a few strikingly resemblant words, they are likely to be due to coincidence (a famous example is English bad = Persian bad, which sound and mean the same), or to borrowing, either from each other or from the same other language (like words meaning coffee, for instance).
    Before attempting proto-language reconstruction, one needs a good reason to think the languages are related (= likely to descend from the same proto-language).
    It is true that not everyone will agree on what qualifies as a good reason, but that reason is more likely to be in the realm of morphology (word structure) than of vocabulary, since in general morphology is the part of language that is most resistant to change: all IE languages, different as they are today, still use recognizable forms of the original verb “to be”. For another example, there are many French words which were adopted into English at different times for various historical and cultural reasons, but the formation of English verbs (including those of French origin) has not been affected by the importation of French verbs: to sauté or to debut take the ending “-ed” like to jump or to start, for instance. English verbs of whatever origin follow the Germanic type, while French verbs (including those of English origin such as bloguer) follow the Romance type, which derives from the Latin type. Where English vocabulary taken as a whole might suggest a Latin base for the language (and some non-historical linguists have suggested it), English verb-formation places English firmly in the Germanic family, and so does the formation of English noun-plurals (in both cases, a majority of words take suffixes, but a minority of them change their root vowels). Such parallel structural resemblances, occurring mostly in roots which correspond to each other in terms of sounds and general meaning, are vanishingly unlikely to be coincidences. This same reasoning was at the origin of Indo-European studies and Proto-Indo-European reconstruction, and later provided the basis for relating Hittite to the IE languages (either as another family within the group, or as a “sister” to PIE itself – opinions still seem to be divided about the exact type of relationship).
    Following these comparative principles, similar work has been done in other families such as Finno-Ugric and Semitic, and more recently Algonquian, Athabaskan and Mayan in the New World, although IE is still taken as the general model.
    But once “comparative grammars” of the various IE families were compiled (a major achievement of the XIXth century), most research in IE shifted to comparison of words, for instance in order to establish the common vocabulary of IE and to infer from this the type of society that must have spoken the language. So the work of comparison became more and more narrowly focused on words, and especially on the sounds of those words (eg “the development of Latin intervocalic -g- in Spanish dialect X”, for instance). This narrowing of focus was upsetting to Saussure, who wanted to remind his colleagues that after all they were dealing with entire languages, just like those spoken by them and around them. Besides the subsequent shift of focus of linguists’ interest to synchrony rather than diachrony, one consequence for historical linguistics has been a narrowing of interpretation of the comparative method, understood by many people as just comparison of vocabulary, with scant attention to morphology, even though close attention to morphology was indispensable in the early development of the method. This narrow focus is still the one pervading most comparative work in the languages of the Americas (which is also hampered by the lack of documentation available and the scant historical depth of the documentation that exists). Even if Greenberg claimed to be able to bypass the comparative method, his emphasis on vocabulary shows that he did not understand what the method was really about but did follow its most superficial definition.

  28. John Emerson says:

    First of all, they are notoriously using incorrect data and acting like that doesn’t matter.
    This is always and inevitably said about generalists by specialists. Often the method used is the simple heaping up of errors, without asking which errors are fatal and which will come out in the wash.

  29. marie-lucie says:

    des von bladet: you must have confused me with another commenter.

  30. descriptive linguistics of even well-known languages like English is a worthwhile activity
    Oh, sure, and I’m sorry if I gave the impression that I didn’t think it was. But when you say “descriptive linguistics,” I think of the Bloomfield school of grammatical description based on practical experience, not the theory-driven crap that replaced it. Being theory-informed is OK as long as you don’t let theory drive the car.
    I don’t agree that Proto-World is crackpot in principle
    I’m not sure we disagree. I’m not saying it’s crackpot “in principle”—in other words, if we had a time machine, it would be a perfectly sensible thing to investigate—just that to pretend to research it in the absence of a time machine (and therefore of useful evidence) is crackpottery.

  31. Don’t mind des, he has half his mind on Danish princessor and can’t pay full attention to the conversation.

  32. marie-lucie says:

    First of all, they are notoriously using incorrect data and acting like that doesn’t matter.
    Yes, they say that “errors cancel each other”. This may be true of measurement errors, where repeated measurements of the same item will have slight variations which can cancel each other to produce an average, but errors of fact no more cancel each other than spelling errors in a text cancel each other: they create the perception that the text is unreliable (at least in terms of spelling), and they can also lead to gross errors of interpretation in the case of homonyms.

  33. marie-lucie says:

    JC: The first idea of fish is that all the animals that swim are fish: this is later refined by removing whales, seals, etc., and finally confirmed by morphological studies and eventually DNA.
    I gave a historical linguistics lecture a few years ago entitled “Is the whale a fish?”, making those very points (great fun to do). For Greenberg and Ruhlen, the answer must be YES.

  34. It is true that not everyone will agree on what qualifies as a good reason, but that reason is more likely to be in the realm of morphology (word structure) than of vocabulary, since in general morphology is the part of language that is most resistant to change: all IE languages, different as they are today, still use recognizable forms of the original verb “to be”.

    But once “comparative grammars” of the various IE families were compiled (a major achievement of the XIXth century), most research in IE shifted to comparison of words, for instance in order to establish the common vocabulary of IE and to infer from this the type of society that must have spoken the language. So the work of comparison became more and more narrowly focused on words, and especially on the sounds of those words (eg “the development of Latin intervocalic -g- in Spanish dialect X”, for instance). This narrowing of focus was upsetting to Saussure, who wanted to remind his colleagues that after all they were dealing with entire languages, just like those spoken by them and around them. Besides the subsequent shift of focus of linguists’ interest to synchrony rather than diachrony, one consequence for historical linguistics has been a narrowing of interpretation of the comparative method, understood by many people as just comparison of vocabulary, with scant attention to morphology, even though close attention to morphology was indispensable in the early development of the method. This narrow focus is still the one pervading most comparative work in the languages of the Americas (which is also hampered by the lack of documentation available and the scant historical depth of the documentation that exists). Even if Greenberg claimed to be able to bypass the comparative method, his emphasis on vocabulary shows that he did not understand what the method was really about but did follow its most superficial definition.

    Again, absolute clarity from marie-lucie. I’m going on a hunger strike until such time as she collects all her contributions into a book.

  35. marie-lucie says:

    About Dene-Yeniseian:
    I heard Ed Vajda’s very first presentation of the idea some years ago, his presentation and workshop two years ago in Alaska with several top historical linguists present (all of whom praised his work) and his latest update last summer, in the presence of John Bengtson and George Starostin (who is continuing his father’s work in comparative Yeniseian). Vajda is extremely astute and hard-working and his work is being taken very seriously, even by some people who had dismissed it at first. The differences between his Yeniseian work and the Starostins’ mostly concern details of the proper reconstruction of Proto-Yeniseian, not the theory itself.
    JC: Vajda has a lot more time for the utility of mass comparison as a discovery technique (as opposed to a proof technique) than the average historical linguist,…
    I find it useful too for the same reason (I am a historical linguist, but probably not “average”, or rather not “mainstream”). “Mass comparison” does have its uses, and it does not have to be sloppily done.
    … but even he points out that although Ruhlen’s 1998 paper did get 8 cognates right in his version of Dene-Yeniseian, he also trawled up 28 chance resemblances. Not a very impressive hit rate, I’d say; and so much for Ruhlen’s remark “Two language families might share one or two accidental resemblances, but they would not share 36, so the only plausible explanation for these resemblances is common origin.” Ha. Ha ha.
    Yes, ha ha ha, because Ruhlen’s case rested ONLY on comparisons of individual words. And picking 36 words out of thousands existing in a language is hardly statistically significant.
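    The arithmetic behind the "hit rate" remark can be spelled out in a few lines. The two counts are the ones quoted above (8 cognates and 28 chance resemblances among Ruhlen's 36 proposed matches); the function name is made up for this sketch.

```python
def hit_rate(cognates: int, chance: int) -> float:
    """Share of proposed matches that turned out to be real cognates."""
    return cognates / (cognates + chance)


# Counts as quoted in the comment above.
COGNATES = 8
CHANCE = 28

rate = hit_rate(COGNATES, CHANCE)
print(f"{COGNATES + CHANCE} proposed, {rate:.0%} confirmed")  # 36 proposed, 22% confirmed
```

    So fewer than one in four of the proposed matches held up, which is the opposite of what the "they would not share 36" argument needs.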

  36. David Marjanović says:

    Hypotheses come from somewhere, and they’re always unproven until they’re proven, by definition, and there’s usually a long period when they’re in limbo.

    No, nothing is ever proven outside of math and formal logic. Only creationists and historical linguists ever use the term “unproven theory”.

  37. John Emerson says:

    Ruhlen’s argument is that certain very basic words are unlikely to be replaced entirely, but would only go through phonetic transformation and remain cognate. I forget what his list is. But it’s necessarily a short list, so questioning the statistical significance of a small number of words he uses is begging the question.

  38. m-l: Even without regard to comparative stability, morphological comparison is extremely powerful compared to simple lexical comparison: if we see that a regular plural noun suffix -j in one language corresponds to a regular plural noun suffix -g in another, we have at a single stroke established the correspondence between tens of thousands of words. But when morphological evidence is not available, phonological evidence alone can be sufficient for reconstruction (as in Proto-Tai, say).
    David: “Proved” in this context means “established to an extent that it would be perverse to withhold provisional assent”; thus the heliocentric theory is proved.
    John E.: Ruhlen will use any comparisons he can get. My point was that the claim that 36 correspondences can’t be coincidental turned out to be absurd, given that 28 of them are coincidental.

  39. Only creationists and historical linguists ever use the term “unproven theory”
    David: is this particular theory of yours proven or unproven? To avoid identifying yourself as a creationist or (not so bad, in comparison) historical linguist, you would have to say that your theory is proven. But you haven’t proved it. So if it’s true, it’s not provable. Gödel would have been proud of you.

  40. linguist.in.hiding says:

    Thanks Hat!
    > descriptive linguistics of even well-known languages like English is a worthwhile activity
    Why does it always come to this? Very well. Now, I am going to name names and eventually ask you a question.
    Look at the homepage of the MIT Department of Linguistics & Philosophy:
    http://web.mit.edu/linguistics/
    There you see:
    “August 2008: “Meter in Poetry”, by Nigel Fabb (MIT Ph.D. 1984) and Morris Halle was published by Cambridge University Press.”
    Now, isn’t that nice. Maybe this means more money and fame to Morris Halle, and Nigel Fabb. The homepage looks decent. I certainly want to study linguistics, and erm philosophy, there.
    When you have all seen enough of that wonderful place, it must surely take hours and hours, look at the homepage of the Department… Well, I don’t want to spoil it, not yet.
    First look at the homepage of a certain Melissa Axelrod:
    http://www.unm.edu/~axelrod/
    For example: “I have been involved in projects with the Jicarilla Apache community since 1996, in particular, as PI of the NSF-funded Dictionary of Jicarilla Apache project.”
    Seems interesting. The homepage looks so 90s though. There must not be that much money there. I really, really want to study, well whatever, in MIT!
    Now, the big surprise. I swear I didn’t know this beforehand. Look at the homepage of the Department of Linguistics at The University of New Mexico. Melissa works there (apparently at the University of Colorado, too):
    http://www.unm.edu/~linguist/
    “The Department of Linguistics at The University of New Mexico is proud to announce the Joseph H. Greenberg Fellowship” Read all about it there.
    Now, isn’t that nice. Maybe this means more money and fame to someone. I don’t know if you have to be accepted by Merritt Ruhlen. Anyway, it just seems a bit strange.
    Whatever. My question:
    Would you give money to “Meter in Poetry” or to “the Dictionary of Jicarilla Apache”? To Morris Halle or to Melissa Axelrod?

  41. all languages have a common ancestor
    Just to chime in… The hypothesis under discussion is that all spoken languages have a common ancestor. Signed languages have been invented a number of times across history, Nicaraguan Sign Language being a common, recent example.
    The Wikipedia article looks pretty in tune with what I’ve heard in classes. For a little about the typology of signed languages scroll down to the Classification section. There are some interesting links, too…

  42. David Marjanović says:

    Mass comparison is not even a phenetic method, because it gets the characters wrong. For proper linguistic cladistics based on vocabulary, only lexical innovations count as separate characters: mere sound-change does not create a new character. Greenberg’s method groups fruit with wax fruit. (On the phonological side, a single sound-change is too homoplasic to be counted as a character, though a sequence of sound-changes like Grimm’s Law can safely be treated as a character.)

    That’s not what “character” means in cladistics. An innovation is a change between two states of the same character; a character could then be a set of cognates or a phoneme or something.
    Grouping fruit with wax fruit because of their shape and color, when shape and color are counted as more characters than composition, is phenetic.
    “Too homoplastic to be treated as a character” is something that the analysis will tell you after you have treated it as a character. To make such decisions before running the analysis is often a bad idea.
    Also, your example is a matter of degree, not of kind. Grimm’s Law has parallels all over the planet, and the High German consonant shift is being repeated in Scouse (Liverpool English) right now – nothing is immune to convergence.

    thus the heliocentric theory is proved.

    That’s not a theory, it’s a fact. Theories explain facts instead.

    David: is this particular theory of yours proven or unproven?

    It’s not a theory. It’s a statement of fact, a measurement the precision of which you’re welcome to assess. :-)

  43. David: Excessive levels of hubris detected! Enter the escape pod now!

  44. One useful thing about detailed descriptive work on well-known languages is that it can help you spot gaps in grammars of less well known ones. One of my supervisors told me that whenever he was at a loss for what to write about in Hausa next, he would turn to Pullum et al’s grammar of English and flip through it until he found something he didn’t know how to say in Hausa. He ended up writing a 754-page reference grammar. In writing about Siwi and Kwarandzyey, I’ve found a work on the semantics of focus particles in basically just English and German quite helpful; it gave me a vocabulary to talk about the issues without having to reinvent the wheel. I’d be the first person to argue that we need a well-funded international effort to systematically produce detailed grammars and dictionaries of every language on earth, and that this should be the top priority of linguistics departments just about everywhere. But it’s not an either-or choice.

  45. Okay, I’ll bite on l.i.h.’s question, because I almost but not quite can see the connections, and because it probably looks better when puzzlement or naiveté over these matters is exhibited by a non-linguist. MIT website looks like the linguists have money already, their latest and greatest claim to fame is about English language poetry, probably not cutting edge from a linguist’s point of view, but as far as “more money and fame to Morris Halle, and Nigel Fabb”–Morris Halle is professor emeritis. Now where I come from, when a faculty member retires they are invited to join the Emeritis Club (pronounced em-er-EYE-tus, after the aches and pains retirees are plagued with), the motto of which is “You can put that on your resumé”. Also I see a certain Noam Chomsky is professor emeritus at MIT, maybe that determines the political outline for the whole department. So for some reason either the Greenberg family wasn’t interested in giving money to MIT linguistics (and philosophy) or MIT wasn’t open to receiving money from this particular source. Maybe something about strings attached or reputations.
    So now we have another institution, UNM, that appears to be cash-strapped but is doing some cutting-edge stuff with indigenous groups, and they get a fellowship (leading to “a full-time teaching assistantship”) from the Greenberg family. Maybe it means the department needs the money so badly they don’t care about the controversy, and maybe it means they can be coopted onto a path of non-rigorous methodology. Or maybe it means if you have money you can buy credibility for your theory, or at least a position for someone who will advance the theory from a position of authority. So is that how linguistics/(philosophy) departments work? Not that I ever see anything like that from my viewpoint at the devouree end of the adult education food chain.

  46. This thread is so interesting that I will take the liberty of jumping in, unless anyone objects.
    Grumbly Stu: I will gladly join you on your hunger strike until Marie-Lucie publishes her thoughts on historical linguistics in book form. To show that I mean it, I will herewith propose two observations/amendments to her contributions:
    1-In her first comment on this thread she points to the contrast between the “atomistic” official classification versus the “all-encompassing Amerind”. I think she is doing the official classification a bit of an injustice, inasmuch as some language families of the Americas (Algic, Oto-Manguean) have a time depth comparable to Indo-European.
    2-I think her comment of the shift from focus on morphology to focus on lexical items neglects the fact that the key discovery of nineteenth-century historical linguistics was the principle of the regularity of sound changes. This regularity is best-preserved in the lexicon: bound morphemes undergo a great many changes which perturb the effects of sound changes.
    Thus, it is indeed true that various Indo-European languages have a copula today which descends from the Indo-European one. But the various forms today have undergone massive analogical reshufflings/transformations which are often poorly understood. Thus it is unclear how Proto-Indo-European *HES-MI yielded Old Latin ESUM, unclear why this yielded Classical Latin SUM (we would expect a form *ERUM): in the transition from Latin to Romance SUM was modified in different ways in different parts of the Empire: French SUIS and Spanish SOY go back to a form *SUJO, Romanian SUNT is due to the influence of the third person plural form, which probably played a role in the genesis of Italian SONO…Let’s face it: reconstructing the Latin copula on the basis of Romance evidence would be difficult, if not impossible.
    Hence I suspect that the focus on vocabulary had more to do with the fact that this focus could yield far more precise results than the comparative study of morphology. The morphology certainly shows the relationship between Indo-European languages better than the lexicon does, but the latter can be reconstructed far more accurately.
    Thank you for your patience. Marie-Lucie, how’s the book coming? You wouldn’t let me and GS starve, now would you?

  47. She’s a kind woman, I’m sure she won’t.

  48. marie-lucie says:

    OK, Etienne, my comments:
    1 – I think she is doing the official classification a bit of an injustice, inasmuch as some language families of the Americas (Algic, Oto-Manguean) have a time depth comparable to Indo-European.
    Those larger language families are few and far between. As one goes West in and across the mountains, there seem to be more isolates (at least, official isolates), and this is not just because mountainous areas provide language “niches” or refuges because of the logistical difficulties of communication.
    About time depth, I am personally uncomfortable with time estimates which cannot be based on some reliable historical or archeological counterpart (this includes the estimates for PIE). I would prefer to say that those families have a degree of internal differentiation comparable to that of IE (if that is in fact the case – I don’t know enough of the details of those families to be more specific).
    2 – I think her comment of the shift from focus on morphology to focus on lexical items neglects the fact that the key discovery of nineteenth-century historical linguistics was the principle of the regularity of sound changes.
    We were (or at least I was) talking first of all about language classification and the fact that it needs to be done prior to attempting proto-language reconstruction. For language classification, morphological resemblances are a key factor. In IE studies, the principle of the regularity of sound changes came decades later than the morphologically-based realization that the various languages must have a common ancestor and the identification of the major branches of IE. This principle was indeed a key discovery for accurate proto-language reconstruction, but that is a different matter from the prior overall language classification.
    I agree with your last paragraph:
    Hence I suspect that the focus on vocabulary had more to do with the fact that this focus could yield far more precise results than the comparative study of morphology. The morphology certainly shows the relationship between Indo-European languages better than the lexicon does, but the latter can be reconstructed far more accurately.
    but accurate reconstruction had to be based first on the comparative study of morphology. People trying to do comparative work on poorly documented languages (for which no grammar has yet been written, for instance) armed with nothing else than word lists are heading for disaster if they do not take the trouble to look at morphological structures first (there are a number of examples in Native American studies). And morphology is not limited to affixal morphology: one can learn a great deal from non-affixal morphology such as reduplication and internal change, for instance. It is true that when it comes to fine detail, especially in very common forms such as those of the verb ‘to be’, or, for instance, Spanish “vuestra merced” becoming “usted”, there can be irregularities in some of the morphology, but one needs to look at the overall morphology of the languages to be compared. Irregularities can be a clue to the ancient character of some of this morphology.
    John Cowan: m-l: Even without regard to comparative stability, morphological comparison is extremely powerful compared to simple lexical comparison: if we see that a regular plural noun suffix -j in one language corresponds to a regular plural noun suffix -g in another, we have at a single stroke established the correspondence between tens of thousands of words.
    Absolutely! Such morphological correspondences also provide readymade phonological correspondences, especially when these correspondences if found outside of the morphological pattern might seem too strange to be even considered valid. The fact that morphological details apply across the board (to all or most words of the same category, for instance here noun plurals) is also one reason why morphological comparison will often take less time than lexical-phonological comparison, which can stretch indefinitely because of the sheer size of the vocabulary and also the semantic changes which can prevent the identification of cognates which no longer mean the same thing. If the morphologies of two languages which have many words in common are not compatible, then it is most likely that the languages are unrelated (or at least not closely related) but that considerable borrowing has taken place (as in French and English, for instance).
    But when morphological evidence is not available, phonological evidence alone can be sufficient for reconstruction (as in Proto-Tai, say).
    Of course, with languages that have lost a lot of their former morphology (or which give no sign of having ever had much to begin with), one has to fall back on lexical-phonological evidence alone, but that is not a reason to neglect morphological evidence for languages which do (or did) have substantial morphology.

  49. m-l: I did; a thousand apologies.
    Incidentally, on the question of fish: the common ancestor of humans (and whales) and herrings is considerably more recent than that of herrings and sharks: the folk category is more wrong than you might have thought. (Which I take to imply that you might as well include tasty whales in it after all.)

  50. marie-lucie says:

    des, about fish: this is where the analogy gets interesting, because placing whales and fish in the same category or not depends on the level of classification (and therefore the timing of evolution), just like placing two languages under the same classificatory node or not. But in the case of language, the time-depth is much shorter than that of animal evolution, and the “deepest” that has been achieved and can realistically be achieved in the present state of knowledge (eg possible “sisters” to IE) is far from what the Proto-Worldists would like to think.

  51. marie-lucie says:

    GS: I’m going on a hunger strike until such time as she collects all her contributions into a book.
    Etienne: Marie-Lucie, how’s the book coming? You wouldn’t let me and GS starve, now would you?
    LH: She’s a kind woman, I’m sure she won’t.
    You misunderstand me. Actually, I am ruthless. But I’ll let Grumbly and Etienne eat cake.

  52. Actually, I am ruthless.
    Uh-oh! Everybody head for the escape pods!

  53. John Emerson says:

    All carnivores are either dogs or cats, and walruses are dogs. Fact.

  54. But not all carnivores are carnivorans (whales, for example), nor are all carnivorans carnivores (panda bears, for example). And then there’s Cousin Itt.

  55. John Emerson says:

    Whales are fish, as Melville explained. And jackrabbits are not rabbits. And bunny rabbits are poultry.

  56. Continuing JE’s thought, sea lions are dogs while aardwolves are cats. And as far as I am concerned they and we are all fish. I am he as you are he as you are me and we are all together …

  57. That was me, and I meant JE’s walrus thought.
    IIRC, rabbits have cloven hooves only because somebody once mistranslated a hyrax.
    But they and we are all fish. You could look it up.

  58. Terry Collmann says:

    David Marjanović:
    “It is dead obvious that all known life shares a single common ancestor … because all known life shares lots and lots of features”
    Not necessarily: there’s evidence life may have evolved twice on Earth, and there are important details that are NOT shared by bacteria and archaea.

  59. Fish are foxes and whales are hedgehogs. Despite what Descartes may have thought, animals have got souls. That’s why the Romans called them animals.

  61. John Emerson says:

    No, he was wrong that humans have souls. Otherwise we wouldn’t be allowed to raise them for slaughter.

  62. Marie-Lucie:
    I hope you won’t lose your head because of your attitude. It’s nice that you agree with my point regarding morphology versus lexicon. This is because I have been struck by the tendency on the part of so many researchers to focus on the lexicon. This struck me as especially strange in the Americas, since most of its languages are morphologically so rich. Glad to see an Americanist agree.
    I was intrigued by your distinction between “isolates” and “official isolates”: strictly off the record, which isolates in the Americas do you think are in fact members of a more extended family, as opposed to “genuine” isolates?
    Back to your book: “Vuestra merced” to “usted” is not the best example of irregular change/reduction, since “usted” has actually been argued to be a straightforward loan from Arabic. Perhaps a better example would be the French subject pronouns JE, IL, NOUS, VOUS, ILS, not a single one of which is a phonologically regular reflex of its Latin ancestor.

  63. linguist.in.hiding says:

    > One useful thing about detailed descriptive work on well-known languages is that it can help you spot gaps in grammars of less well known ones. One of my supervisors told me that whenever he was at a loss for what to write about in Hausa next, he would turn to Pullum et al’s grammar of English and flip through it until he found something he didn’t know how to say in Hausa. He ended up writing a 754-page reference grammar. In writing about Siwi and Kwarandzyey, I’ve found a work on the semantics of focus particles in basically just English and German quite helpful; it gave me a vocabulary to talk about the issues without having to reinvent the wheel. I’d be the first person to argue that we need a well-funded international effort to systematically produce detailed grammars and dictionaries of every language on earth, and that this should be the top priority of linguistics departments just about everywhere. But it’s not an either-or choice.
    Fair enough. I thought what I would answer. I thought that I could say something nice about the book Morphology by Peter Matthews (“The examples are drawn from English and other European languages, both ancient and modern”, the well-known languages), yes I know it might not be avantgarde to be basing your views on linguistic issues on non-contemporary theories, I simply like the book and find the view on morphology expressed there quite satisfying. I might point out that many of the well-known languages do not give us that much information on how to interpret some, even important, linguistic phenomena. Here I was thinking especially about how to describe tones. I hope I can remember the details correctly here, it has been more than a decade when I read about this (yes, I could go through my article archive, I know this paper is there, I’m just lazy). There was (?maybe that is so even now in some circles) some dispute whether the tones should be described as a (simplified) contour or (discrete) levels. African languages were important in clearing the issue, not the well-known languages. I might say that the results of reinventing the wheel in linguistics are surely almost never satisfying: I might point to the Dulzonian school that serious linguists that I know secretly or openly abhor (I have also acquainted myself with some such material, and I must say I agree). A little about A. P. Dulzon and mostly about the Chulym Turk language (there is a link to LanguageHat there):
    http://lingsib.iea.ras.ru/en/languages/chulym.shtml
    I could say some more, but what is the point, I generally agree. Instead, I decided to write about why I personally don’t much care about people stressing the importance of linguistic work on well-known languages.
    The future linguistic work on well-known languages is usually already financially backed-up. Does anyone doubt that the linguistic work on English, Russian, Mandarin Chinese, French, German, Japanese, Spanish, Portuguese and so on is in any way threatened? Or will be in the foreseeable future? What about languages with a backing of a small national state, Dutch, Swedish, Cambodian and so on? Major local languages, like Tamil, Swiss German and so on? Of course there is a point where the linguistic work on a language, although somewhat assured, is not really that sure in the future, let us say like in the case of minor local languages, Basque, Sami languages, Tatar… Somewhere along the line there are languages with which the linguistic work must be taken up now, if it will ever be done. Let’s be honest, it is hard to get money for the linguistic work on minority and threatened languages. I know, I have myself tried to get funding for just that with a backing of a M.A. degree and a focus on one specific language. Granted, the language in question was being studied seriously by others, but not by that many. I also hadn’t published anything. This is, of course, a serious error on my part. Anyway, at that time only one other person was actually in a position to compete with me on research money (on working on that specific language), and he didn’t since he already had a secure financial backing. Anyway, after about two years I gave up. No money, no future as a linguist. A dear friend of mine had research funding to work on another, seriously threatened, language for years. During this time he became a leading expert on this language. There was another person on par with him but he died some years ago. In the last ten years he had had almost no funding apart from money in order to partake in a couple of conferences. He had a possibility to be part of a project. As a father of a small child he didn’t do that since they paid peanuts, and they even bargained about that! 
    He has since become a state bureaucrat. This is what some other linguist friends or acquaintances of mine have become: archive caretaker, manager of an ethnological museum, department head of a language institute (this might count as a linguistic job, although it does not count as his line of linguistic work), switched over to literature studies, taken up linguistic caretaking of a minority language backed up by the state in a linguistic institute, railroad worker, become unemployed. Let’s be clear, one or two of these persons has a PhD, and two are PhD students.
    You might take all this as whining, which it is, of course. The world is not fair to me and all that. Anyway, I also consider anyone who stresses the importance of detailed descriptive work on well-known languages as a bully since the continuation of financially backed-up linguistic work on well-known languages is more or less self-evident. Of course, there are other reasons (somewhat dependent on the reasons I mentioned but anyway) to not like the stressing of the importance of detailed descriptive work on well-known languages, and they are surely more relevant, but they are not so personal.

  64. Greenberg actually places great emphasis on morphological similarity and shared irregularity in his essay collection on method, Genetic Linguistics. When comparing all of a continent’s languages, though, he used whatever data he could get on each language, even though this was necessarily poor in many cases. Even if “no more reliable than plain old intuition” I’m glad that someone is willing to systematize and publish it for our curiosity. Is it supposed to be in bad taste to let nonspecialists even know the state of the art in educated guesses on classification?

  65. You might take all this as whining, which it is, of course.
    Well, that’s one way to describe it. I myself would call it a sobering description of a very unfortunate state of affairs. I wish that woman who left $100 million to Poetry had been a fan of the description of unpopular languages instead.

  66. Vajda makes a nice tripartite distinction in his excellent non-technical (well, no more technical than this audience can handle, I think) paper “Dene-Yenise[an] in past and future perspective”: the discovery of a possible language family, the establishing of convincing evidence of a language family, and the creation of a research program in which knowledge about the member languages is improved by referring to their relatives. Historical linguists tend to muddle these together because of the historical accident that Sir William Jones nailed all three at once in his discovery of Indo-European; but from now on, the three will probably be separated in time and space. This paper is also a fun read, what with the -46C room, the dead cockroaches, and the cold-shower exercise program (“Maybe everyone in Siberia was so healthy because everyone else was already dead. I resolved to stay in the first group.”)
    But his fully technical paper establishing Dene-Yenisean has this very beautiful and perceptive paragraph (broken into two here to make it more readable):

    Random similarities in basic vocabulary are insufficient to demonstrate language relatedness. A list of look-alike words can be compiled, even using basic vocabulary, between any human languages. Nor are typological similarities, even involving relatively uncommon traits such as a rigid prefixing verb structure, a reliable diagnostic for genetic relatedness in the absence of a system of cognate morphology. The only accepted way of demonstrating the existence of a language family is to identify a sufficient number of cognates in basic vocabulary to establish interlocking sound correspondences that are reflected in the language’s grammatical systems, as well [...]. All accepted language families share this combination of homologies to an extent that permits at least partial phonological and morphological reconstruction of an ancestral proto-language.
    Though [this is] generally not stressed by historical linguists, true evidence of genetic relationship also provides, by default, external comparative data useful for tracing the internal historical development of each member language or group of languages. Word lists or typological comparisons cannot be used in this way. [...] Haida comparisons have failed to shed any light whatsoever on the historical development of Athabaskan-Eyak-Tlingit, outside the realm of contact phenomena. The same could be said of the still undemonstrated Altaic Hypothesis, which is useless for understanding the internal structure of Modern Halh Mongolian. A Slavic linguist who refuses to accept Indo-European, on the other hand, would be more like a traveler who denies the existence of the automobile. Many facets of Slavic linguistic prehistory simply cannot be fully appraised without acknowledging the demonstrable relationship of Slavic to Baltic, Latin, Iranic, and its other Indo-European relatives. The unavoidable usefulness of a proven genetic connection between languages is the best confirmation of its validity.

  67. Thanks very much, John, that is indeed beautiful and perceptive, and I intend to quote it when the subject comes up.

  68. marie-lucie says:

    caffeind:
    Thank you for the link to Greenberg’s Genetic Linguistics, a work which appears to have been put together by his grateful students (not themselves historical linguists). Quoting from the blurb:
    … This book charts the progress of his subsequent work on language classification in Oceania, the Americas, and Eurasia, in which he proposed the language families Indo-Pacific, Amerind and Eurasiatic.
    These new “families” that he proposed have not been accepted or even taken as plausible and worthy of attention by specialists in the relevant languages. Amerind in particular has been vigorously attacked, with good reason.
    It shows how he established and deployed three fundamental principles:
    … that the most reliable evidence for genetic classification is the pairing of sound and meaning;
    This means studying words, where the pairing of sound and meaning occurs; that is, focusing on comparison of vocabulary. But, as mentioned above, vocabulary has been found to be unreliable as the primary source of comparative material for classification, as it is highly responsive to local conditions such as the natural environment, social upheavals and material and cultural changes. The sound correspondences of words are much more reliable for purposes of subclassification within an established group, and especially for reconstruction of the proto-language, because sounds tend to change across the board (ie an individual sound, if changing, will change in the same way in all the words where it occurs in a given position), but changes in meaning tend to affect individual words in a more random manner, blurring the links between words which are actually related.
    … that nonlinguistic evidence, such as skin colour or cultural traits, should be excluded from the analysis;
    This is a principle which should be obvious, although earlier work on the classification of African languages was sometimes marred by the use of such non-linguistic evidence, so it was good of Greenberg to insist on it.
    … and that the vocabulary and inflections of a very large number of languages should be simultaneously compared:
    This is mass comparison: there is nothing wrong with it in principle, but, at least in his later work (“Amerind” and “Eurasiatic”) (a) Greenberg did a very sloppy job of comparing vocabulary, and (b) the word “inflection” refers to part of the morphology, but he only paid lip service to this principle since he did not deal with it in much detail.
    The principle of mass comparison was formulated in opposition to the technique (common in work on the languages of the Americas) of focusing on shorter lists of frequent words, usually 100 or 200 (including words such as those for sun, water, major body parts, and other basic concepts considered unlikely to be borrowed between languages). As I mentioned earlier, related languages which have much of this basic vocabulary in common (barring a few differences of sounds and of meaning), let’s say over 50%, will also have much of their morphology in common, so that both vocabulary and morphology will correspond. But for languages which are very distantly related, only a few of those words might still be held in common, or be recognizable as such in spite of changes of sound or meaning having occurred since the separation of the languages, and they might be too few to be decisively attributable to relatedness rather than chance or borrowing or yet other non-systematic factors. A larger vocabulary therefore might catch more true correspondences of words. But one should pay much closer attention to the morphology (“inflections”) of those words than to the vocabulary list in order to determine whether the languages might indeed be related. (A word-list will mention both nouns and verbs, but will not give all the necessary information about the modifications undergone by nouns and verbs as they are used in sentences, eg with plural, tense, case, and similar grammatical information).
    I am glad that someone is willing to systematize and publish it for our curiosity.
    But G’s research on Amerind was not properly systematized.
    Is it supposed to be in bad taste to let nonspecialists even know the state of the art in educated guesses on classification?
    Of course not, but it is decidedly in poor taste to present as “state of the art” work which is technically inferior and does not even deserve the name of “educated guesses”, and to sling mud at one’s colleagues while leading nonspecialists to think that they could do a better job than those colleagues (this is on a par with letting your child think s/he beat you at dominoes or scrabble, when in fact you carefully refrained from making winning moves yourself and actually made deliberate losing ones).
    An example of the effect on nonspecialists: Greenberg and Ruhlen’s work on “Amerind” was greeted with some enthusiasm in France (where specialists in the relevant languages are not very many). At one point I looked up the French Google and found many amateurs ready to embark on their own comparative research, with comments such as: “Greenberg finds a root tal meaning [I forgot what it was], so that must be the root of the word Talmud and I am going to look for more cognates of this word” – but of course the root of Talmud is not tal but l-m-d, as any person even minimally familiar with Hebrew or another Semitic language could tell that reader. Amateurs who put forward their own theories of language relationships are encouraged by Ruhlen’s work, not to become familiar with some actual principles and techniques of comparative linguistics, but to go off on their own with absolutely no guidance, either about how to proceed, or how to evaluate competing claims. So the G/R approach does nothing to enlighten the public: just the opposite.

  69. marie-lucie says:

    John C, thank you for quoting Vajda’s papers. I am sure they will become classics. Listening to him speak is a delight. His website (at the U of Western Washington) had pictures of the Kets and their region when I last saw it, and probably still does.

  70. marie-lucie says:

    Etienne: I have been struck by the tendency on the part of so many researchers to focus on the lexicon. This struck me as especially strange in the Americas, since most of its languages are morphologically so rich.
    They focus on the lexicon because they have been told that classification should be done using “the comparative method”, and that method means finding cognates for reconstructing the common ancestor. After finding cognates, they take a look at the morphology just to make sure. Of course I think that they have got it backwards, and it only works because the languages they are looking at are very closely related. (“They” are a whole slew of Americanist American linguists – but not all).
    With this approach, several proto-languages have been reconstructed (but I am leery of reconstructions performed by just one person), at a very shallow level, for languages which are very obviously related (eg at a level comparable to Proto-Germanic or Proto-Romance, not Proto-Indo-European). One reconstructor declared that relatedness between languages is “either obvious, or forever unknowable”. With this attitude, no one would ever have even imagined Proto-Indo-European.
    I was intrigued by your distinction between “isolates” and “official isolates”: strictly off the record, which isolates in the Americas do you think are in fact members of a more extended family, as opposed to “genuine” isolates?
    Along the Pacific Coast, a number of languages were tentatively grouped by Sapir into the “Penutian phylum”, which has disappeared from modern reference works along with the other “phyla” which were meant as larger possible groups than the originally recognized families, but I think it is likely that most of the members of this hypothesized group, which are currently identified officially as isolates, are actually related. There are some proposals for smaller groups of two or three families within Sapir’s “phylum”.
    “Vuestra merced” to “usted” is not the best example of irregular change/reduction, since “usted” has actually been argued to be a straightforward loan from Arabic.
    This is news to me: reference please? There is at least one Arabist among us: what does he think?

  71. Marie-Lucie:
    The reference: [insert dramatic clash of cymbals here]
    Krotkoff, Georg. “A possible Arabic ingredient in the history of Spanish ‘usted’” ROMANCE PHILOLOGY 17 (1963), pp. 328-332.
    He derives USTED from Arabic USTAAD “master, teacher”. My impression is that the idea has not been as thoroughly explored as it should have been: since the ZEITGEIST is even more hostile today than it was a generation ago to the idea that Arabic influenced European languages this deeply, I somehow doubt the issue will be re-examined anytime soon (even without factoring in the general decline of philology/historical linguistics).

  72. marie-lucie, how are languages recognizable as genetically related – I mean: authentically genetically related – (either as ‘sisters’ or ‘mother’-’daughter(s)’), if not through “cognate” aspects (conjugal or declensional endings, for example)?
    Do you see the gist of this naive question? – There’s an interpretive circularity involving example and pattern that seems to require being entered at a particular point, at some specific (somehow) extant words exemplary of whatever syntactic categories they indicate morphologically. It seems to me that the order of investigation, before it can produce a successful explanation, must hazard a, well, guess (of sorts) as to the inward consistency of some resemblance, which would (in the order of explanation) ‘therefore’ be a “family” resemblance.
    I’m thinking of Ventris, but one could go to Jones, or other convincing ‘demonstrators’ (I don’t say: “provers” – not wanting to fall off a flat earth) of the genetic unity and coherence of language families.
    It seems to me that the great culpability of Greenberg and many of the other listers is not that they do list-n-compare, but rather that they do so with so little, and often so comically-except-when-destructively-convincingly-to-”amateurs” little, rigor.
    Nevertheless, some empirically compelled – but as-yet systematically undemonstrated – pattern, a morphology (if that’s the right word), would have to be hypothesized before it could be demonstrated convincingly. And this compulsion would have to come from phonetic ‘particles’ that are not ‘given’ as exemplary of this or that genetically related morphological feature of the compared languages – because the familial nature of languages is not “given”, but rather is discovered — first by having been ‘guessed’ or intuited or otherwise hazarded.

  73. from Arabic USTAAD “master, teacher”
    That would be أستاذ: “male professor”.

  74. Greenberg published the essays between 1953 and 2000 and discussed the choice of essays for Genetic Linguistics with the editor, William Croft, before his death.
    “Sound and meaning” is explained in chapters 2 and 3 (unfortunately only searchable and not browsable in the Google Books preview) to mean not just phonetic resemblances between isolated words, but relations that also have a morphological and/or syntactic component.

  75. Croft’s publication list also has a bio of Greenberg that Croft wrote for National Academy of Sciences.

  76. أستاذ, originally “teacher, craftsman” (itself from Persian) is phonetically a very plausible source for usted, but to be convinced I would want to see evidence that Andalusis used this title as a generic polite address, or that in old Spanish it was restricted to teachers or something.
    The emphasis on morphology as a proof for reconstruction makes more sense in some families than others. Songhay is a quite transparent family, but too isolating to reconstruct more than a handful of bound morphemes. On top of that, morphological change tends to be rather less regular than vocabulary change in general, and some kinds of bound morphemes are fairly easy to borrow (derivational morphemes and plural markers, especially.) What we need isn’t morphological comparison per se – it’s a clearer picture of what parts of a language are least likely to reflect external influence.

  77. John Emerson says:

    I have met Vajda at one of the few conferences I’ve attended, and can report that he is very pleasant and more than willing to respond to inquiries from non-professionals. (It may be that his area of linguistics, Paleo-Siberian basically, is so obsc*re to most people that it’s like some obsc*re c*mic book genre, where all those who care about it at all are one big brotherhood).
    I heard a talk about Nivkh/Gilyak, which is famously complicated and seems to be getting more complicated rather than less. I suggested contrary movements of simplification via pidginization and creolization, and on the other hand complication by isolation, where all speakers of a language are native speakers, as seems to be true of Nivkh. He was more than polite.

  78. John Emerson says:

    I must say that the shibboleths of the otherwise eminently sensible Hat are, to me, entirely unintelligible.

  79. Which ones, John E.?

  80. Trond Engen says:

    Nivkh/Gilyak [...] seems to be getting more complicated rather than less. I suggested contrary movements of simplification via pidginization and creolization, and on the other hand complication by isolation, where all speakers of a language are native speakers, as seems to be true of Nivkh. He was more than polite.
    This seems to be a popular hypothesis these days. In my wording:

    Stable, compact societies have the level of mutual agreement on conventions that allows building of complex morphologies, while dynamic, large societies constantly disagree on details and tend to develop alternative strategies.

    I first met it, at least clearly stated, in Guy Deutscher’s The Unfolding of Language a couple of years ago, and it made immediate sense to me. But I don’t think it’s anywhere near gaining universal acceptance. Nor should it without clear evidence.
    Anyway, the more interesting it is if one now can follow a small language in the process of building an increasingly complex morphology. (That’s also what I imagine might emerge when the dust settles around Everett’s Pirahã.)

  81. Which ones, John E.?
    My MT-Blacklist forced him to write “obsc*re c*mic.” I have now deleted “cure.com” from the list, so this particular absurdity will no longer be a problem.
    *shakes fist at spammers*

  82. Trond Engen says:

    … not to imply that this is in any way cutting edge linguistics. Or not. I have no idea where that edge is. Or was. Or is going to be. I’m a layman with no business in that area, so if I am, it’s purely accidental. I’m probably lost in the mist and about to fall off.

  83. Hmm. Looks like banning “cure\.com” would be the Right Thing. WP is interpreting the period as a regular-expression wildcard meaning “match any character here”, but preceding it with “\” (if that doesn’t work, try “\\”) should make it match only a literal period.
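
To make the wildcard point concrete, here is a minimal sketch in Python rather than in the blacklist plugin itself. It assumes the blacklist entries behave like ordinary regular expressions searched anywhere in the comment text; the variable names and sample strings are made up for illustration, and only the pattern “cure.com” comes from the thread.

```python
import re

# Assumption: the blacklist treats each entry as a regular expression and
# searches for it anywhere in the comment text.
unescaped = re.compile(r"cure.com")   # "." matches any single character
escaped = re.compile(r"cure\.com")    # "\." matches only a literal period

comment = "an obscure comic book genre"   # innocent text
spam = "visit cure.com today"             # hypothetical spam text

# "obsCURE COMic" contains "cure" + any character (a space) + "com",
# so the unescaped pattern flags the innocent comment.
print(bool(unescaped.search(comment)))  # True

# The escaped pattern needs an actual period, so the comment passes...
print(bool(escaped.search(comment)))    # False

# ...while the literal domain is still caught.
print(bool(escaped.search(spam)))       # True
```

Whether the plugin wants one backslash or two depends on how it stores and unescapes its entries, which is why the doubled form is offered as a fallback.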

  84. marie-lucie says:

    L: The emphasis on morphology as a proof for reconstruction makes more sense in some families than others.
    I emphasize morphology as a key element for classification, a different matter from reconstruction, and an indispensable preliminary in cases where some parts of speech (usually nouns and/or verbs) have many forms.
    some kinds of bound morphemes are fairly easy to borrow (derivational morphemes and plural markers, especially.)
    As I wrote above, morphology is not just individual morphemes: Arabic morphology, for instance, cannot be reduced to just a list of affixes. Arabic was spoken in Southern Spain as the politically dominant language for 800 years (far longer than French in England), and quite a number of Arabic words were borrowed into Spanish, but no part of Arabic morphology. In fact, most Arabic nouns were borrowed with their articles attached to them, even though the similarity of Arabic al- and Spanish el could have predicted that nouns would have been borrowed without the article.
    What we need isn’t morphological comparison per se –
    It depends what we need it for.
    … it’s a clearer picture of what parts of a language are least likely to reflect external influence.
    Such a picture cannot be obtained without detailed morphological comparison of what parts are attested to have been borrowed or not in many languages.

  85. John Emerson says:

    One of my conjectures here or elsewhere is that if you had a couple of cycles of first simplification (creolization) and then complication (morphologization), when the creolized language complicates (morphologizes) the second time, it wouldn’t return to the earlier morphology but develop a different, possibly very dissimilar one.
    Since IIRC creolized languages often are phonetically different than their parents (are they?) and since you can have changes like SVO –> SOV, it would seem that you could obliterate traces of the original language essentially entirely.

  86. M-L: It is certainly generally said that, as you put it earlier, “morphology is the part of language that is most resistant to change.” But if you know of a study (along the lines of the Loanword Typology Project, perhaps) giving the statistics across a reasonable range of languages for whether morphology really does change faster than (for example) basic vocabulary, I’d love to see it. As far as I know this is just an impressionistic rule of thumb – and, useful as it is in IE or Afro-Asiatic or North America, it isn’t very helpful in dealing with languages that never had much morphology, or that have lost it.
    I’ve always thought it was rather odd how little Arabic morphology entered Spanish. Berber hasn’t been in contact with Arabic for that much longer, and every variety has at the very least Arabic broken plurals (sometimes even extended to the occasional Berber noun); some have Arabic inflections on borrowed adjectives, and one not far from Spain (Ghomara) even has Arabic inflections on borrowed verbs. I suspect religious differences and links with the rest of the Romance world limited the effects of Arabic on Spanish, and (in areas under Spanish rule) encouraged speakers of more Arabised varieties to adjust towards the northerners’ norm.
    JE: That’s a perfect example of a situation where basic vocabulary would be preserved much better than morphology. The orthodox way to get around it is to claim, as do eg Thomason or Bakker, that creoles aren’t really descendants of their “lexifiers”, which strikes me as a rather counterintuitive definition of descent.

  87. During the Reconquista, in addition to conquest of populations who remained in place, northern populations expanded south, southern Christians emigrated north to the Christian-ruled areas or were “rescued” during raids into the Muslim areas (and would have been assimilated piecemeal into the north), and Muslims emigrated south. The last remnant, Granada, was largely Arabic-speaking by the end, and its population was ultimately expelled.
    Besides the religious divide, there is also that Berber is an Afro-Asiatic language that already had more similarity to Arabic than Spanish did, and may have been more permeable to influence to and from Arabic. It is striking that Arabic’s lasting expansion has been almost entirely into areas occupied by other Afro-Asiatic languages (Sudan is the only even partial exception that comes to mind) while strong Arabic-language presences in several Indo-European language areas did not survive.

  88. marie-lucie says:

    L: It is certainly generally said that, as you put it earlier, “morphology is the part of language that is most resistant to change.” But if you know of a study (along the lines of the Loanword Typology Project, perhaps) giving the statistics across a reasonable range of languages for whether morphology really does change faster than (for example) basic vocabulary, I’d love to see it.
    Is there a “not” missing from the second statement (does not)?
    As far as I know this is just an impressionistic rule of thumb – and, useful as it is in IE or Afro-Asiatic or North America, it isn’t very helpful in dealing with languages that never had much morphology, or that have lost it.
    Where did I not acknowledge that? One might as well say that tone is not very important because most languages do not make use of tone. It is important for the languages which do have it, and for comparison between those languages (and sometimes between them and others). Languages which have lost some affixes (eg English) may still preserve older formations, such as the internal vowel changes in some nouns and verbs.
    If you are working, for instance, in comparative Semitic, the fact that the languages have very similar, very distinctive morphology (not limited to affixation) means that you don’t have to waste too much time in trying to prove that the languages are actually related before you can do historical work, and the knowledge of the various types of word formation will save you from pitfalls such as analyzing “talmud” as “tal-mud”. My impression is that linguists working in the long-established families, such as Semitic or Indo-European or Algonquian, downplay the role of similar morphology because they can take it for granted in their own field, but where basic classification is still an issue, morphology has to be seriously considered as it provides the basic foundation for the rest of the work. I speak from personal experience in my own comparative work with Amerindian (not “Amerind”) languages.
    The problem with even “basic vocabulary” is that it, too, can change out of recognition, the words can change their meaning, and new words can be used for the same concepts. There was a study from Stanford a few years ago (I forget the author’s name) about the comparative usefulness of short “basic vocabulary” lists, citing quite a number of such lists in different languages. For instance, English and German would appear to be less close than they are because of the evolution in the meaning of actual cognates which are left out for this reason: the lists give “dog” instead of “hound” (the meaning of which is considered too specialized) as the equivalent of “Hund”, and “Kopf” instead of “Haupt” (even more specialized) as the equivalent of “head” (and even if “Haupt” were given, many observers would judge the equivalence dubious because of too much difference in the phonology). There can be no resemblant word for “animal” since “deer” is not the semantic equivalent of “Tier”, etc. On the other hand, basic morphological resemblances (adjective -er, -est; internal changes in nouns and verbs; similar suppletive elements as in good/better, gut/besser, and more) are much better indications of close relationship, because they are not the sort of meaningful elements for which speakers feel a need to be more expressive (for instance, words for “head” are notoriously prone to replacement by various metaphors) but just working parts of the language, the use of which is largely subconscious. Of course, for English and German there is enough other basic vocabulary since the languages are indeed very close, but if only 10% of the basic vocabulary looked similar one might seriously think that the lexical resemblances were due to other causes than relatedness, while the morphological resemblances would be decisive.
    caffeind: Besides the religious divide, there is also that Berber is an Afro-Asiatic language that already had more similarity to Arabic than Spanish did, and may have been more permeable to influence to and from Arabic.
    I entirely agree. It is easy to borrow some morphological elements from languages which have a similar structure to one’s own, and already have a significant number of cognates or resemblant forms.

  89. “faster” > “slower”.
    I agree with that; for comparative work to be taken seriously at all it should reflect as thorough a knowledge of the morphologies of the languages concerned as possible (and all the more so in North America, where so much of what elsewhere would be syntax seems to get morphologised!) And certainly morphology is often a great heuristic for finding relationships. My point is that Vajda’s criterion, “The only accepted way of demonstrating the existence of a language family is to identify a sufficient number of cognates in basic vocabulary to establish interlocking sound correspondences that are reflected in the language’s grammatical systems”, tends to unnecessarily exclude isolating languages.
    Relationship certainly makes borrowing easier, but Nahuatl is about as unrelated to Spanish as you can get, and it’s adopted the Spanish plural suffix. Likewise, some Greek Romani dialects have borrowed Turkish verbs complete with their inflections (and their current speakers don’t know Turkish!) The same goes for similar structure: Istro-Romanian has borrowed both the perfective-imperfective opposition and (some of) the morphemes that express it from Slavic, although this is one of the most obvious structural differences between Slavic and Romance.

  90. marie-lucie says:

    Etienne: “Vuestra merced” to “usted” is not the best example of irregular change/reduction, since “usted” has actually been argued to be a straightforward loan from Arabic.
    Etienne, I found your Krotkoff reference in a footnote to another article which points out that vuestra/vuesa merced (Sancho Panza uses the second form, which was also reduced to the alternate form vuced, which is also the source of Portuguese você) was only one of a number of common honorific formulae which also had developed reduced forms, such as vues(tr)a excelencia – vuecencia – ucencia, vues(tr)a señoría – vusía > usía. This suggests to me that usted might have been at best influenced by the Arabic word “ustaad”, glossed (probably in the medieval Spanish context) as ‘persona de gran estima social’, but that it is not derived from it (from the comments, and the title of Krotkoff’s own article, it seems that K also thought that the Arabic word was “a possible ingredient” in the evolution of the Spanish honorific, rather than “a straightforward loan from Arabic”).

  91. John Emerson says:

    L: Yes, that’s why I am moderately sympathetic to Ruhlen’s and Greenberg’s methods.

  92. marie-lucie says:

    JE, you are looking at R and G’s methods in the abstract, you have not (I suppose) looked through their books and found gross errors in languages that you yourself know or trust some of your friends and colleagues to know, in addition to general statements that a serious comparative/historical linguist could not possibly agree with. By “serious” I don’t necessarily mean “blindly following common presuppositions in the discipline”.

  93. marie-lucie says:

    L: in North America, where so much of what elsewhere would be syntax seems to get morphologised!
    This is true of some languages, but by no means all.

  94. JE is sympathetic to anyone who sticks it to The Man, academically speaking. I am sympathetic to his sympathy, but draw the line at people who show no concern for facts.

  95. John Emerson says:

    He only does it to annoy, Because he knows it teases.

  96. marie-lucie says:

    I know, I know! but I like to “dot the i’s”.

  97. linguist.in.hiding says:

    > My impression is that linguists working in the long-established families, such as Semitic or Indo-European or Algonquian, downplay the role of similar morphology because they can take it for granted in their own field, but where basic classification is still an issue, morphology has to be seriously considered as it provides the basic foundation for the rest of the work. I speak from personal experience in my own comparative work with Amerindian (not “Amerind”) languages.
    That may be for Amerindian. My personal experience is that linguists working with Indo-European, Uralic or Semitic diachrony stress the fact that morphology is important (if it exists, of course), even if they take it for granted. They stress the role of inflectional, conjugational and, sometimes even, derivational “formatives”, or morphological paradigms. They downplay the role of the lexicon, can’t say that I disagree. Morphology is important in diachrony even with languages nowadays virtually empty of it. Here I was thinking about Cambodian, I think they have reconstructed something of the former prefix system with what little “survived”. I don’t know but that may have played a role in establishing what are now considered the Mon-Khmer languages. I’m sure that you all know that all this also has a downside. Namely that the reconstructed morphological (or the larger grammatical) system is quite systematical and rigid. Funny exceptions are not (cannot be) reconstructed, and this is, of course, the curse of the method. There is no way to compensate for this (no, relying blindly on the lexicon like Greenberg is not the solution). That is (one reason) why the results are called reconstructions, not the real thing.
    > On the other hand, basic morphological resemblances (adjective -er, -est; internal changes in nouns and verbs; similar suppletive elements as in good/better, gut/besser, and more) are much better indications of close relationship, because they are not the sort of meaningful elements for which speakers feel a need to be more expressive (for instance, words for “head” are notoriously prone to replacement by various metaphors) but just working parts of the language, the use of which is largely subconscious.
    This was actually said to us students (and it was clear in what I have read about other disciplines). Maybe they didn’t stress the psychological aspect so much, after all we were students of linguistics, not psychology: during my student years and afterwards in the linguistic disciplines which I have read serious articles on I have seen a decline of (and a growing distaste for) speculative connections involving other disciplines.

    One thing to add.
    On the role of morphology (or, anything linguistic) vs. lexicon in a wider context, consider:
    http://itre.cis.upenn.edu/~myl/languagelog/archives/000256.html
    “Watson, of course, like just about every non-linguist who ever writes about language, presupposes that a language is just a big bag of words.”
    _I_ have considered just altogether abandoning, or at least paying no attention to, lexicology in my dark moments, “No, don’t talk to me about words! I’m sick of them!”. Anyway, the quote is telling.

  98. So many things to respond to, my apologies for the length…
    1) Marie-Lucie: despite the title, Krotkoff, as I recall, saw “usted” as a loanword pure and simple (the shift of Arabic long /a/ into Spanish /e/ he accounted for quite nicely). I was just an undergraduate when I read him, and he made (to my mind) a good case. I strongly recommend you take a look at the article itself: to repeat myself, I don’t think the thesis has been properly evaluated.
    2) John Emerson: you are quite right that cycles of pidginization/creolization completely eliminate inherited morphology, meaning that later morphology, created from originally free morphemes, does not correspond to the morphology found in pre-pidgin stages of the language.
    In Tok Pisin, for example, the only inflection the language originally had was a verbal ending -IM, (English HIM), used to indicate transitivity. Today, however, along with some borrowed English morphemes (-ING, -S), new morphological oppositions are being created language-internally: thus, the third person singular pronoun EM (also from English HIM) originally did not mark case, but today a variant, EN, is possible when it is the object of a preposition: morphology is thus being created, but the system (nominative EM, ‘prepositional’ EN) is quite alien to English (but the lexicon, on the other hand, is mostly English in origin).
    Caffeind: actually, in the case of North Africa it has been argued that the earlier presence of Punic made shift to Arabic much easier, so that you could replace your “Afro-Asiatic languages” by “Semitic languages”.
    All: one complicating factor that needs to be remembered when the borrowing of morphology is discussed is this: free morphemes are much more readily borrowed than bound morphemes. And free morphemes can turn into bound morphemes.
    Consider the “debate” surrounding Amerind: Greenberg claimed that the widely found bound pronominal forms (first person N-, second person M-) must have been inherited, because person-marking bound morphemes are seldom if ever borrowed. Yet nobody seems to have asked the obvious question: how do we know that these elements were bound morphemes when they were borrowed?
    Let’s imagine that shared “Amerind” features are remnants of a prehistoric “Sprachbund”, which included the diffusion of free pronouns *nV for first, and *mV for second person (Free/emphatic pronouns are much more readily borrowed than bound ones: various dialects of Malay, for example, have borrowed free pronouns from different sources, but I believe all make use of inherited Austronesian bound person-marking morphology only. And whereas monolingual English speakers occasionally use French “MOI”, even English-French bilinguals do not usually use French bound morphemes in their English).
    Couldn’t these originally free elements have grammaticized in various different languages once members of this hypothesized SPRACHBUND? Thereby giving the illusion that they are genetically related and that their (non-existent) common ancestor had bound first person N- and second-person M-…
    Just a thought.

  99. linguist.in.hiding says:

    Oh no!
    I wish I never wrote what I did! I assume the mass-comparativists are as much collectionaires as other people. If this is true they may put 2+2 together. I mean:
    > I’m sure that you all know that all this also has a downside. Namely that the reconstructed morphological (or the larger grammatical) system is quite systematical and rigid. Funny exceptions are not (cannot be) reconstructed, and this is, of course, the curse of the method. There is no way to compensate for this (no, relying blindly on the lexicon like Greenberg is not the solution). That is (one reason) why the results are called reconstructions, not the real thing.
    and
    > 2) John Emerson: you are quite right that cycles of pidginization/creolization completely eliminate inherited morphology, meaning that later morphology, created from originally free morphemes, does not correspond to the morphology found in pre-pidgin stages of the language.
    So, morphology is entirely irrelevant because all languages are just pidgins/creoles. They go through pidginization/creolization. Nothing of morphology survives anyway.
    So, trust blindly in the lexicon! Or, just make things up!

  100. >you could replace your “Afro-Asiatic languages” by “Semitic languages”
    Don’t forget Egyptian!
    >how do we know that these elements were bound morphemes when they were borrowed?
    I don’t think he’s arguing that they were necessarily originally bound, since he gives examples of them showing up in various positions relative to verbs and nouns. He does argue that N/M are dispersed around the Americas so that diffusion at a later date is unlikely and inheritance is the most parsimonious explanation. This does not exclude your Sprachbund scenario but would push it back towards the date of settlement.

  101. marie-lucie says:

    Etienne: despite the title, Krotkoff, as I recall, saw “usted” as a loanword pure and simple (the shift of Arabic long /a/ into Spanish /e/ he accounted for quite nicely). … . I strongly recommend you take a look at the article itself…
    I suspected that the title was probably a little more reticent than the author’s conclusion, but I will try more Google pages if I can’t find the journal itself.
    Consider the “debate” surrounding Amerind: Greenberg claimed that the widely found bound pronominal forms (first person N-, second person M-) must have been inherited, because person-marking bound morphemes are seldom if ever borrowed. Yet nobody seems to have asked the obvious question: how do we know that these elements were bound morphemes when they were borrowed?
    Nobody has asked “the obvious question” because the people who (unlike G) actually knew some of the relevant languages could see that G’s methods and conclusions could not be taken seriously.
    First, these pronominal forms are widely found but not at all universal in “Amerind”: some languages have them with the opposite or different meanings (eg M for 3rd person), some have only one of the two, not always with the relevant meaning, some have quite different consonants, such as K or T for 1st person.
    Second, they are not always bound (even to a pronominal base rather than just the verb). Even if they were always bound in the documented languages, it would be much more likely (as you say) that they had been free originally, like Latin “ego”, an independent word, becoming “je” in French, a loosely bound form (a clitic) since it can never be uttered in isolation in normal speech (as opposed to a linguistic discussion).
    Third, where these pronouns are bound, they sometimes appear as prefixes, other times as suffixes (which confirms the likelihood that they were not bound to begin with but became grammaticized in a given position according to the structure of individual languages). It is common in many languages to have different forms for subject and object (as in English he and him, or I and me), and in some “Amerind” languages I know there is preverbal M for 2nd person subject and suffixed -N for 2nd person object. Etcetera, etcetera. Greenberg’s supposed major clue falls apart when one looks at what actually happens in the languages.

  102. Yes, neither the morphology nor the basic vocabulary can be relied upon to be passed on forever. That would be one of the biggest reasons why the odds of reconstructing any proto-World are so small.
    But no, not all languages are pidgins/creoles (I do realise that was probably intentional overstatement.) For a creole to emerge you need quite a lot of people learning something as a second language but not very deeply, and adopting it as their community language. That’s not all that common even in recent centuries, and should be significantly less common in sparsely populated areas with no large-scale political structures – which describes most of human history.
    Incidentally, you can reconstruct some funny exceptions – any reconstruction of proto-Romance is going to have “to be” as a very irregular verb, although it would probably omit the irregularities of Latin “to want”. Of course internal reconstruction may end up ironing out exceptions that are in fact older than your proto-language, but that’s just another case of proto-languages’ tendency to be a bit temporally blurry.

  103. marie-lucie says:

    linguist-in-hiding: about the role of morphology:
    My personal experience is that linguists working with Indo-European, Uralic or Semitic diachrony stress the fact that morphology is important (if it exists, of course), even if they take it for granted. They stress the role of inflectional, conjugational and, sometimes even, derivational “formatives”, or morphological paradigms. They downplay the role of the lexicon, …
    On the other hand, consider these quotes from the first chapter of Philip Baldi’s The Foundations of Latin, a work of comparative/historical linguistics, regarding the comparative method:
    Work done on the IE languages during the nineteenth century continues to provide a methodological focal point for the conduct of studies of language relationships…. The models of description and the associated theoretical paradigms … contain at least the following central notions …
    (1) A significant percentage of cognates (words with a common origin) must be demonstrated in order to establish genetic affinity…
    Other principles deal mostly with phonological comparison (eg the regularity of sound-changes, exceptions, etc), but there is no mention of shared morphology until principle (5), which deals with the value of irregularities such as the forms of the verb to be. A couple of pages later the author goes on:
    In determining genetic relationship and reconstructing proto-forms using the comparative method, it is customary to start with vocabulary (especially “basic”). He gives three tables and more pages illustrating the comparative method, all of which deal with words, not with paradigms.
    Reading these descriptions from an acknowledged specialist, it is difficult not to conclude that the importance of the morphological foundation for comparative work is forgotten or at least taken for granted.

  104. linguist.in.hiding says:

    > But no, not all languages are pidgins/creoles (I do realise that was probably intentional overstatement.)
    Of course.
    > Incidentally, you can reconstruct some funny exceptions – any reconstruction of proto-Romance is going to have “to be” as a very irregular verb, although it would probably omit the irregularities of Latin “to want”.
    But I didn’t intend that as a funny exception. That is a normal exception (and, well, essential). One might say that it is no wonder that you have deviant patterns in common(er) “words”. It is the deviant patterns in rare(r) (well no, rare and “quite old” is a bit better; no word is old or young nor any language, and I feel ashamed to point this out here but there you go) “words” that usually cannot be reconstructed. All this is a bit relative, since there is cultural (or whatever…) interference there (some people speaking different languages keep other “words” since they are culturally important, while others don’t, and so on)… Unfortunately we don’t have many concordances of times past… I’m just struggling to make a bifurgative dividing line here.
    > Of course internal reconstruction may end up ironing out exceptions that are in fact older than your proto-language, but that’s just another case of proto-languages’ tendency to be a bit temporally blurry.
    “may end up”? “older than your proto-language”? Erm… You are acquainted with the internal weaknesses of the comparative method? And not “weaknesses” like those proposed by Nostraticists or whoever… Why only internal reconstruction (or is this an incidence with different senses of the “same” concept here)? This happens with reconstruction with any related languages. But yes, just another case. But is it really a goal to be temporal? Mine would be relational. If that coincides with temporal, great.

  105. AMERICANISTS IN DISPUTE; Lively Controversy Over Coining the Word “Amerind.” October 22, 1902
    A long dispute, which at times was somewhat heated and acrimonious, was engaged in by members of the International Congress of Americanists at the meeting in the American Museum of Natural History yesterday over the use by one of the speakers of the word “Amerind,” to designate collectively all of the Indians who live or once lived in the Western Hemisphere.

  106. marie-lucie says:

    So that is where “Amerindian” comes from! But Greenberg’s “Amerind” must be a back-formation from “Amerindian” and does not go back to the 1902 suggestion.

  107. Amerind has been in continuous use although not the most popular term. The 1902 article says it was already in use by 1/3 to 1/2 of professionals. It may well have declined in popularity later, but Greenberg was already active by mid-century.
    http://en.wikipedia.org/wiki/Amerind_(people) makes no reference to Greenberg or language at all. It references the 1902 article, though I found that directly from Google, not Wikipedia.
    Google Scholar search for Amerind prior to 1987 when Greenberg published Language in the Americas gives 2,220 results, few of them about language.

  108. William Shirley Fulton, an archaeologist, established the Amerind Foundation in 1937.

  109. marie-lucie says:

    Amateur linguistics: the post just before the Oggins one (Nov 11 2008, mentioned in yesterday’s Maar on Nabokov) deals with this very subject.

  110. John Emerson says:

    There’s a term for the simplification of a language resulting from use by non-native speakers: “creolization” (actually, I suppose it’s pidginization followed by creolization as the pidgin becomes a native language). If it is true that some languages, possibly starting from a creolized language, complicate themselves after long periods spoken only or mostly by native speakers, there should be a word for that process too.
    As I remember, glottochronology foundered on the facts that languages don’t diverge much if not separated (e.g., Portuguese and Spanish are about as near/far as they ever were), whereas pidgin–>creoles can pack a thousand years of “regular” language change into a few decades.

  111. marie-lucie says:

    Glottochronology: here is another method that deals exclusively with vocabulary, with all the potential sources of error inherent in that source. It is no longer used much by historical linguists, but modern versions include, for instance, DNA studies, even though population movements and language shift (eg by immigrants to other countries) complicate the problem of linking ethnic groups and language groups.
    pidgin–>creoles can pack a thousand years of “regular” language change into a few decades
    This refers not so much to vocabulary but to the increasing complexity of morphology and syntax. Also, relative to the original vocabulary (of the previously dominant language), there is usually simplification of the phonology as well (a common result of language shift by adult learners), in ways that can mimic changes which in a linguistically stable population might take centuries.

  112. Caffeind: yes, I had overlooked Egyptian/Coptic. Mea culpa.
    Marie-Lucie: I believe Johanna Nichols (Nichols & Peterson 1996) showed that the N- “first person”/ M- “second person” was indeed unusually common in parts of the Americas (Mesoamerica, Western North and Western South America). Which makes a diffusion explanation likelier than a genetic one, it seems to me.
    linguist.in.hiding: actually, if we ever found two or more genetically related languages whose respective morphologies could not be traced back to their Proto-language, but whose respective lexica could, then the simplest explanation would indeed be that the Proto-language was in fact an isolating language. Interestingly, to my knowledge no such language family is known to exist.
    John Emerson: the process whereby a creole becomes progressively less creole-like over time, through the influence of other languages as well as through language-internal innovations, is often (informally) referred to as “complexification”: “decreolization” is often used with the same meaning, but the latter term also can refer to the creole becoming ever-more influenced by, and thus growing ever-closer to, its source language (sometimes ending in language death).

  113. marie-lucie says:

    Etienne: I believe Johanna Nichols (Nichols & Peterson 1996) showed that the N- “first person”/ M- “second person” was indeed unusually common in parts of the Americas (Mesoamerica, Western North and Western South America). Which makes a diffusion explanation likelier than a genetic one, it seems to me.
    As you note, this pattern is common in parts of the Americas, not in the entire continent as Greenberg claimed. If the languages in question are unrelated, diffusion is a likely explanation, but there is still a possibility that the languages might be related (even though they cannot be equated with “Amerind”), although one could not jump to this conclusion on the basis of those two pronouns alone.
    To complicate the problem of the reliability of pronouns for comparative purposes, Johanna Nichols also found the N/M pattern in Austronesian, for instance. And the Proto-Worldists also point out that first person N (which is more widespread than N/M together) also exists in Basque and in (some) Caucasian languages.
    if we ever found two or more genetically related languages whose respective morphologies could not be traced back to their Proto-language, but whose respective lexica could,…
    Then we could not definitely say that they were genetically related. Instead, creolization might be the answer: similar vocabulary with very different morphology.

  114. the process whereby a creole becomes progressively less creole-like over time, through the influence of other languages as well as through language-internal innovations, is often (informally) referred to as “complexification”
    Etienne, I’m just wondering: are creoles often (informally) regarded as simple-minded ? Do the self-regarding persons in question (linguists ??), i.e. those who take a minute to regard themselves, often (informally) regard themselves as highly sophisticated, in contrast to the creoles they study ?
    Or is there some world-view at work here, according to which absolutely everything tends to complexification over time – from eggplants to societies ? Hmmm … I wonder where the entropy goes ?
    Why should “other languages” and language-internal innovation be primary factors in such “decreolization” ? Don’t you think that larger-scale complexification in the environment (society) would have to accompany such change (not necessarily preceding it) ? If the “other languages” were themselves creoles, what could one imagine as the reasons for complexification of each creole within itself, with respect to the others ? Are non-creole languages stronger than creoles, in some sense ? Is this complexification business essentially a linguistic phenomenon, or perhaps more a matter of political and economic predominance ?
    It does seem to me that you are postulating a tendency to complexification tout court. It just happens, like Topsy jes’ grows. But then I don’t see why creoles should be particularly liable to it.

  115. Creoles by nature start out as simplified languages, being derived from pidgins, so it is natural for them to develop greater complexity over time.

  116. Are there no examples of languages becoming “simpler” over time: losing syntactic distinctions, morphological characteristics etc (assuming that these things are examples of what you are calling “complex”)? Are you saying that some languages go from “simple” to “complex” (creoles, for instance), while others go from “complex” to “simple” ? How could one account for the origins of a language that was “complex” in the past ?
    Is the distinction simple/complex useful in any non-subjective way, or is it just one that people are accustomed to deploying ?

  117. John Emerson says:

    By what’s been said above, complex societies (or very large societies, e.g. empires) often have less-complex languages, whereas simpler (or more local and less extensive) societies have more complex languages. The pidginization / creolization processes involved in teaching the language to non-native speakers are the reason (if this theory is true).
    I used to have a few books on pidgins and creoles, which I sold in 1982 and still miss to this day, and I noticed several traits of Chinese, including and maybe especially classical Chinese, which were pidgin-like or creole-like. Lack of inflection, word-order governing the grammatical interpretation of words, absence of articles, and the use of verbs as prepositions. (For example, a Chinese student in English once said “Let me with you” for “Let me accompany you”. The Chinese word for “with”, “gen”, can be used as a verb that way.)
    Complexity doesn’t mean the ability to expr*ss c*mplex ideas at all, but just certain kinds of structural complications and irregularities, most notably (if you ask me) the accursed and loathsome German noun declensions.
    Some educated Chinese fluent in English still sometimes revert to Chinese grammar and produce a seeming pidgin. In my opinion they do this because they think that English grammar is stupid and useless, and I pretty much agree with them. (But even more so on the German noun declensions).

  118. expr*ss c*mplex
    Sigh. I’ve removed “express.com” from the blacklist.

  119. Hey, JE, lay off German noun declensions !! Nothing could be easier – among things teutonophone, that is. There are a bunch of “irregularities” you have to learn by rote, but then English itself has enough of those – and they haven’t pizened your personality at all, right ?
    What I would have expected you to be griping about is “gender”. But I would suggest that gender is always a bitch, in every walk of life.

  120. No offense to any genders that may be skulking in the shadows, and that I haven’t noticed.

  121. marie-lucie says:

    Gumbly: there is no “optimum” type of language, but there are tendencies which seem to compensate each other, so that simplicity in one area tends to foster complexity in another. For instance, the English verb lost some morphological characteristics in centuries past and did not develop new ones, but instead it developed “periphrastic” forms using several words (eg I would be talking where French would use Je parlerais). For another example, Polynesian languages such as Hawaiian tend to have very few consonants and vowels, but they also tend to have very long words, while English has lots of consonants and vowels and can get away with “four-letter words” (the typical one-syllable words, such as back or hand).
    Social factors are very important in studying language evolution. One well-known tendency is that in a culturally important centre (often a capital city) which attracts people from all over the place, interaction between those diverse people tends to even out speech differences, most obviously in pronunciation, but also in vocabulary and syntax, since people have to make sure others understand them and therefore avoid the characteristics of their own speech which prevent communication with their new neighbours as well as mark them as outsiders. The result is often a simplification: for instance, over my lifetime the vowels of Standard French (as described in some newer works) have been reduced in number because of the loss of some earlier distinctions which were not made in all parts of France, so they melted away in the Parisian melting-pot.
    On the other hand, where simplification results in the obliteration of differences which are functionally important, new complexity often arises, in a different way. An example of new complexity following simplification is the loss of thou, thee in Standard English as you became virtually universal in the singular, but in many places the ambiguity of you has spawned yous, you all, you guys, you people, etc. None of these unambiguously plural versions is currently standard, but it is quite likely that one of these (if not yet another one) will eventually be accepted. Fast forward a few centuries, and it is also likely that a two-word sequence will have coalesced into one, like “y’all” already did for you all.
    A pidgin is a simplified language used as an additional language for communicating in a limited way with members of one or more different speech communities: usually for trading, or following orders, but not for having long involved discussions. When circumstances arise such that the pidgin becomes the language of a whole community, including all ages (as in Tok Pisin or Haitian creole), the pidgin typically becomes both simpler (eg collapsing two words in one) and more complex, as adult speakers have to find ways to express everything they were used to expressing in their own languages before, and children, being much more creative than the adults, and also unfettered by an earlier language that they themselves have never learned, also invent conventional ways of expression with whatever resources the pidgin offers.

  122. David Marjanović says:

    there’s evidence life may have evolved twice on Earth, and there are important details that are NOT shared by bacteria and archaea.

    The New Scientist article you link to fails to provide any evidence for this except this; it only gives more detail to how we should imagine LUCA*. New Scientist isn’t trustworthy, it tries to make everything as newsworthy as possible and commonly overdoes it. I mean, an article ending with

    Many details have yet to be filled in, and it may never be possible to prove beyond any doubt that life evolved by this mechanism. The evidence, however, is growing. This scenario matches the known properties of all life on Earth, is energetically plausible – and returns Mitchell’s great theory to its rightful place at the very centre of biology.

    isn’t science. It’s sickening!
    The link I cite asks if DNA replication evolved twice independently. Well, there’s the idea that LUCA used RNA instead of DNA, and DNA was introduced to cells twice independently by two different DNA viruses. But if that’s true, that changes a lot less than one might imagine.
    * The Last Universal Common Ancestor.

    Hey, JE, lay off German noun declensions !! Nothing could be easier -

    They’re mostly outsourced to the article anyway. Adjective declension, now there’s something for you!!!
    der gute
    ein guter
    guter

    dem guten
    einem guten
    gutem

    It takes into account gender, number and case, as well as the presence of an article that takes some of the case-marking load off. And then most of the endings are -n and -m, which often assimilate (to varying degrees) to the consonant at the start of the next word (if there is one). There are examples that regularly throw native speakers into confusion.

  123. Marie-Lucie: If we found two genetically related languages whose vocabularies can be traced back to a proto-language, but whose morphologies can’t, creolization would indeed be a possible explanation.
    We could assume one of the two languages to have been creolized (and hence lost the morphology of its ancestor) and to have subsequently created new morphology. Or perhaps both were, with neither preserving any of the morphology of their common ancestor. Or perhaps the proto-language was an isolating language (and might or might not owe said isolating structure to creolization), and both languages simply created different morphologies out of inherited (and, in both daughter languages, different) free morphemes.
    Of course, another possibility would be an isolating proto-language whose daughter languages grammaticized the same set of inherited free morphemes, thereby making it seem that the proto-language had bound morphemes. Some scholars have claimed that this is the case with Early Indo-European, and indeed there are tantalizing clues that Early Indo-European may have been a pidgin (I can list a couple of those clues if anyone is interested. Kind of an optimistic take –I’m assuming there are still readers…).
    Grumbly Stu: the key to understanding why creoles, unlike other languages, are genuinely simplified languages is their pidgin past. Pidgins are created by adults for communication: and adults, unlike children, are terrible language learners. Once a pidgin has become a creole, this creole, being learned generation after generation by children, can keep gaining more and more complexity (language-internally, through contact, or both), which subsequent generations of children will acquire, just in the same way that children acquire the complexities of non-creole languages. Given enough time a creole will become just as complex as any other language.
    But short of pidginization, there is not much of a reason for linguistic complexity to be systematically eliminated (children being such good language learners). Some things are created, some are lost: some of the former are complex, as are some of the latter. But again, because children can acquire complexity (without conscious effort), there is no diachronic trend whereby more complex structures are more liable to loss than simpler ones.
    Hence such things as Arabic broken plurals, English strong verbs, Russian mobile stress, the three genders of Modern Greek…all of these linguistic complexities are thousands of years old (in the case of the last three languages they all go back to late Indo-European itself).
    (Apologies for the length, again, but this time I’ve an excuse: I’m working -with a colleague- on a book on diachronic creole linguistics, so this subject is close to my heart).

  124. David Marjanović says:

    But if that’s true, that changes a lot less than one might imagine.

    It’s a bit like papyrus vs parchment, I forgot to add.

    An example of new complexity following simplification is

    Another, presumably the textbook example, is the Western Romance future tense, composed of the infinitive (with its ending) and a suffix that once was the fully conjugated present tense of “to have”. The Classical Latin future similarly consisted of single words, but they were constructed in totally different ways and have, AFAIK, left no trace in the modern languages.

  125. I never found the German adjectival declensions terribly difficult. When I was learning them I simply followed three general rules: (1) cases range from the lightly inflected to the heavily inflected (nominative – accusative – dative and genitive), (2) genders range from the easily inflected to the more resistant (masculine – neuter – feminine) (3) if the article shows the inflection clearly, the adjective doesn’t need to (der gute, einer gute). It helped me remember them without too much effort, and even now I could still probably get it right most of the time.

  126. David Marjanović says:

    there are tantalizing clues that Early Indo-European may have been a pidgin (I can list a couple of those clues if anyone is interested.

    Yes, please!

  127. Sorry, that should have been ein guter.

  128. David Marjanović says:

    The mention of broken plurals reminds me of the greatest blog post headline ever.

    einer gute

    Wrong. So wrong, in fact, that I can’t even tell what you were aiming at. :-)

  129. Nichols and Peterson’s dataset apparently consisted of 230 languages worldwide (map) which would be a much smaller number of American languages than Greenberg looked at. Even so, they show clusters in western North America, southern Mesoamerica, and southern South America. In Language Diversity in Space and Time, Nichols arbitrarily selected one language per stock as a representative, so maybe that is what is happening here. 25 languages of their sample of 230 have paradigmatic N-M pronouns; one is in Africa, one in New Guinea, and the rest in the Americas.

  130. marie-lucie says:

    caffeind: Nichols and Peterson’s dataset apparently consisted of 230 languages worldwide (map) which would be a much smaller number of American languages than Greenberg looked at. … Nichols arbitrarily selected one language per stock as a representative…
    Nichols tried to be representative in her work, or at least to spread her sample more or less evenly over the world. Greenberg just took pretty much what was available in the Stanford library, so for some “families” he quotes several very closely related languages, for some others only one or none (and the data are not equally reliable for all). This skews the data, since languages are disproportionately represented. It is as if one took as representative of European languages, for instance: 3 Basque dialects, Portuguese, 4 or 5 Italian dialects, Irish, Czech, Russian, Belorussian and Ukrainian, plus Greek and Maltese, and then proclaimed a single “Europoid” family with a common ancestor.
    Back to Nichols, “one language per stock” in the Americas would be a good idea, but it turns out to be misleading also (though not her fault), since it follows the current “mainstream” classification, but at least along the West Coast some families currently classified as isolates will most probably be grouped together into a “stock”, so the number of “stocks” currently recognized is much greater than it will eventually turn out to be. The regional clusters with the N/M pattern are indeed in the Americas, but they are not scattered randomly throughout the continent.

  131. Reading more about Nichols and Peterson’s position:
    N&P: n:m is not enough to prove or strongly suggest genetic relatedness, but its distribution can hardly be due to universals or chance.
    N: m-T and n-m clusters must each result from some historical event, connection, relationship, etc. We can’t determine what that historical situation was: descent? areality? spread of a sound-symbolic canon?
    Non-universality of the N-M pattern in the Americas doesn’t seem a strong counterargument – we would expect some languages to evolve away from it over time. On the other hand, its presence up and down the Americas seems to indicate some connection, there is no evidence for either mass migration or elite-dominant migration between North and South America, and a small migration would predominate only on initial settlement.

  132. Wrong. So wrong, in fact, that I can’t even tell what you were aiming at. :-)
    Yes, and I’m not sure what I thought I was copying from when I typed that :) I can only plead that I have a bad cold, if that makes any difference.

  133. It is as if one took as representative of European languages, for instance: 3 Basque dialects… and then proclaimed a single “Europoid” family with a common ancestor.
    Greenberg brings up exactly this scenario in three of the essays in Genetic Linguistics (p. 40, 283, 374), saying Basque will pop out as different after examining as few as three words across all languages.

  134. marie-lucie says:

    caffeind: Non-universality of the N-M pattern in the Americas doesn’t seem a strong counterargument…
    … to the universality of Amerind, you mean.
    we would expect some languages to evolve away from it over time.
    But why? if the pattern could be abandoned, it could also be borrowed by other languages (unlikely as both these scenarios seem). If they (barring of course the two non-Amerind families, which are acknowledged to be more recent on the continent) were all “Amerind”, they would all have started with those pronouns, and therefore they would all have been surrounded by other languages with the same pronouns. Pronouns may not be immune to borrowing, but they are at the low end of the probability scale for borrowing (individual nouns being at the top), so borrowing pronouns would entail borrowing a lot of lexical items.
    On the other hand, its presence up and down the Americas seems to indicate some connection, there is no evidence for either mass migration or elite-dominant migration between North and South America, and a small migration would predominate only on initial settlement.
    There is no denying that there must be “some connection”, whatever its nature, as Nichols says, but the prehistory of the Americas is not very well-known, especially the distinct possibility of several migrations into the continent, not just one settlement occurring before the opening of the Bering Strait around the end of the last Ice Age.
    The large diversity of languages in the Americas is a problem to be explained: there are too many families for the size of the continent compared to other areas, and too little time for such differentiation if there was only one initial migration of a small homogeneous language group about 10 to 12,000 years ago, as Nichols has pointed out. This problem is not solved by lumping together languages and groups which are very different from each other (as Greenberg did, using inaccurate methods), nor by assuming that if a relationship is not “obvious, it is forever unknowable” and discouraging efforts at finding deeper relationships (as some other linguists do, by their actions if not by their words).
    Basque will pop out as different after examining as few as three words across all languages.
    It depends which three words you choose. Basque has adopted many Spanish words, so three words picked at random from a dictionary might well be from Spanish, or at least include one Spanish word. In any case, I was only giving an example of the kind of representativity of Greenberg’s sample.

  135. What’s with the 10-12 kiloyear time depth for the divergence of “Amerind”? That sounds like it correlates with Clovis at 11.5 kyBP (before the present), but Monte Verde is now solidly dated at 14.5 kyBP — and that’s in Chile, which means settlement of North America must have come even earlier.
    As for this n/m business, I see coincidence rather than either diffusion or inheritance as the most parsimonious option, especially as only one consonant phoneme is involved. See Nichols’s slides (PowerPoint) and Zompist’s Quechua-Chinese page.

  136. Woops, make that Zompist’s etc.. The above points to just part of it.

  137. marie-lucie says:

    JC: I can’t access Nichols’s slides, but I got Zompists’s etc’s, for which I thank you.
    One problem with looking for phonetic resemblances is that most people are looking for obvious resemblances, such as [k] in one language = [k] or [g] in another as Z suggests. But in actuality, between demonstrably related languages (either “sisters” or “mother and daughter”) one often encounters nonobvious resemblances, such as [k] in one = [s] in the other, as in French cent ’100′ where the letter c = [s] instead of [k] as in Latin centum, or Latin [f] corresponding to both Sanskrit [bh] and [dh], and there are many more even odder examples even in well-known languages, when a phonetic relationship between the relevant words would not have been suspected offhand but only emerged after other more straightforward ones had been identified.
    Of course, with most pairs or groups of closely related languages there is close enough similarity between large numbers of words that unusual but genuine correspondences are eventually recognized, but when casting around for phonetic resemblances between languages which are not at all obviously related (and which might or might not be), especially in the absence of morphological knowledge, the unsuspecting researcher might seize upon the few resemblant forms as proof of relationship, when too close a resemblance between a small number of items common to the languages in question might instead suggest borrowing. This is one of the pitfalls Greenberg fell into with his mass comparison.

  138. John Emerson says:

    The present Chinese syllable “shi” corresponds to an astonishing array of sounds from 1300 or 2700 years ago (These are standard points of comparison based on poetry rhymes). Sometime when I have more energy I’ll make a list.
    On the other hand, syllables pronounced something like “fang” 2700 years ago are almost all pronounced something like “fang” today (“fang” or “pang”). The significance isn’t so much that they’re pronounced the same now as then, but that the words pronounced the same then are pronounced the same now.
    This would, of course, make the Greenberg / Ruhlen method worthless for Chinese.

  139. John Emerson says:

    As Dravidio-Pacific languages, it’s not surprising that Quechuan and Chinese would be closely related.

  140. m-l: I made a PDF file of Nichols’s slides here. This is not as easy to follow as the original PowerPoint, because each successive mutation to a slide generates a new page, and there is no color etc. So if the trouble was fetching the .pps file rather than viewing it, you can get it here.

  141. marie-lucie says:

    Thank you, JC. I remember that Power Point presentation as I was at the Alaska conference. I am not fond of statistics myself, but it is always good when different approaches come to the same result.

  142. Pre-Clovis is getting better documented, but on the other hand the case for farther back than 15k or a bit more is much weaker and not improving. The latest trend seems to be allowing for a coastal migration and an inland migration, both as the last ice age was receding. This may be influenced by Nichols’ finding circum-Pacific typological similarities that weaken in the eastern parts of North and South America. However it seems to me that language movement within one continent over the last 10000 years is much more likely than movement between already-populated continents with little or no land connection, so projecting the current east-west differences back that far should be a longer shot.

  143. Having now read some of Greenberg, his reasoning and reconstruction are much more complex and nuanced than one would assume from many specialist linguists’ criticism, and he discusses most of the objections that have come up. This should not be surprising given his seminal work in typology and universals that is acknowledged by all. I think he is well worth reading despite some degree of data errors. I expect his critics will still be critics after reading him, but will be better able to address his actual positions.

  144. marie-lucie says:

    caffeind: Actions speak louder than words: Greenberg’s theoretical positions (as presented earlier than his work on American languages) are fairly traditional, while his actual work on those languages (all derivative, since he used secondhand sources) is very poor. That is one reason why non-Americanists tend to side with Greenberg, because they go by his excellent and well-deserved reputation as a typologist, as well as the quality of his African language classification, while most Americanists were shocked and dismayed by the poor quality of his “Amerind” work because they expected so much better from him.

  145. marie-lucie, to Greenberg – or to protoworldism categorically – how is the distinction actually made, as a matter of method or ‘science’, between coincidental similarity and causal (“family”) unity and coherence in the case of a set (of any size) of similar-looking words from different languages?
    In other words, what are the principles of methodological rigor evident – even to the point of self-disclosure – in (say) Greenberg’s skepticism towards his own, or any!, provisional conclusion or ‘results’ of language unity? How does the scientificity of his investigations manifest itself?

  146. Marie-Lucie: Your point on related words not exhibiting surface similarity is a very important one to make. Even within language families younger than Indo-European regular sound change can yield superficially dissimilar words.
    Here’s a Romance example: French /foej/ and Spanish /oxa/ don’t have a single phoneme in common. Even if we knew nothing of the history of Romance languages, we could show the two words to have a common ancestor, because other Franco-Spanish word pairs show the same regularity (i.e. French initial /f/ = Spanish zero, for example). Deadgod: I hope the above answers your question.
    David: Proto-Indo-European seems to have had a *very* simple phoneme inventory: furthermore, its morphology seems not to involve any kind of suppletion. The latter feature has prompted one scholar, Kenneth Shields, to argue that Proto-Indo-European was an isolating language, with cliticized (rather than bound) grammatical morphemes, which became bound morphemes in the daughter languages.
    So: an isolating language, with a reduced phoneme inventory. That does sound like a pidgin. But we’ll probably never know for sure whether Proto-Indo-European was one.

  147. Greenberg didn’t publish Language in the Americas until 1987, but arrived at his view of Amerind much earlier, in the 50s. (He admits some of the outdated data is due to using his own notebooks from decades ago.) Conversely, his later articles on method reprinted in Genetic Linguistics are consistent with the earlier ones.
    Why didn’t he publish Amerind earlier? Maybe it is because his standards went down in his old age, or maybe it is because he anticipated the reaction. He actually says he anticipated heavy criticism, that Americanists’ standards are different than those of specialists on other regions, and that his methods were identical in Africa and America.

  148. I will here reiterate what I said at Jabal al-Lughat, with a tad less emphasis. Most of what Greenberg did in Africa was already known; he rearranged the branches on existing trees rather than creating wholly new trees. And Nilo-Saharan is as much a ragbag as Amerind, though obviously not as large.
    The specific rearrangements are good ones, to be sure: breaking up Hamitic into separate families, getting Bantu into the right part of the Niger-Congo tree.

  149. David Marjanović says:

    Greenberg’s African classification is now being questioned in that there’s little to no evidence for Omotic being Afro-Asiatic, and the evidence for Nilo-Saharan is weaker than Greenberg thought – Songhay might drop out, for instance.

    But why? if the pattern could be abandoned, it could also be borrowed by other languages (unlikely as both these scenarios seem).

    Both do of course happen when politeness accidents occur. English has lost thou, so it now has a /m/-/j/ system; the younger generation of Thai speakers is said to like circumventing the elaborate politeness system by borrowing /mi/ and /ju/.
    Concerning the latter case, how many English nouns has Thai borrowed?
    There are similar phenomena. The rest of “basic vocabulary” (other than the personal-pronoun system, I mean) is also very stable, except when a dysphemism wave comes up, as it did between Latin and the Romance languages, wreaking havoc upon such basic words as “eat” or “head”. This even swept beyond Romance, leading to German Kopf “head”, a loan.
    (…Somehow I almost wrote “hat” instead of “head”. LOL.)
    It goes without saying that several dysphemism waves can come one after the other. The normal Latin word aqua survived into French as eau, but now we’re getting flotte instead (when deliberately speaking a low register); manger has a Latin ancestor even though that wasn’t the normal word for “eat”, but the potential successor is bouffer, which doesn’t have an ancestor – it’s onomatopoetic.
    It is, by coincidence, like in biology, where no character is immune to convergence. Some are on average more reliable than others, but all can fail.

    Proto-Indo-European seems to have had a *very* simple phoneme inventory:

    You think? It’s on the fricative-poor side of things, but not horribly so ( = less than ancient Ionian Greek) when the “laryngeals” are taken into account; the plosive system is rather impressive, with 9 velar plosives for instance, and then there’s the rare feature of having voiceless, voiced, and voiced aspirated plosives, but no voiceless aspirated ones.

    furthermore, its morphology seems not to involve any kind of suppletion.

    What about “I” and “me” (*eģoH2 and something with *me-, IIRC)?
    (BTW, there was a paper in a recent Diachronica issue that tries to trace this particular example down to Proto-Nostratic. I still haven’t read most of it.)
    But it’s true that the conjugation and declension are very regular. Looks like the endings hadn’t developed very long ago. Which leads us to…

    The latter feature has prompted one scholar, Kenneth Shields, to argue that Proto-Indo-European was an isolating language, with cliticized (rather than bound) grammatical morphemes, which became bound morphemes in the daughter languages.

    …the fact that at least some modern Nostraticists reconstruct Proto-Nostratic as a Japanese-like language that used clitics for its case-marking and IIRC the rest of morphology, too.
    I wonder if Shields has arrived at the same stage by internal reconstruction of PIE. I think “overshooting” like this is a common danger of internal reconstruction…

  150. marie-lucie says:

    DM: borrowing pronouns: But why? if the pattern could be abandoned, it could also be borrowed by other languages (unlikely as both these scenarios seem). – Both do of course happen when politeness accidents occur
    Sure, but elaborate politeness goes with a highly stratified society, and how many such societies were there during the probable time depth of “Amerind” in the Americas? Nahuatl (the language of the Aztecs, still spoken in several varieties in Mexico) has several honorific markers, but none of them are pronouns (they are verbal suffixes – the more the higher your interlocutor is). And once a pronoun has been made into a prefix or suffix (sometimes combined with another one in a single, indivisible morpheme), it is that much less likely to be borrowed.
    The normal Latin word aqua survived into French as eau, but now we’re getting flotte instead (when deliberately speaking a low register); manger has a Latin ancestor even though that wasn’t the normal word for “eat”, but the potential successor is bouffer, which doesn’t have an ancestor – it’s onomatopoetic.
    Neither flotte nor bouffer is particularly new. They belong to a lower register and coexist with more general or higher-register words. In the evolution of such cases, sometimes the lower-register words (especially if tabooed) last for centuries alongside the regular words, sometimes they rise in status as they become normal for the mass of the speakers and supplant the older words, which are eventually forgotten. Again, the existence of two or more registers in a language presupposes marked social differences, just as two or more dialects presuppose geographical differences. It is well-known that Proto-Romance (as opposed to Classical Latin) arose from the language of the masses of the empire, which included words from all sorts of origins, including not only foreign borrowings but many slangy metaphorical uses (hence for instance Latin manducare ‘to chew’ became Fr manger, Italian mangiare ‘to eat’).
    Structure of PIE:
    PIE is not just a reconstruct but a construct which results in an idealized picture of the language – it is not totally fictitious or misleading, but it has been airbrushed, as it were, out of necessity and out of the nature of the methods used to establish it.
    Since reconstructionists usually try to work backwards as far as they can in order to account for the variety of cognates of a single word or morpheme, it is not surprising that PIE paradigms are reconstructed to the point that they are entirely regular, and the affixes are not firmly bound to the stem (firm attachment usually causes morphophonemic changes, which reconstruction tries to reverse). It might be different if one could match various intermediate stages with known (or at least very plausible) historical periods, as in reconstructing Proto- Celtic or Proto-Germanic. In fact, reconstruction might not be of equal depth for different parts of PIE structure.

  151. John Emerson says:

    To make things easier to present maybe we should consolidate Basque, Burushaski, Amerind, Palaeo-Siberian, and Nilo-Saharan into one big Isolato family. That way we will reduce all the various loose ends into one, producing a much more elegant theory.

  152. Thanks, Etienne. The questions were rhetorical, intended to draw out the principle of ‘science’ in historical linguistics from the (indispensable for any useful, albeit ultimately provisional, demonstration) quarrels about particulars. Do you think protoworldists are adequately rigorous in the skepticism they subject their investigations and (tentative) results to?

  153. David: I believe the only other suppletion that can be reconstructed back to Proto-Indo-European is the *s/*t alternation in the demonstrative pronoun (nominative masculine/feminine singular in *s-, *t- everywhere else). So allow me to correct myself: Proto-Indo-European exhibited *little* suppletion.
    Deadgod: I don’t. But I’m equally unimpressed with most mainstream scholars, for whom such things as the N/M “first/second person” pattern aren’t to be explained, but rather explained away.

  154. marie-lucie says:

    Once again, the N/M pattern is peculiar to some areas only (mostly along the Pacific Coast). It is widespread but not general in the languages subsumed under “Amerind”, so it cannot be considered a compelling characteristic of Native American languages apart from the two Northern groups (Eskimo-Aleut and Na-Dene). It is most likely that eventually a new language map will show “Amerind” split into several groups unrelated to each other, a state intermediate between the current classification with 120 obvious families thought to be completely separate, and the giant “Amerind”. This new classification will be reminiscent of the 6 North American “phyla” identified by Sapir in the 1920s (which already recognized “Eskimo” and “Na-Dene”), although probably not identical to his proposals.

  155. marie-lucie says:

    caffeind: [Greenberg] actually says … that Americanists’ standards are different than those of specialists on other regions …
    In most cases, documentation of American languages is less than 150 years old (and in some cases, only a few decades old at most), so there is little in the way of firsthand documentation of language change such as texts written at different times (for some Meso-American languages there are documents from early Spanish missionaries, and only with the decipherment of the Maya script – a major accomplishment – is there now a respectable amount of older documentation, although the nature of the inscriptions means that the documents tend to be repetitive and the vocabulary limited – but the contemporary Mayan languages provide plenty of comparative evidence too). This means that in most American cases the documentation is “flat”, without much historical depth, unlike in most European and Asian languages or at least families. Many linguists who have been involved in American language classification and reconstruction attempts were or still are the first or only ones to have recorded the languages in question (and many of those languages have now become extinct or are severely endangered). This contrasts with IE or other Old World families where dozens of linguists very familiar with Greek, Latin and later Sanskrit, as well as with their modern descendants and related languages documented over centuries, worked on aspects of IE for decades, and were in a position to build on each other’s work. Therefore it is not really possible to compare the situation of the languages and of the scholars in the Americas and in Europe and Asia (the situation in Africa is probably intermediate between those two extremes, but I don’t know enough to comment meaningfully).
    The first attempt at proto-language reconstruction in the Americas was that of Proto-Algonkian, by the great linguist Leonard Bloomfield who had earlier specialized in Germanic comparison, and who proved that the “comparative method” could be applied under American conditions. But one advantage he had was that the Algonkian languages are a large and fairly homogeneous group which had been identified quite early, and for some of them there is also some historical depth of documentation. Even earlier, Edward Sapir, another towering figure in Native American linguistics, had discovered distant relatives of the Algonkian languages in the form of two small families on the California coast, geographically quite far from the nearest Algonkian language. His discovery (based at first on some common details of morphology) was greeted skeptically at first, until confirmed later by more detailed comparative studies. It would be absolutely untrue to say that Bloomfield did not have the same standards in Algonkian as scholars working in Proto-Germanic or Proto-Slavic, for instance, and Sapir’s discovery was more on the order of Sir William Jones postulating the IE language family with its common ancestor. Where Greenberg is right is that there have not been other scholars of such prominence in the field since, but Greenberg certainly has not filled that role with his “Amerind”.


  156. Ruhlen claiming
    that Mary Haas (1958) viewed Algic as already demonstrated by Sapir (1913), and that the only reason there was ever any doubt about Sapir’s obvious conclusion was one specialist’s (Michelson) narrowmindedness.

    Poser claiming
    Mary Haas had not viewed Algic as previously demonstrated, but that her 1958 paper proved it. Then Poser claims that even Haas (1958) did not demonstrate Algic either.

  157. Well, Etienne – if I understand you – sure: too much skepticism is as destructive of scientificity as not enough. In neither case is methodological rigor really a priority – not in comparison to bruiting one’s perspective ungoverned by anything so trivial as empirical compulsion which is also rational. (That’s no reflection of my opinion, far less a professional one, or the ‘truth’, of anyone on this thread, as far as I can tell.)
    My question remains (by way of adopting your vocabulary): what criteria would an intellectually scrupulous historical-linguistic explanation meet?

  158. marie-lucie says:

    caffeind: Thank you for those links to the articles. I said earlier (and others agreed) that proto-language reconstruction must rest on accurate language classification, which it confirms. You might think of classification as providing a preliminary sketch, reconstruction as a (never-) finished painting or building. Sapir sometimes jumped to conclusions, and he was right in many cases (including this one), wrong in others. Whether it was Mary Haas or others who finally demonstrated the “Algic” connection (Algonkian + Wiyot + Yurok) that Sapir had intuited is immaterial to the fact that those demonstrations used traditional comparative methods. And it does not take away from Sapir’s or Haas’s contributions to say that Teeter and others provided the final, conclusive details: they probably would not have even looked at Wiyot and Yurok if Sapir’s and Haas’s work had not pointed them in the right direction (and similarly, Haas might not have looked at those languages in that context if Sapir had not shown the way).

  159. Marie-Lucie: minor nitpick: Italian MANGIARE does not stem directly from Vulgar Latin MANDUCARE: rather, it was borrowed from Old French MANGIER, which itself stems from MANDUCARE (A few Italian dialects do have a native reflex, MANCA(RE) or the like, as does Romanian: A MINCA).
    Deadgod: an explanation needs to account for the data in such a way as to preclude any other explanation. The trouble with all too many long-range enthusiasts is that, as soon as they find non-trivial (i.e. too numerous to be due to coincidence) similarities between language families not known to be related, they immediately assume common ancestry to be the explanation. In practically all cases language contact is just as possible an explanation…but all the long-range work that I have read seems to (quite unjustifiably) either ignore the possibility or downplay it.

  160. marie-lucie says:

    deadgod: too much skepticism is as destructive of scientificity as not enough
    I agree. In present-day Americanist historical circles there is often too much skepticism, while the Proto-World people and even the Nostraticists are not skeptical enough.
    Too much skepticism means that you cannot entertain hypotheses because you demand absolute proof for everything right away, as in “language relatedness is either obvious or forever unknowable”, or you only think of the many possibilities for error which have trapped some of your predecessors. Too little skepticism means that you don’t even think of entertaining or exploring alternate hypotheses because you think that you are right and that you don’t have to take a closer look.
    The scientific attitude should be “if X is true, what should the consequences be? given circumstances Y, what could be the cause?” and then you explore the ramifications of your hypothesis. You should always keep in mind the possibility that you could be wrong, but that should not keep you from exploring the possibility that you might be right. You know that you are on the right track when some of the consequences of your hypothesis mesh with what you already know independently, or when your hypothesis turns out to solve an independently acknowledged problem. All this is demonstrated, for instance, in Vajda’s work discussed above.

  161. Vajda 2008 also tries to navigate between long-range taxonomists and strict reconstructionists, says each plays a valuable role, and gives credit to Ruhlen for the first valid Dene-Yeniseian etymologies. Another source says it was preceded by a heated email exchange between Ruhlen and Vajda.

  162. marie-lucie says:

    I think that Vajda chose to bend over backwards in his published writings in order to soothe ruffled feathers and give credit to everyone who had had anything to do with the Dene-Yeniseian hypothesis.
    Ruhlen did not present “etymologies” (a word often misused but which implies tracing back the ancestry of words) but resemblances of form and meaning, most of which were *not* valid.

  163. Marie-Lucie: For the sake of the matter, Vajda did not present “etymologies” either. His comparisons are based on only a partial system of correspondences, some of which, if you look at them closely, are typologically implausible and / or are backed with too few examples (most often, two or three at best) to truly count as “regular”. And, obviously, where there are no or too few regular correspondences, we can hardly speak of “etymologies”.
    I thoroughly applaud Edward’s desire to apply the standard comparative procedure to Ruhlen’s comparanda, but when somebody says “Vajda has shown that 20 out of 30 of Ruhlen’s comparisons are accidental look-alikes”, I can only shake my head. The correct way is to say “Vajda has shown that 20 out of 30 of Ruhlen’s comparisons do not match up with the bits and pieces of systematic correspondences that he has proposed”. Note, too, that the correspondences proposed in Vajda’s original paper are already somewhat different from the ones he establishes in the revised version (I would humbly like to think that some of my criticisms had a hand in the matter). Therefore, there is no need to be so quick as to dismiss the rest of the comparanda in Ruhlen’s paper just because Vajda does not accept them. Who knows, maybe tomorrow he will. Fortunately, he himself is much more open-minded and cooperative than some of the people he set out to convince, and, likewise, he is not above looking at the relationship between Na-Dene and Yeniseian in a larger, Dene-Caucasian, context, either.

  164. Well, the Anglo-Iranian /bæd/ and the Anglo-Mbarabam /dog/ might not be coincidences; it’s possible that someone will establish them someday, although the betting is against it. The most we can say is that the preponderance of the evidence, as it is available to us, is for or against something.

  165. Etienne, I’d say the crux or fulcrum of your patient response to me would be the strength or softness of ‘preclusion’ to which some particular investigator would be willing to subject rival explanations (including tempering qualifications) – or, the elasticity of some particular investigator’s ad hoc application of ‘preclusion’ to rival theories (from quite tentative hypotheses to ‘master narratives’).

    Thanks, marie-lucie.
    I’d only add that “[not being kept] from exploring the possibility that [an unanticipated direction] might be right” sometimes requires more – [m-hm] commitment from the investigator than even that person might, looking from the outside at another person, consider ‘wise’.

  166. marie-lucie says:

    George, thank you for your comment. I remember you and your talk from the Athabaskan conference last summer. I am not an Athabaskanist, much less a Yeniseianist (?), so I am following the arguments for and against from afar, but with great interest.
    When I said that “Ruhlen did not present ‘etymologies’”, I did not mean to contrast him with Vajda on this point but to clarify the meaning of “etymology”, which seems to have acquired a different turn among Americanists than among Indo-Europeanists, for instance.
    Being involved in language classification and reconstruction myself (though not in the same language area), I have had to consider what various people said and how they operated, both in principle and in actual fact. I wrote earlier that people forget how much time and how many people it took to reconstruct PIE to general satisfaction (although there are still many points of contention about the reconstruction, they are details and do not materially alter the overall picture). But many people seem to expect reconstruction to be easy to do (which it is to a certain extent at a very shallow level, when the relevant languages are very closely related), and they also seem to expect that most reconstructions can be trusted to be accurate and taken as a point of departure for comparison with other reconstructions: unfortunately, it is often not the case. In PIE, there has been so much done already for so long, and the results taught to several generations of students, that there is a general consensus and body of data which one can trust (even though some reconstructions of the phonemic system disagree with the consensus, that does not affect the network of correspondences between individual words). In less well-known groups and with more recent reconstructions, which in the Americas are sometimes the work of just one or two persons, I find that I am often skeptical of other people’s reconstructions, especially if the reconstructor(s) did not provide all the data on which the reconstruction is based, or if the reconstruction seems to violate known tendencies of phonological evolution (e.g. *c^ > k instead of the opposite, or, in a single language: *k > x and *X > q).
    I think that in the present case, some degree of discrepancy in reconstructions by several competent linguists well-versed in the relevant languages and also willing to learn from each other is the sign of a healthy situation and those discrepancies will eventually lead to a consensus and to further progress.

  167. marie-lucie says:

    deadgod, I often find your comments cryptic. In this case I think that you mean that the investigator of a new or controversial hypothesis has to have the strength of mind to keep on the chosen path, while the same investigator might be skeptical of someone else doing the same thing. Is that right? But the investigator of the hypothesis needs to do a lot more work than anyone else has before distilling the results of this work for other people, and if no one else has gone through the same steps and tackled the same materials and issues, no one else can truly evaluate the work (and therefore accept or reject it) until all the relevant materials, approaches and conclusions are exposed to the world. This is as true in comparative linguistics as in physics, for instance.

  168. George Starostin says:

    Marie-Lucie: I absolutely agree with every word you say here. The big problem is that, as you say, a good reconstruction takes a long time to carry out and the results are usually quite complex, unless we are talking about very closely related languages with little phonetic or lexical change – and this means that very few people will actually want to verify the results, and then the “consensus” will rely, more or less, on very subjective matters that often have little to do with the data itself – e.g. considerations of “personal trust”, “credentials”, “elegance and accessibility of presentation”, etc.
    In Vajda’s case, for instance, a lot depends on how correct his reconstruction of the Yeniseian verbal system is. Who can vouch for it? Either people who have also done work in the same sphere – like myself or Kirill Reshetnikov or a few other specialists in Moscow and Novosibirsk that remain completely unknown outside their circles – or people who have not done work on it, but are sincerely interested in the matter and want to read and analyze the works (I believe that anyone with a clear head and a will to study can theoretically be a referee). Johanna Nichols and Eric Hamp have said they are satisfied by Vajda’s evidence – how well have they actually studied it? My impressions tell me that they simply took his internal-Yeniseian evidence “on trust”: after all, he IS America’s top (and only) Yeniseianist, that’s just how things turned out.
    The problem is that there is just too little cooperation on different sides; we should all be willing to work with each other, not against each other. I was very glad to learn that Edward shares the same attitude (I hope to see him in Moscow this summer and introduce him to our working seminar). Unfortunately, lots of people do not, and prefer to spread silly myths and exaggerations about macro-comparative linguistics that make little distinction between the true “freaks” and those who actually want to make good scientific progress in the matter.

  169. David Marjanović says:

    Sure, but elaborate politeness goes with a highly stratified society, and how many such societies were there during the probable time depth of “Amerind” in the Americas?

    Yes, this is of course a strong argument against borrowing; I should have made that explicit. I was just nitpicking about how nothing is absolutely reliable and everything can be borrowed, even though it’s not equally likely for everything.

    Proto-Indo-European exhibited *little* suppletion.

    What about the 1st person plural pronoun (*/w/- in the nominative, */n/- elsewhere) and the 1st person singular verb ending (*-/m/ and *-h2 in different moods and aspects)?
    The latter suppletion, BTW, is put to impressive use in this paper:
    Fabrice Cavoto (2003): Supplétion et récurrence des thèmes pronominaux nostratiques, Diachronica 20(2), 229–258.
    The former has been suggested to result from a merger of a previous inclusive and exclusive we, with the */n/- form having been exclusive based on its resemblance to the negation (“I and others, but not you”) and on a bit of comparative Nostratic evidence.

  170. David Marjanović says:

    Then Poser claims that even Haas (1958) did not demonstrate Algic either.

    If one belongs to those who, out of principle, don’t accept lexical evidence alone. Poser doesn’t say if he’s one of them.
    However, he makes the important point that it’s not an either-or issue, but gradual accumulation of evidence resulting in a robust hypothesis.

  171. marie-lucie says:

    George, thank you for your reply, I agree with your general comments. If you write to me I will send you a paper.
    silly myths and exaggerations about macro-comparative linguistics that make little distinction between the true “freaks” and those who actually want to make good scientific progress in the matter.
    Yes, that is a problem at the moment: if one is interested in the possibility of “macro-comparative” linguistics (even very tentatively) it could be a career-destroying move in some circles, so better play it very safe rather than run the risk of being lumped with the lunatic fringe.
    David: a merger of a previous inclusive and exclusive we, with the */n/- form having been exclusive based on its resemblance to the negation (“I and others, but not you”) and on a bit of comparative Nostratic evidence.
    I have not read the article you mention (I don’t have access to it), but in my experience with a number of language families (not statistically significant on a world level, of course), forms for the 1st person plural tend to be quite varied in origin (more so than for 1st and 2nd singular, which are generally more stable, and for 3rd person pronouns, which often come from demonstratives), but I don’t recall having seen an origin in a merger of negative + other pronoun. Or does the author find other examples of this type of merger?
    [Poser] makes the important point that it’s not an either-or issue, but gradual accumulation of evidence resulting in a robust hypothesis.
    Whether in visual art or in language, some people easily recognize patterns, others don’t, or don’t believe in them (even though pattern recognition is crucial for language learning, small children being especially good at it, and pattern matching for linguistic creativity, including some forms of language change). Some people can “connect the dots” even though many “dots” are missing, others can’t do so until almost every detail has been filled in. Of course, if there are only a few “dots”, they may or may not be connectable, or the connection might be the wrong one, but even if one is on the right track, people unable to see a pattern in an unfinished design will not be convinced until the design is almost finished, while others will be able to recognize a design at an intermediate stage. This is why a person who identifies a potential pattern must work to connect at least some of the dots that could be part of a design, before announcing a discovery, and also take into account other people’s proposals (which does not mean accepting or rejecting them out of hand without considering their merits). As more and more “dots” get connected in a way that people can follow the design (and even add to it, or improve on the connections), more and more people will come to recognize the design as probable or even valid.
