I’ve been enjoying these, and I thought I should pass them along:

Greater Blogazonia is subtitled “Language and Society in Greater Amazonia”; its creator, Lev Michael, says “My research focuses on Amazonian languages, and I am particularly interested in the strategic use of grammatical resources in interaction, language documentation and revitalization, and language politics.” He has long, meaty posts like Genetics meets Voodoo Historical Linguistics: Genetic Variation and Population Structure in Native Americans, discussing a study “that sought to use information on genetic variation in Native American populations to develop and test hypotheses about the question of prehistoric migration in the Americas” but used some very dubious linguistic theories (with interesting comments by David Marjanović, who frequently shows up around here); the latest post is about a movie, The Linguists (trailer here), which “follows David Harrison and Gregory Anderson, scientists racing to document languages on the verge of extinction.” It’s been accepted for Sundance; I certainly hope I get to see it some day!

bradshaw of the future focuses on the history of words, mainly those derived from Indo-European, though the latest post features orange, which can be traced back only as far as Sanskrit नारङ्ग nāraṅga ‘orange tree’ (“The trail ends there, altho the AHD says ‘possibly of Dravidian origin’”). If you like etymology, you’ll want to bookmark it.


  1. Talking of films, I hope there’ll be an LH response to “Youth without Youth”.

  2. I didn’t realize Bradshaw of the Future had moved. I was subscribed to the RSS feed and hadn’t seen a new post for a while. I guess I hadn’t actually checked the site to see what was going on.

  3. “Bradshaw of the Future”, by the way, was a pseudonym used by one of the people who sent in answers to Lewis Carroll’s math puzzles in The Monthly Packet in 1880-1881; they were eventually reprinted in A Tangled Tale. As I recall, he usually got the answers right, too.

  4. I’ve read other articles that have tried to link genetics with linguistics, often with sketchy results. Thanks for pointing out Lev Michael’s; seems spot on.

  5. John Emerson says

    I recently saw an attempt to apply sophisticated physics models (math) to linguistic history. Basically they were just comparing alphabetical representations without any consideration of phonetics/phonemics or already-existing historical linguistics. I lost interest when it turned out that their model was unable to see that Spanish and Portuguese are about as closely related as any two distinct languages can be. To which: wow!
    Correlating language group and genetics should be very useful, though, in distinguishing elite-replacement from the migration of peoples. For example, genetically the British are descended from pre-Celtic Europeans, Celts, Romans and their mercenaries, Anglo-Saxons, and Norse. IIRC the genetic proportions are more pre-Celtic than expected, and less Anglo-Saxon. Likewise, the Turks of Turkey are mostly descended from the pre-Turkish inhabitants. This work is very preliminary but I expect great successes. Prehistory has been spinning its wheels for some time working with its mix of linguistics, archeology, and textual records, and this new approach should be helpful.

  6. One major problem with applying statistical methods similar to those used in genetics to the question of language relatedness is that the languages of the world are very unevenly documented. For those of ancient times, only the ones with a large number of written documents which have been deciphered are fairly well-known, but there are many words which do not typically find their way into written documents, especially when the documents in question are quite limited in subject matter (eg traditional formulae on tombstones, or boastful inscriptions commemorating kings’ victories). In the contemporary world, many languages have already disappeared before being studied in enough detail if at all, and many others are severely endangered and will probably disappear within a decade or two if they are only spoken by elderly people. This problem is especially acute with the indigenous languages of North America but also affects many minority languages in other areas of the world.
    Even when such languages are documented beyond the most minimal amount, eg with a word-list comprising several hundred words, it is rare for all the words to be recorded and translated with the desired degree of accuracy. Most of us are familiar with the kind of short, skimpy dictionary which causes beginners to make mistakes ranging from amusing to horrendous because words are translated without any context. This also happens with a word-list made by someone who did not have the time or expertise to produce a more accurate work. Similarly, grammars resulting from a few weeks or even months of linguistic fieldwork are rarely definitive in their coverage and accuracy. From what I can gather, genetics deals not only with a large number of genes but also with the composition of individual genes, which are made up of many thousands of components. The total number of variables is not comparable to the number of words even in a large dictionary of English, whose vocabulary moreover is grossly inflated by words of foreign origin which greatly outnumber those descended from Anglo-Saxon, so that a statistical analysis of the vocabulary would not place English within the Germanic family.
    Also, word comparison is only a portion (albeit a large one) of the task of comparative historical linguistics. For instance, a famous Indo-Europeanist (perhaps it was Meillet) is reputed to have said that the main clue for classifying a language as Indo-European lay in the forms of the verb to be (because of their distinctive pattern of irregularity). He did not invoke a standard 100 or 200 word-list (commonly used in the Americas), or recommend mass comparison of selected vocabulary items. To cite English again, it is mostly the basic morphology rather than the vocabulary which places it in the Germanic family, and the morphological elements are too few to be amenable to statistical analysis.
    Certainly there is a place for new approaches in the classification of languages, and some awareness of the genetic history of the populations that speak them (if those peoples have been in place for a long time) could perhaps give some clues about unsuspected potential language relationships in populations that have migrated far and wide from a common origin, but it seems to me that the nature and number of the elements available for using statistical methods are simply not comparable in linguistics and in population genetics.

  7. Thanks, marie-lucie; your comments are always lucid and enlightening!

  8. Thank you, LH, for bringing up such interesting subjects and allowing your faithful readers to comment!

  9. Marie-Lucie: you are quite right, it was Meillet who claimed that the irregular forms of the copula (specifically, the opposition between third singular and third plural: French EST/SONT, German IST/SIND, Polish JEST/SA) were what would prove these languages to be related: but he seems to have had difficulty making up his mind, as elsewhere he claimed that we could only suspect, not prove, the existence of Indo-European if we had at our disposal data from the modern languages only.
    And I must confess I disagree wholly with your claim that it is mostly the morphology that indicates that English is Germanic: as Andre Martinet pointed out, a single word like WHAT (realized with initial /hw/; compare Latin QUOD) exemplifies no fewer than three sound changes separating Proto-Germanic from Indo-European (*kw becoming /xw/, later /hw/; *o becoming /a/; and *d becoming /t/), and indeed a handful of English words (father, that, horn, deep, tooth, corn, bear, do, the lower numerals…) would suffice to show that English is Indo-European and underwent the Germanic consonant shift. Indeed I THINK that Greenberg, in defense of his mass comparison method, pointed to Tocharian as a language whose Indo-European filiation was accepted on the basis of the lexicon only.
    Of course, the difference is that Greenberg’s “method” can’t separate look-alikes from loans or actual cognates (assuming there are any of either), whereas even the few English words I listed above exhibit systematic matches with Indo-European (“tooth”, “two” and “ten” all have initial /t/, corresponding to Indo-European initial *d; “five” and “father” both have initial /f/, corresponding to Indo-European initial *p…and the list goes on!). Hence foreign loans in English can easily be detected if they exhibit a different set of sound correspondences: if we accept that English initial /f/ corresponds to initial /p/ in other Indo-European languages, as in the examples above, then a word like “faith”, which has no cognate in other Indo-European languages with initial /p/, could reasonably be suspected to have some other origin, and comparison with French “foi” and its Romance cognates would make the supposition of a loanword natural. In short: we could not only prove the Indo-European and Germanic filiation of English without examining its morphology, we could also, to a great degree, distinguish native from non-native lexical items.
    Where I DO agree with you, however, is that LH needs to be warmly thanked for bringing up such interesting topics and letting us discuss them. Many thanks, LH!
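The loanword test described in the comment above (English initial /f/ should answer to initial /p/ elsewhere in Indo-European, initial /t/ to *d, and so on) can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions: it compares spellings rather than phonemes, the Latin comparanda and the extra pairs (foot, fish, horn) are my own additions, and the names `EXPECTED_INITIAL`, `WORD_PAIRS` and `suspected_loans` are hypothetical.

```python
# Toy sketch: flag English words whose initial consonant does not show the
# expected correspondence with a Latin comparandum.
# ASSUMPTION: spelling stands in crudely for pronunciation, and the word list
# is illustrative, not a real dataset.

# Expected initial correspondences, English letter -> Latin letter
# (f : p and t : d are the ones named in the comment; h : c is an added example).
EXPECTED_INITIAL = {"f": "p", "t": "d", "h": "c"}

# (English word, Latin comparandum) pairs chosen for illustration.
WORD_PAIRS = [
    ("father", "pater"),
    ("foot", "pes"),
    ("fish", "piscis"),
    ("tooth", "dens"),
    ("two", "duo"),
    ("ten", "decem"),
    ("horn", "cornu"),
    ("faith", "fides"),  # the suspect: Latin has f-, not the expected p-
]


def suspected_loans(pairs, expected=EXPECTED_INITIAL):
    """Return the English words whose initial fails the expected match."""
    suspects = []
    for english, latin in pairs:
        want = expected.get(english[0])
        # Only judge words whose initial has a known expected correspondence.
        if want is not None and latin[0] != want:
            suspects.append(english)
    return suspects


if __name__ == "__main__":
    print(suspected_loans(WORD_PAIRS))
```

With this word list, only "faith" is flagged, which matches the reasoning in the comment; a serious version would need phonemic transcriptions and far more than one comparison language.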

  10. marie-lucie says

    What you say is true, but if English had a morphology very different from that of Germanic (for instance, if it had the same verb roots but without inner change in root vowels and with a Turkish-like variety of verbal affixes unknown in IE languages), then the fact that the two languages (seem to) share common sound-shifts could be attributed to borrowings posterior to the occurrence of the sound-shifts. Witness the regular sound shifts between English and French in Old French borrowings into English, ex. the different pronunciation of ch and j in the two languages, or the vowel correspondences (some where English has changed, as in “long a”, and others where French has changed, as in oi). It is because of the copious historical documentation of the Norman conquest and its consequences that we can interpret the modern reflexes as posterior to the borrowing of Old French words into English, which was followed in both languages by separate sound evolution over hundreds of years.
    Of course, English, French and German (and some of their close relatives) have a long written history and there are lots of dictionaries and grammars available to help in assembling and analyzing data for comparison, so there is no need to debate the question of their relationships at this time, if there ever was, but anyone dealing with the classification of lesser-known, poorly attested languages with little or no historical documentation (such as many American indigenous languages) is faced with this type of problem. In that particular field (where one cannot rely on a 200-year tradition of historical scholarship in a long-established superfamily) there has been a tendency to concentrate on comparison of vocabulary and to consider morphology (usually interpreted as the study of affixes) as an afterthought or at best a confirmatory procedure, secondary to lexical comparison. This is OK when dealing with languages which are very similar in morphological structure as well as sharing numerous obvious cognates, so that it is self-evident that they belong to the same family (like, for instance, French and Spanish, or German and Swedish). Working in this field myself, I have found that when dealing with languages which might be more distantly related it is indispensable to consider all aspects of their morphology – after all, cognate words are not all short, unanalyzable roots but often come complete with inflectional and/or derivational morphology, and since the number of grammatical elements is much smaller than that of lexical items, these elements are likely to show up even where only a comparatively small amount of lexical data is available, causing many “empty boxes” in a grid for cognates across several languages.
    Close attention to morphology also helps defend against the many sources of error inherent in comparison which is largely limited to vocabulary (the latter including phonological correspondences). Again, these sources of error are not too much of a problem with languages which are already known through all kinds of criteria to be related, but they can loom very large when the question is to determine potential relatedness. For instance, among the basic vocabulary list you quote for English/Germanic cognates, words for kin terms and numerals, often offered as rock-solid evidence for Indo-European, are considered very unreliable in the Americas since there is evidence that such words have often been borrowed among unrelated languages. Applying this caution to IE, one could be wary of pater, mater vs. father, mother because of the widespread use of the syllables pa, ma in kin terms, but it is the suffix -ter (which shows up in other forms, not just kin terms) and its correspondences in many of the languages which would strongly suggest the common IE origin of those words in most of the languages, rather than their having borrowed such terms from one another.

  11. Marie-Lucie–
    We seem to be in agreement, although my point was that in an alternate universe in which no historical documentation was available it would not be difficult to demonstrate that English is Indo-European and Germanic (and CONTRA Meillet I think we could demonstrate the existence of Indo-European on the basis of the modern data only). I quite agree a mismatch between morphology and lexicon is most easily explained as borrowing, and I am aware of the fact that kinship terms and numerals are widely borrowed in many parts of the world.
    And speaking as an outsider, it is my impression that much if not all the work on languages in the Americas is marred by a persistent tendency to look for surface similarities between forms, rather than regular correspondences between superficially dissimilar forms: the relationship of French and Spanish may be obvious, but cognates such as French FEUILLE and Spanish HOJA are quite dissimilar, and can only be shown to be cognate by the fact that other French-Spanish cognate pairs exhibit the same sound correspondences (PEU/POCO for the stressed vowel, FILLE/HIJA for the consonants, for instance). Tellingly, proposed long-distance genetic relationships (in the Americas or elsewhere) almost always involve “cognates” that are suspiciously similar to one another, far more than FEUILLE/HOJA.
    In short: I agree with you that morphology should not be neglected, but you should not assume that failure to unearth long-distance relationships between language families in the Americas is due solely to this neglect: rather, in looking for surface similarities between lexical items only, scholars have probably blinded themselves to all but the closest relationships between languages. Should you ever suspect a genetic relationship on the basis of morphological evidence, I would venture to suggest that a re-examination of the vocabulary, with the goal of detecting regular correspondences instead of surface similarities, might prove quite fruitful.
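The contrast drawn in the comment above, between surface similarity and recurring correspondences, can be made concrete with a small Python sketch. It is hedged in the same way as any toy: it works on spellings, the pairs beyond FEUILLE/HOJA and FILLE/HIJA are my own additions, and `COGNATES`, `surface_similarity` and `initial_correspondences` are hypothetical names. `difflib.SequenceMatcher` and `collections.Counter` are standard-library tools, used here only as a rough stand-in for "looks alike" and "recurs".

```python
from collections import Counter
from difflib import SequenceMatcher

# French/Spanish cognate pairs. The first two come from the comment;
# the rest are illustrative additions showing the same f- : h- pattern.
COGNATES = [
    ("feuille", "hoja"),
    ("fille", "hija"),
    ("fils", "hijo"),
    ("faire", "hacer"),
    ("farine", "harina"),
]


def surface_similarity(a, b):
    """Crude look-alike score between two spellings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a, b).ratio()


def initial_correspondences(pairs):
    """Count how often each (French initial, Spanish initial) pairing recurs."""
    return Counter((french[0], spanish[0]) for french, spanish in pairs)


if __name__ == "__main__":
    for french, spanish in COGNATES:
        score = surface_similarity(french, spanish)
        print(f"{french:8} {spanish:8} similarity={score:.2f}")
    print(initial_correspondences(COGNATES).most_common())
```

FEUILLE and HOJA share no letters at all, so a look-alike score rejects the pair, while the f- : h- pairing recurs in every row of the list. That recurrence, not resemblance, is the kind of evidence the comment is asking for.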

  12. marie-lucie says

    My sentiments exactly! Let’s shake hands.

  13. bradshaw of the future has gone to the bit bucket, alas, but the Internet Archive has it. There was no need to be tentative about the Dravidian origin of orange, since the OED had already revised that etymology in 2004 and accepted it without any doubt or qualification: “Sanskrit nāraṅga < a Dravidian language”.

  14. Thanks, I’ve replaced the links with archived ones. And I’m pleased to see Lev Michael is keeping his blog up, though it’s been renamed; as his About page says: “This very neglected blog was named Greater Blogazonia until I started writing here again in 2021. No post or comments were deleted when I renamed it Amazonian Linguistics.”

  15. David Marjanović says

    Alas, the last comment on that blog (which brings back faint memories) was posted on April 17, the last trackback on May 29, and the last post on November 8 – all of 2014.

  16. I should’ve also noted, AHD revised its Word History box for orange in the 2011 edition, so that it now states without reservation that “The ultimate origins of the word lie in the Dravidian language family”, and also gives the modern Tamil nāram, one of the cognates that bradshaw of the future mentioned. I’m impressed with AHD’s thoroughness in rechecking stuff between editions.

    They did forget to delete “possibly” from the etymology. That happens on deadline. If they’d hired me to copyedit, I might have caught that, given enough time — and it probably still wouldn’t be published yet.

  17. “Alas, the last comment on that blog (which brings back faint memories) was posted on April 17, the last trackback on May 29, and the last post on November 8 – all of 2014.”

    I’m happy to report that he started posting again this year (in January); the latest at present is the very interesting On the meanings and etymology of ‘anhinga’: The devil is in the details.

  18. Well, in that case I wonder why “possibly” was removed. Also, Dravidian is only the family to which we can trace it; “the ultimate origin” can be arbitrarily farther back.
