Bill Poser at Language Log has an extremely useful post in which he goes “beyond the Ethnologue” (the best quick reference for language families) and cites books that give reliable information about language relationships for Africa, the Americas, Australia (I’m delighted to see Claire Bowern namechecked!), and New Guinea. This is the sort of service the Log should be providing (alongside its vigilant search for snowclones); who better than linguists to point people to accurate information about languages? Perhaps someone will weigh in here on similarly reliable books that cover other areas, for instance East Asia.
Addendum. See now Bill’s follow-up on exactly why Merritt Ruhlen’s approach to classifying languages is worthless.


  1. Classifying languages is sometimes easy and sometimes a tricky business. For example, it is obvious that Romanian is a Romance language and Yiddish a Germanic language even with the numerous Slavic borrowings, e.g. lopată; lopate “shovel”; cușmă; kutschma “fur hat.” On the other hand, it wasn’t until 1854 that philologists (chiefly Franz Bopp) verified that Albanian was an Indo-European language. It was a century later before linguists discovered that Thai was related to the Malayo-Polynesian group of languages. Like Vietnamese, it has so many Chinese loanwords that it was at first thought to be a Sino-Tibetan language. Furthermore, Gothic does not appear to have been quite ‘Germanic’ and there are a few linguists who doubt the membership of Cherokee in the Iroquoian group and would rather reclassify it as ‘Hokan-Siouan’ (even though Iroquoian is an offshoot of Proto-Hokan-Siouan, spoken about 3,000 years ago, along with Sioux, Caddoan and Hokan).
    I was pleased that Ethnologue mentioned the Cappadocian and Pontic varieties of Greek formerly spoken in Turkey (I believe that Pontic still has a few speakers there), but unfortunately they don’t mention Griko-Salentino, a form of Dorian Greek still spoken by about 25,000 people in Southern Italy. I would consider Merritt Ruhlen a good authority on language classification any day, despite his unpopularity with some linguists. He has even been published in “Scientific American.”

  2. I’m afraid that the fact that Scientific American has published Merritt Ruhlen is an indication of the incompetence of Scientific American in matters linguistic rather than an endorsement of Ruhlen’s linguistics. Ruhlen isn’t unpopular with just “some linguists”. Rather, his approach to classification is rejected by the great majority of historical linguists. Not only is his method of demonstrating genetic relationship considered unreliable, but serious problems have been found in the data he uses. Moreover, he does not have any technique for classifying languages. In order to classify, it isn’t sufficient to be able to show a relationship – you’ve got to be able to show degrees of relationship – and Ruhlen has never explained how he can do that or offered evidence for his claims.
    Incidentally, the relationship of Thai to Malayo-Polynesian is not generally accepted, nor is a relationship between Hokan and Siouan or Hokan and Iroquoian.

  3. “Furthermore, Gothic does not appear to have been quite ‘Germanic’”
    Really? Must brush up, then. Is there any online info on that?

  4. Ethnologue may be all right for language classifications, but it contains numerous errors in describing languages.
    I have checked their entry on Flemish myself, and Danish and Italian colleagues have told me that the information on their languages is erroneous too.
    As far as Flemish is concerned, their entry is totally outdated and incorrect.

  5. Furthermore, Gothic does not appear to have been quite ‘Germanic’
    Huh? Explain, please?
    Also, what Bill Poser said. I’m afraid Scientific American can hardly be considered a keeper of the flame of accuracy in linguistics.

  6. For South-East Asia, there’s South-East Asia: Languages and Literatures: A Select Guide. I am by no means an expert, so I would love to know whether it’s considered any good. It goes country-by-country (plus the Chinese diaspora). Not much depth (it’s a thin book), but a bibliography for each.
    It’s not nearly as popular in LibraryThing as Bill Poser’s other recommendations (both positive and negative). I don’t know whether that’s a bad sign.

  7. Harold’s and my book (thanks for the mention!) isn’t exhaustive by any means. I can’t actually think of any generally available classification that’s accurate, although there are several works in progress. The Wikipedia page isn’t too bad. The Pama-Nyungan section is here.

  8. And here’s the main entry: Classification of Australian languages. Not perfect, but not too bad.

  9. And third comment lucky – actually I take that back about Wikipedia’s general Australian classification.

  10. Ian Myles Slater says:

    I’m also trying to figure out how there can be a problem with the classification of Gothic.
    Ulfilas’ translation from the Greek Bible is the main source of Gothic grammar, so it isn’t exactly a pure sample of “native speech.”
    And Gothic, being not only older by some centuries, but East Germanic, naturally doesn’t fit perfectly alongside of any of the West and North Germanic varieties attested later.
    Which is sort of the point of classifying them as belonging to different branches and periods!

  11. There have been some doubts as to whether the language of Ulfila’s Bible really reflects the actual language of the Goths (e.g. Graeme Davis in “codex argenteus: lingua gotorum aut lingua gotica?” Journal of Language and Linguistics, Volume 1 Number 3 2002), but to my knowledge, no one has disputed that Gothic is a Germanic language.
    My issue with Ethnologue is that of classifying Jewish languages, especially Judeo-Arabic. Then again, though it is not perfect, I very much like its classification of creole languages.

  12. Ian Myles Slater says:

    Yes, Ulfilas is less than satisfactory as a sample of Gothic. I, like others, was trying to figure out what could possibly have produced an earlier comment that “Furthermore, Gothic does not appear to have been quite ‘Germanic’.”
    I was trying to be very generous about how the fact that the only continuous Gothic text we have is itself a translation (in places a translation of a translation) might produce an initial impression in the curious that Gothic syntax was aberrant.
    Frankly, I would suppose that the confusion arose at second, third or fourth hand, instead. Possibly from something as simple as a comment that Gothic is not the ancestor of any living Germanic language.

  13. The fundamental problem with Ethnologue classifications is that they are not clearly sourced. They aren’t original either, so how good they are depends entirely on how good their sources were – and their sources vary substantially from region to region. Ruhlen 1991, no matter what you think of his own methodology, has a big advantage over the Ethnologue: he sources every classification, and where the subclassification of a family is controversial, he regularly gives several different trees indicating different historical linguists’ theories.

  14. If you’re going to investigate further, Ruhlen is useful because of the sourcing, but for someone who is looking for a first-order reference, I would recommend the Ethnologue. Better than either are some of the reference books dealing with particular regions and language families that I suggest over at Language Log.

  15. Re: “Furthermore, Gothic does not appear to have been quite ‘Germanic’ Huh? Explain, please?
    Also, what Bill Poser said. I’m afraid Scientific American can hardly be considered a keeper of the flame of accuracy in linguistics.”
    — Language Hat
    Hi, y’all. Let me start out by saying that taxonomies in linguistics appear to be much the same as those in biology, botany or astronomy. While some classifications are certain, others are not. I remember about 35 years ago when most scientists thought that the panda bear was more closely related to the raccoon than the bear. Now they’ve decided it is a member of the bear family (Ursidae). See following link if interested:
    I also remember about this same time (early 1970’s) that most anthropologists decided that Neanderthal was not a separate hominid species but merely a crude form of Homo sapiens. Since that time, they have decided that he was a separate species again – Homo neanderthalensis. Bats were likewise put in the Primate family but have since then been taken out and put back into Vespertilionidae.
    Twenty-five years ago, astronomers were pretty sure about what a “star” was, what a “planet” was and what a “moon (or satellite)” was, but the discovery since then of so-called “brown dwarfs” and of “Kuiper belt objects”, to which Pluto and Charon may belong, has made them unsure.
    Many botanists and biologists now believe that everything in nature is on a continuum and that, unfortunately, not all plants and animals on the earth can be neatly packaged into the old taxonomies first proposed by Carl Linnaeus in the 18th century: phylum, family, genus, species etc. Today, astronomers are more confused than ever about what constitutes a “planet”, though a few have said that they don’t consider it important to their work.
    In languages we encounter a similar phenomenon even though once again it may still not be a major problem for people doing work in this area like Mr. Poser.
    Nevertheless, the problem with classifying a language like Gothic as “Germanic”, pure and simple, is that Gothic shares numerous isoglosses and morpho-syntactical features with other Indo-European languages outside the Germanic branch too. For example, aqizi “ax” is just as close to Ancient Greek axinê “ax” as it is to the modern English word; nauths (“need”) is very close to Old Prussian nautei “of / to the need”; hlaiba “bread” is actually closer to Russian khleb “bread” than modern English bread and German brot; bruth-faths “bridegroom” contains an element pats / pets also found in Tocharian, where it means “husband” (V. Orel 2003); and alakjô “together” is cognate with the Greek Ho laos opas “All the people” (Balg 1889). These examples are just the tip of the iceberg.
    The same kinds of problems also exist with classifying Latin as “Italo-Celtic”. A few linguists even claim that it is different enough from Oscan and Umbrian that it shouldn’t even be classified as “Italic” but as an independent member of the Indo-European family, much like Albanian, Greek and Armenian.
    Merritt Ruhlen has a reasonable hypothesis: if Africa was the cradle of mankind (the human race), as most anthropologists now believe, and all humans in the world descend from a single woman in Africa (“Eve”) who lived about 150,000 years ago, then somewhere back in time we all spoke a common language. Every language spoken in the world today is a descendant of this proto-human language.
    I’ve never heard anyone dispute “Scientific American”’s reputation on any subject. The only negative criticism about it I’ve heard is from one co-worker who told me that he gave up on it because he “found the articles too difficult to read” and another who told me that I shouldn’t be reading it because it was an “un-Christian publication.” The magazine is not a newcomer to the linguistics field. It has been around since 1845 and was publishing articles on the subject before any of us were even born. I only regret that it doesn’t publish more articles on anthropology and linguistics and has become too devoted to the physical sciences in recent decades.

  16. Owlmirror says:

    Bats were likewise put in the Primate family but have since then been taken out and put back into Vespertilionidae.
    This is, I believe, a misstatement. It was suspected that bats were related to primates, but I don’t think there was a strong movement to classify them as primates.
    Coincidentally enough, Pharyngula recently posted about bats, and a comment to that post mentioned that it was suggested that bats and primates could be grouped into the same superorder, Archonta. However, genetic tests determined that bats and primates were less closely related than it appeared, and that classification was rejected.
    Molecular Phylogeny of the Superorder Archonta

    Phylogenetic analyses of the sequences give evidence that primates, tree shrews, and flying lemurs have a recent common ancestor but that bats are genealogically distant.

  17. Owlmirror says:

    hlaiba “bread” is actually closer to Russian khleb “bread” than modern English bread and German brot;
    I am sure that those with more linguistic experience already know this, but I always understood that the original Anglo-Saxon word for bread was “hlaf”, cognate with modern English “loaf”.
    The OED etymology for loaf says:

    [Com. Teut.: OE. hláf masc. = OHG. and MHG. leip, inflected leib-, bread, loaf (mod.G. laib, also written leib, loaf), ON. hleif-r loaf (Da., MSw. lev), Goth. hlaif-s bread […] :OTeut. *hlaiƀo-z.
      Whether the sense of ‘bread’ or that of ‘loaf’ is the earlier is uncertain, as the ulterior etymology is obscure. For many doubtful conjectures see Uhlenbeck Gotische Etymologie s.v. hlaifs. Some have suggested connexion with OE. hlífian to rise high, tower, the reference being supposed to be to the ‘rising’ of leavened bread. Outside Teut. the following synonymous words are certainly in some way connected (most probably adopted from Teut.): OSl. χlěbŭ (Russian khleb), Lith. klêpas, Lettish klaips, Finnish leipä, Esthonian leip. It has been supposed by some that the initial element in G. lebkuchen, lebzelter, gingerbread, is an ablaut-variant of this word.]

  18. There was at least one scientist in the 1960’s who entertained the idea of reclassifying bats as primates, partly on the grounds that they have five fingers, although I don’t remember his name. This was, of course, before we knew much about DNA. Still, bats are considered some of the closer relatives of tree shrews and primates in class Mammalia, despite the fact that early scientists thought they were more closely related to rodents.

  19. Mike Maxwell says:

    > I’ve never heard anyone dispute Scientific
    > American’s reputation on any subject.
    Hard science, maybe not. But they do appear to have a “position” (often politically influenced), and most of what they have written on linguistics is, IMHO, borderline. You might also look at their Y2K article, done in 1979, with its best-case and worst-case scenarios, a typical example of what I would call a politically influenced article. It makes the White House’s errors on weapons of mass destruction in Iraq look trivial.

