It’s a pleasure to be able to offer unalloyed praise for a NY Times story about linguistics, Michael Erard’s “How Linguists and Missionaries Share a Bible of 6,912 Languages.” I’ve been using Ethnologue in print form since I was in college (its availability online at no cost is one of the best things about the internet), and it was interesting to learn that it started as far back as 1951. There are some great quotes in the piece:

“I occasionally note in my comments to the press,” said Nicholas Ostler, the president of the Foundation for Endangered Languages, “the irony that Ethnologue’s total count of known languages keeps going up with each four-yearly edition, even as we solemnly intone the factoid that a language dies out every two weeks.”

This dissonance points to a more basic problem. “There’s no actual number of languages,” said Merritt Ruhlen, a linguist at Stanford whose own count is “around” 4,580. “It kind of depends on how one defines dialects and languages.”

The linguists behind the Ethnologue agree that the distinctions can be indistinct. “We tend to see languages as basically marbles, and we’re trying to get all the marbles in our bag and count how many marbles we have,” said M. Paul Lewis, a linguist who manages the Ethnologue database and will edit the 16th edition. “Language is a lot more like oatmeal, where there are some clearly defined units but it’s very fuzzy around the edges.”

The Yiddish linguist Max Weinreich once famously said, “A shprakh iz a dialekt mit an armey un a flot” (or “a language is a dialect with an army and a navy”). To Ethnologue, and to the language research organization that produces it, S.I.L. International, a language is a dialect that needs its literature, including a Bible.

I love the fact that he worked a Yiddish quote into a piece about a Christian organization, and remember, folks, you heard it here first! (I was wondering why I chose the spelling “diyalekt” in that entry, but it seems I picked it up from here; in any case, Erard’s version is indisputably better.)

Update. See now UJG’s post, with an actual image of Weinreich’s original Yiddish.


  1. I come not to echo praise
    1: Ethnologue are the most outrageous splitters in the business, for (dubious) theological reasons, as acknowledged in the last sentence.
    2: There’s lots else that’s dubious about Ethnologue’s parent organisation (and for theological reasons), if rumours are to be believed.
    3: Merritt “Protoworld” Ruhlen a “linguist”, yet? He’s most of the reason NYT stories on linguistics bit the bag bigtime. (Allegedly.)
    4: What’s a “Yiddish linguist”? Yiddish was never a nationality, after all.
    5: I didn’t read it here first.

  2. des von bladet says

    “not to echo praise but to alloy it”.
    Bad comment box! No biscuit!

  3. Ethnologue is certainly circumspect about their underlying Bible-spreading mission — I used it a lot in recent research, and did wonder about the annotation in many language entries providing dates for Bibles published in the language, but never realized there was a Christian organization behind it all. I would agree that should not detract from the legitimacy of the data.

  4. The sponsors of Ethnologue are the Summer Institute of Linguistics and its sister organization, the Wycliffe Bible Translators. The latter name tends to be emphasized when fundraising in Christian countries, the former when proposing literacy projects to governments hostile to Christian evangelism.
    It’s easy to see why Ethnologue overestimates the number of actual languages or dialects in various parts of the globe. If, say, Indonesia prohibits the SIL from translating the Bible into a language the majority of whose speakers are already Muslim (and thereby off-limits to Christian evangelism), then SIL can sometimes find communities of still heathen speakers whose dialectal differences will acquire enough prominence to justify a translation. Perhaps Highland Konjo (mixed Muslim, Christian) vs. Coastal Konjo (Muslim) are such examples within the Makassar dialects/languages of South Sulawesi.
    In other cases the same language may be spoken on both sides of a national border, where the national languages on the two sides differ. If one gov’t is hostile to SIL, the other more welcoming, then the border-straddling language or dialect is likely to be carved into two. It’s easy to justify such decisions on such practical grounds as the different spelling conventions or loanword origins (say, Dutch vs. Portuguese) of the otherwise mutually intelligible languages. Sort of like Serbian and Croatian.

  5. The problem is that Ethnologue, due to its missionary background, is actively looking for new languages that have not yet been blessed with ‘god’s message’ in their own tongue.
    “Seek and you will find” Matthew 7:7
    Are you surprised that they always find new souls to be saved?

  6. Robert Staubs says

    Ethnologue does certainly split quite a lot, but I don’t see how that would detract from the value of their data. As with any survey, you have to be conscious of which groups are represented by each label. If you happen to be working with what Ethnologue designates as “Eastern” and “Western” varieties as one language then you will obviously not be able to use the data just as it is but it will still be useful.
    Data is all in the interpretation, which requires a lot of vigilance in the interpreter in this area.

  7. Yeah, parent organization dubious, mixed motives, blah blah, frankly my dear I don’t give a damn. I don’t see the United Atheist Linguists out there duplicating their work. If the only soup I can get is at the Muggletonian soup kitchen, I’ll eat Muggletonian soup and like it. Also, a “Yiddish linguist” is a linguist who works with Yiddish. Duh.

  8. Regarding splitting, it would seem that tiny dialects would benefit from being under the umbrella of a big literary language. If Germany had 42 Bibles for 42 German dialects, to me that would be a sign that the various German peoples were not a unity, and perhaps had been split up by their enemies in order to make them weak.
    At LH so-called “dialects” tend to be supported against “literary languages”, but a people or group of peoples with no unifying literary language is pretty vulnerable.
    Rebuttal to follow, I’m sure.

  9. Yep. The SILers are out there in the field documenting far more languages and dialects than university professors could ever dream of doing. And raising funds from private sources to do it, too. I benefitted from a whole lot of SIL data in my own dissertation research. And from a whole lot of older German missionary data, too.

  10. Thanks for pointing the site out. No thanks for making me miss my train.

  11. The Educational CyberPlayGround
    This is the site where the Yiddish quote was found but never cited.
    Language in Society 26:3 (1997), p.469
    by WB — William Bright
    It would have been nice to have the Educational CyberPlayGround’s name mentioned so that others would know where to look for more information about Dialect Speakers.
    “(I was wondering why I chose the spelling “diyalekt” in that entry, but it seems I picked it up from here”
    Attribution goes to:
    Educational CyberPlayGround
    Linguistics Area
    I hope others will go over and check it out.
    Educational CyberPlayGround

  12. Richard Hershberger says

    What struck me about the article is, Merritt Ruhlen actually got a job in linguistics? At Stanford? Can anyone shed some light on this?

  13. What’s Educational CyberPlayGround’s problem? Is s/he angry because s/he was not cited? But there’s a link right there!

  14. What’s Educational CyberPlayGround’s problem?
    I’m not sure. Both LH and I cited the URL. I guess we didn’t mention the CyberPlayGround by name.
    I’ve just blogged a scanned JPEG of the paragraph from Weinreich’s article. The word dialekt is spelled without an extra yod.

  15. I don’t have any linguistics background, but is this in any way analogous to the biological debate about how many species there are?
    There, it’s said that there are two types of biologist: the “lumpers” and the “splitters”.
    The lumpers are those biologists who try to lump together as many sub-species into a single unit as possible; whereas the splitters are those who think that there’s at least two groups of lumper.

  16. bathrobe says

    It is indeed reminiscent of the biological debate, in particular the term ‘splitter’. And as with biology, the situation is a lot more complex than it appears on the surface.
    With species the classic touchstone is the ability to interbreed. With languages it is ‘mutual intelligibility’.
    But what happens when two populations are physically capable of interbreeding but don’t because they have discrete ranges? Perhaps this is similar to Malay and Indonesian (or even Dutch and Flemish), mutually intelligible but kept apart by politics and lack of contact (do Indonesians and Malaysians read each other’s newspapers? Do Dutch and Flemings?). Or when a species is capable of interbreeding but doesn’t because of different song or courtship behaviour? Perhaps this is analogous to Hindi and Urdu, or Serbian and Croatian, with their different alphabets.

  17. As I understand it, Malaysians and Indonesians do indeed read and hear each other’s news, since their own national media are often more constrained by their own governments than the neighboring media are. Plus, there are scads of Indonesians working in Malaysia, both legally and illegally. I suspect the two national languages are more mutually intelligible than many of the regional vernacular Malay dialects. At least that’s my secondhand impression.
