Researchers at the World Oral Literature Project have compiled a database of language endangerment, described in a Cambridge press release:

An open database of endangered languages has been launched by researchers in the hope of creating a free, online portal that will give people access to the world’s disappearing spoken traditions. …
It includes records for 3,524 world languages, from those deemed “vulnerable”, to those that, like Latin, remain well understood but are effectively moribund or extinct.
Researchers hope that the pilot database will enable them to “crowd-source” information from all over the world about both the languages themselves and the stories, songs, myths, folklore and other traditions that they convey.
Users can search by the number of speakers, level of endangerment, region or country. … Of the 3,524 languages listed, about 150 are in an extremely critical condition. In many of these cases, the number of known living speakers has fallen to single figures, or even just one.

Here’s a Telegraph story about it by Andy Bloxham, and here’s a column by Dr. Mark Turin, director of the project, who says:

We will only succeed, however, if the project is of use and interest to indigenous communities themselves. While Cambridge may be the location where materials are hosted and maintained, both physically and digitally, communities will require copies of the output so that future generations can access and understand the cultural knowledge and language of their ancestors.
Generations of anthropologists have had the privilege of working with indigenous communities and have recorded volumes of oral literature while in the field, but many of our colleagues have not known what to do with these recordings once they finish analysing them. The World Oral Literature Project can provide a way for the material that has been gathered to be preserved and to be disseminated in innovative ways, when that is ethically and culturally appropriate.

(Thanks, peacay and Paul!)


  1. The database has an entry for Europanto, a language that doesn’t exist and never has existed, except as a joke. Kind of throws some doubt on the general quality. The source of that error is – of course – Ethnologue. How can anyone take Ethnologue seriously? Lots of data, true, but very low quality. I just rechecked the Ethnologue entry about Swedish. Still filled with the same nonsense as it has had for years. And Esperanto is listed as a “language of Poland”. I think it used to be a “language of France”…

  2. The language in Papua New Guinea on which I did fieldwork in 1976, and whose texts I am now faced with retranscribing from pencil to digital media, is incorrectly listed as Numbani rather than Numbami (though Ethnologue has it right), also known as Siboma or Sipoma. It only had 300 or so speakers in 1976, so is correctly (in my view) classified as Vulnerable, but still in general use, as far as I know, especially in the village context.

  3. SnowLeopard says:

    These comments reinforce my own concern, which is that the database’s list of “existing resources” is rather poorly researched at this point. Yup’ik and Navajo, for example, are relatively well-documented languages, as attested by commercially-available books and recordings sitting in my basement. But this database might lead you to think there was nothing out there.

