The World Atlas of Language Structures is a very interesting project which “is in preparation under the editorship of Bernard Comrie, Matthew Dryer, David Gil, and Martin Haspelmath.”

This Atlas will show structural features of languages in much the same way as linguistic data are displayed in dialect atlases. It will, so to speak, show us the isoglosses of the dialects of Human Language. We envisage an Atlas with about 100 structural features, each shown on a two-page global map and accompanied by a two-page description and discussion of the feature. To make areal patterns visible, each feature needs to be mapped for at least 150 languages, and ideally more than 200. In addition to the printed version, we envisage a fully searchable CD-ROM version.

A Guardian article about it doesn’t actually provide much information but does have this amusing quote:

Roland Kriessling, a linguist specialising in African languages, said: “In Namibia, there are many languages which sound completely bizarre to the western ear.
“!Xoop, for example, has different clicking sounds, including the tut, the horse’s hoof sound and the kiss. The phonetic complexity of !Xoop could put it into the Guinness Book of Records.”

Thanks for the link, Pat!

Addendum. One of the contributors to the Atlas wrote me as follows:

Somebody who commented on your post spoke of sparsity of data, and my honest opinion is that that is not a fair assessment. We all got a list of a core sample of 100 languages we were expected to investigate, plus another 100 we were strongly urged to investigate. For the chapters I worked on, I looked at every single one of those 200 languages, plus over 100 more. We had access to experts on most of the 200 languages to make up for gaps in written documentation. I have seen a few chapters that indeed fall short of current standards in linguistic typology (there simply were too few languages in the sample), but most chapters are based on sufficient data, in my opinion. Of course you can always say that the picture isn’t nearly complete (it would take a large team and tons of money to investigate anything close to all of the Earth’s languages even for a single feature), but both in terms of topics and languages covered, I don’t think “sparsity” is a valid characterization.

Update (Aug. 2020). The World Atlas of Language Structures (WALS) now has its own site.


  1. I note that the 200-language sample recognizes three distinct groups of Caucasian languages, making that area of the world even more interesting than it had been.

  2. Interesting that the very first sentence of the Guardian article is wrong, or at least strongly misleading. Japanese also has many, many words for “you”, probably well more than 6 if you count every variation – kimi, anata, anta, omae, kisama, sochira, etc. Any of these words can be very rude in the wrong social context, and in the hands of unprepared gaijin, but to say that Japanese consider it rude to say “you” is a gross oversimplification.

  3. I saw Haspelmath demonstrate the Atlas at LSA/SI last week. A linguist told me that the copy Haspelmath was holding up was the only one in the Western Hemisphere.
    The software looks like it works. The major shortcoming is obviously sparsity of data. They have a couple of hundred features (like a typological questionnaire), and there is a core of a couple of hundred languages for which they have data on all the features. But each feature’s distribution was collected by an expert in that feature, with the result that outside the core, the language repertoire is highly variable. It would be nice if they had a path forward, involving distributing their questionnaire to experts in each language, so that they could get more exhaustive coverage.
    (P.S. I tried to use the word “speci*list” instead of “expert”, but your spam filter rejected it because it thought I was talking about ci*lis.)

  4. True, it is a gross simplification to say the Japanese find it rude to say ‘you’. But it is, I believe, true to say that Japanese is a language which goes out of its way to avoid saying ‘you’ straight out.

  5. Funny, I don’t see either Ukrainian or Belorussian there…

  6. In fact, Russian is the only Slavic language. That’s presumably because the kinds of features they’re working with are identical for all members of the Slavic family… but then why do they have both French and Spanish and both German and English? Aren’t they as close as, say, Serbo-Croatian and Russian? Hmm.

  7. That is unusual, considering that Ukrainian has a synthetic future in place of the periphrastic Russian, and Polish has some interesting animate/inanimate variations.

  8. For Slavic languages, it would have been interesting to get a thorough overview of Bulgarian vs. others.
    With a sample of a mere 200 languages, I suppose there won’t be very many South Asian languages or Chinese languages/varieties/dialects.
    Also, it would be great to have ancient/extinct languages included. Imagine following syntax features from Hindi back through the ages, to find when (and preferably also from where) the ergative appeared. Sigh.
