The World Atlas of Language Structures is now freely available online:

WALS is a large database of structural (phonological, grammatical, lexical) properties of languages gathered from descriptive materials (such as reference grammars) by a team of more than 40 authors (many of them the leading authorities on the subject).
WALS consists of 141 maps with accompanying texts on diverse features (such as vowel inventory size, noun-genitive order, passive constructions, and “hand”/“arm” polysemy), each of which is the responsibility of a single author (or team of authors). Each map shows between 120 and 1110 languages, each language being represented by a symbol, and different symbols showing different values of the feature. Altogether 2,650 languages are shown on the maps, and more than 58,000 datapoints give information on features in particular languages.
WALS thus makes information on the structural diversity of the world’s languages available to a large audience, including interested nonlinguists as well as linguists who would not normally read grammars of exotic languages or specialized works by comparative linguists.

I’ve only had a chance to dip into this, but I look forward to exploring it at length. Many thanks to Casey (of Belletra) for the heads-up!


  1. I just put up a fairly detailed review of it at The Ideophone.

  2. Geez, this is fantastic. I only found one thing that irks me so far: the base maps for different subsections are different, and the one used for phonology leaves out a ton of North American languages. Lakota sure looks lonely all by itself. Of course, Siouan, Algonquian, and Iroquoian languages aren’t particularly phonologically interesting (especially compared to other American languages), but neither are European languages, and they don’t get left off the map.
    Still, though, this is amazing. I’m particularly enjoying finding which features show surprising areality and which don’t.

  3. SnowLeopard says

    The bibliographies also seem handy for the non-linguist who enjoys seeking out exotic grammars. They seem to have usually hit the key works for the handful of languages I checked. I’m surprised, though, that the source for the (extinct) Khoisan language \Xam is listed as “Anonymous 4” from the year 1000. Whatever the professional view (I don’t know it) of her skills as an amateur linguist in South Africa a century ago, I would have thought Dorothea Bleek’s grammatical sketch of \Xam and her massive Bushmen Dictionary would at least have been preferable to a fictitious citation.

  4. In the same week that WALS has come out with its 140+ maps, I’ve managed to come up with my first on Austronesian Number Systems covering Papua New Guinea, at:
    It obviously needs improvement; especially better line-drawing and some of those nifty little hyperlink buttons that refer to other pages.
    In the chapter on my speciality, Numeral (Bases), Bernard Comrie notes: “Non-decimal numeral systems are even more endangered than the languages in which they occur”.
    It’s worse than that: on my Philippine island, nobody knows the native words for any number higher than 5, and this is the case throughout the Austronesian area.

  5. I’ve just checked the Thai language. Interesting site; it seems exhaustive, but it’s pretty technical.

  6. (translator: Fixed that for you!)

  7. This is certainly a wonderful resource. Following your first link I feared it was going to cost me $595 to access it, but the second one confirms that it is indeed freely available.
    Of course, I realize that in a survey of this kind the authors are forced to use languages for which the information is readily available because someone has done the necessary research, but I was still a little disappointed in the cavalier way in which Provençal is treated. It appears in just one map, which tells us that it has a word for tea based on the Min Nan Chinese te. Given that it shares this feature with all of its Romance neighbours — French, Occitan, Catalan, Spanish, Romansch and Italian — it is difficult to regard this as its most interesting characteristic.
    Provençal actually has two features that I find more noteworthy. First of all, in the form formalized by Frédéric Mistral it is unique (as far as I can discover) among European languages in having invariant nouns, with no plural markers. Although it shares this to a limited degree with spoken French, it certainly doesn’t share it with written French. Second, it uses -o as the usual feminine suffix, which makes it unusual, maybe unique, among Romance languages.

  8. Good lord, your comment made me realize I’d neglected to add a link to the online site (aside from the “features” link)! Thanks for alerting me; I’ll fix that now.

  9. marie-lucie says

    The Provençal of Frédéric Mistral is but one of the varieties of Occitan.
    One problem with the current standardized Occitan (based on the speech of the Montpellier area) is that it has a very conservative spelling which approximates that of the language used by the medieval troubadours but is often at odds with what speakers are used to as sound-spelling correspondences. The feminine suffix written -a is pronounced [o] in the vast majority of the modern Occitan varieties (but apparently still [a] in Montpellier), which also use [s] in the plural of nouns. Occitan spelling (“la graphie occitane”) uses the letter o for the sound [u], but also for the sound of open (low) o if stressed (as indicated by a diacritic). Another feature of the graphie occitane is that it uses an etymological sequence of vowel + nasal consonant at the end of words where actual speech only uses a vowel (thus pronunciation is as in Catalan, a word which is pronounced by its speakers “katala”, but Catalan spelling follows the pronunciation rather than the etymology, so Catala with a diacritic on the last syllable). Since most of the people reading Occitan are literate in French, the archaizing graphie occitane makes it difficult to infer what the actual pronunciation should be, or to guess spelling from the pronunciation. For instance, I once read something written in French by an Occitan speaker, in which he referred to a kind of large cooking pot as topin, a disguise under which I at first had trouble recognizing what some members of my family (originally from Languedoc) called [tupi] and would have written French-style as toupi. A French person without any connection with the language would undoubtedly have pronounced the Occitan word written topin as if it were French, with the first vowel o and the second vowel nasalized, as in French. Among the necessities of life, my relatives would have counted [de pa, de bi e d’aygo] “bread, wine and water”, written in Occitan spelling as de pan, de vin e d’aiga.
    A sign of affection is [yn putu] ‘a kiss’, used in local French as un poutou but written in Occitan spelling as un poton.
    Not everyone (least of all true speakers) is happy with the Occitan standardized spelling which blurs dialectal differences and makes Occitan into a kind of “Esperanto” (as an old friend of my family expressed it).

  10. I had no idea about graphie occitane—thanks for a very enlightening comment!
