This article by William O. Beeman (Department of Anthropology, Brown University) fuels my worst suspicions about anthropologists’ shaky grasp of the concepts of linguistics; the first two sentences indicate a distilled confusion such as is rarely achieved: “One of the bedrock principles of linguistic analysis since the nineteenth century has been the principle of the regularity of cognate borrowing. It forms the basis of the ‘comparative method’ not only in linguistics, but in all of social science.” A little later comes a paragraph with a more expansive form of the confusion:

However, there is a limited, but powerful countervailing tendency in language behavior—words that absolutely resist borrowing even from their closest linguistic relatives. These words seem to be coined anew by each population group. Because we expect cognate borrowing as a norm, it is surprising when we encounter these fascinating examples. It makes us wonder about the cultural processes that govern the development of communication systems, and the functional differences between segments of vocabulary.

This leads into his main point, the often-observed fact that most languages, even closely related ones, have entirely different words for ‘butterfly.’ He then lists all those he has collected, which is what makes the page worth linking to. Ignore the miasma and enjoy the variety of (often semi-onomatopoeic) words, to which I will add cipelebesha (Bemba), balanbaalis (Somali), ekiwojjolo (Luganda), ihe n’efe-efe or uru baba (Igbo), lolo (Malagasy), vannatti pucchi (Tamil), titernig (West Armenian), peperuga (Kalderash Romanes [“Gypsy”]), fepule or minni or tirtirk (Kurmanji Kurdish), metelik (Ukrainian), palomma (Neapolitan Italian), kubelek (Tatar), kapalak (Uzbek), göpölök (Kyrgyz) (notice the interesting variations among these three and Turkish kelebek), khovagan (Tuvinian), pepela (Georgian), polla (Chechen). A few corrections: according to my Basque dictionary, the first word should be tximeleta (rather than txipilota), the Hausa word is malam-bude-littafi (first and last a‘s long), and the Latvian word is taurinš (not tauriųš); furthermore, there is no such language as “Senegalese” (anybody know what language lupe lupe is from?). And a couple of alternative Zulu terms: ijubajubane and itwabitwabi. What fun!

For further amusement, I reproduce here the last thing on the page, a footnote quoting a mind-bogglingly lunatic theory of “universal word derivation” that Prof. Beeman apparently takes seriously:

2. However, Isaac Mozeson, author of The Word, a treatise on common word origins, contributed this commentary based on his own theories of universal word derivation:

I had in my “PYRALIDID” entry (appendix A) the PR Greek, the PPL Latin, the Malay PPL and the Nahuatl PPL terms for butterfly. All should be influenced by Hebrew PaR PaR (butterfly) and the PR root of PiRPooR (to twitch). I am grateful for the Tagalog paruparo, and would like to credit the contributor. As for the Paiwan/Taiwan term, two phonemes are at work. One, kali, could be like Hebrew KAL (light, swift), and the other is a duplicated dungudungul, which appears to be a nasalized DIGDAIG (Hebrew for the tickle-like wavering motion of DAG (fish) and DeGel (flag). Needless to say, TICKle itself is a form of this Daled-Gimel root from Edensprach. Lastly, the Autronesian KUPO root could be a form of Ayin-Peh, KHuPh (to fly—see “AVIATE” in THE WORD, p. 26).

(Via wood s lot.)


  1. Right you are about the addictive properties of searching the text in Amazon — Basque guru Larry Trask has some comments about tximeleta.
    It is fun reading lists like this. And it would be great fun to try to build a new one collaboratively with all your readership, don’t you think? I imagine the collective collection of dictionaries on the shelves of Languagehat readers is formidable indeed!

  2. Great idea! All contributions welcome; put ’em in the comments.
    And a great link—the Amazon effect strikes again!

  3. …furthermore, there is no such language as “Senegalese” (anybody know what language lupe lupe is from?)
    Well, the principal language of Senegal is of course Wolof. (I don’t know why I say “of course”, it’s only by coincidence that I know this, although it would not be difficult to find out). The only online dictionary I could find doesn’t include an entry for butterfly, but I did turn up some vocabulary resources for students of the language, which give the Wolof for butterfly as lëpp-lëpp bi. Looks pretty similar to me, although whether it’s the same word distorted by orthography and/or dialect, or a cognate/borrowing in a separate language, I can’t tell.

  4. Yeah, I know Wolof is the main language, and I even have a (French) dictionary, but I couldn’t lay my hands on it, and “lupe lupe” didn’t sound like Wolof. Lëpp-lëpp bi is more like it—thanks for the contribution!

  5. Some anthropologists have a pretty decent grasp of linguistics …

  6. I think that it is worth noting that Beeman is not himself a historical linguist, and that some of the work he cites, such as that of Joseph Greenberg, is well respected – even if controversial. This particular post of his seems to be more of a hobby project than anything to do with his main research (on discourse in Iran). Beeman’s writings on Afghanistan have been invaluable during the past few years.
    I know there is a running theme on these pages attacking any attempt at historical linguistics, but there are some interesting (even if often over-interpreted) relationships between historical linguistic work and research in genetics. While I agree that attempts at constructing an “ur-language” are misplaced, there is much to be gained using linguistic data to supplement other (cultural and genetic) data for the study of migrations and other related questions.

  7. Kerim: I’m being somewhat tongue-in-cheek about anthropologists; of course I’m aware there are ones who know their linguistics (and indeed American linguistics was founded by anthros), but my anthro teacher in college was a complete ignoramus, so I take pleasure in poking them once in a while. No offense meant. And I certainly wouldn’t assume anything about Beeman’s main work based on this sideline; if you say his Afghan stuff is valuable, I’ll definitely check it out.
    I would never attack “any attempt at historical linguistics” (that having been my field), I just think many people have no idea how limited its range is and try to use its concepts (or concepts that they imagine apply) to reach farther back than is possible or to back up theories about (say) cultural development or migrations that linguistics can’t really help with.
    Pat: Why don’t you put up a page where we can keep ‘butterfly’ words? Then we won’t have to visit the one with all the silliness (and we can control the entries — I’m dubious about some of his).

  8. Sorry if I overreacted – I was feeling a little defensive this morning …

  9. Here are a few more butterfly words (all from the net – I’m afraid I don’t have as much in the way of exotic dictionaries as I’d like).
    First off, another existing list. It’s shorter than Beeman’s, but it does include some languages missing from his, and disagrees on some common to both, so it might be useful. A much shorter list here provides a little etymological information.
    In a post on the Constructed Languages list, Dirk Elzinga (mentioned in this Languagehat entry) provides butterfly words in several Uto-Aztecan languages (and Navajo):
    Shoshoni: waayapputunkih
    Kaibab Paiute: aïcïvïtsi (ï = barred-i, c = long-s)
    Luiseño: avéllaka
    Hopi: povolhoya (monarch: hookona)
    Navajo: k’aalógii
    (Note that this is not what Beeman gives for Navajo.) See this followup post for a morphological breakdown of some of these.
    (Finally, on one of the Wolof sites I linked to above, I see that ë is a schwa, and pp is aspirated. It’s noted that some of the other doubled stops are followed by a short epenthetic vowel when they occur finally. This being the case, I can imagine an orthography that would render lëpp-lëpp as lupe lupe.)

  10. A butterfly-like word in the medical-supply world is “tape”. One international 8-language box of tape was labelled “tape/ruban/cinta/sparadrap/pfleister”. There were only 8 languages and Japanese was one of them; pfleister and sparadrap, as I remember, each had one cognate.

  11. Thanks! We’re accumulating quite a swarm of butterfly words here.

  12. Back when I was a budding polyglot, I noticed the similarity of the Hebrew/Arabic, Romance and Latin words for butterfly mentioned in the quote above. An “intermediary form” that the author has missed: Welsh pili-pala (if I remember correctly).

  13. Here’s a list I compiled for my book “A Desert Bestiary” (Johnson Books, 1997). It’s got a few languages not represented in the other lists.
    Acoma, buh’rai
    Arabic, farasha
    Dutch, vlinder
    German, Schmetterling
    Greek, petalou’da
    Hebrew, parpar
    Hungarian, lepke
    Italian, farfalla
    Japanese, chou
    Kyaka Enga, maemae
    Latin, papilio
    Lushootseed, yubec
    Maltese, farfetta
    Nahuatl, papalotl
    Ndumba, kaapura’rora
    Osage, dsithato’ga
    Polish, motyl
    Portuguese, borboleta
    Romanian, fluture
    Russian, babochka
    Spanish, mariposa
    Swahili, kipepeo
    Tohono O’odham, hohokimal
    Turkish, kelebek
    Yaqui, vaisevo’i

  14. nutrition supplements says

    Right on my man!

  15. Once again a piece of spam I can’t bring myself to delete, only to erase the URL; how can I deprive myself of “Right on my man!”?

  16. In Chinese – “hudie”
    In Russian except of “babochka” also can be “motyliok” =D

  17. Charles Perry says

    The people of Harer in SE Ethiopia have a charming word for butterfly. They speak a Semitic language related to the Amharic of the highlands, but unlike the Amharas, they are Muslims and use the Arabic alphabet. Their word for butterfly is amhara kitab “Amharic book,” because to them the Amharic alphabet resembles the markings on a butterfly’s wing.

  18. That is nice—thanks!

  19. Late to the party but I can add a term common to the Jívaro languages – wámpishuk
    Also I note the Maori word given is pulelehua, which seems pretty unlikely given that Maori has no /l/. Sure enough, my dictionary gives ‘purerehua’ (with the first /u/ long).

  20. michael farris says

    Mvskoke (Creek): tvffolopv, tvffolope (nb. v = [a] and e = [i], both short)
    grasshopper is tvffo so -lopv / -lope looks like it should mean something, (e)lope is “liver” but that doesn’t look promising. There’s a verb lopicetv (nb. i = [e(y)]) which looks like lop + causative but it’s intransitive and means ‘to be nice, kind, well-behaved’.

  21. Kobelek in Kazakh
    Zimerfoigele in Yiddish
    Motyl in Yiddish

  22. Tatar should be күбәләк (approximately kybäläk), as is Bashkir.

  23. S. Valkemirer says

    The notions of Isaac E. Mozeson, the etymological maniac who believes that Hebrew was the original language of the human race, were shredded in 1990 and 1995:

    Gold, David L. 1990. “Fiction or Medieval Philology (on Isaac E. Mozeson’s The Word: The Dictionary That Reveals the Hebrew Source of English).” Jewish Linguistic Studies. Vol. 2. Pp. 105-133.

    Gold, David L. 1995. “When Religion Intrudes into Etymology (On The Word: The Dictionary That Reveals The Hebrew Source of English).” In Kachru and Kahane 1995:369-380.

    Kachru, Braj B., and Henry Kahane, eds. 1995. Cultures, Ideologies, and the Dictionary: Studies in Honor of Ladislav Zgusta [= Lexicographica: Series Maior, vol. 64]. Tübingen. Max Niemeyer Verlag.

    Speaking of etymological monomaniacs (an established linguistic term, not my coinage), one may also mention Daniel Cassidy (in a series of posts beginning in March 2019 [https://cassidyslangscam.wordpress.com/2019/03/], Danielomastix has debunked the notions in Cassidy’s book How the Irish Invented Slang) and Charles MacKay (whose Dictionary of Lowland Scotch was debunked in the nineteenth century, though it is still available in a reprint).

    One of the endnotes in Leonard Bloomfield’s Language (1933) mentions a Dutch etymological monomaniac and maybe also a Greek one.

  24. Naturally, there’s a post and a discussion about it.

  25. January First-of-May says

    a Dutch etymological monomaniac

    That must have been Goropius Becanus, the subject of the first ever Language Hat post.

    (Of course, since his “Dutch” was the Antwerp dialect, technically it should probably be classified as Flemish.)

    In Russian except of “babochka” also can be “motyliok” =D

    The former means “butterfly”, the latter means “moth” (and IIRC is cognate to it); those are subtly distinct concepts, though I suspect that many languages don’t have separate words for them.

  26. Trond Engen says

    The Antwerp dialect is Flemish in a modern socio-linguistic or political sense, but is it correct to characterize it as Flemish historically? The dialect of the upland of Antwerp is more or less the definition of Brabants. There was a cline from Brabants to Flemish, but mostly on the Flemish side of the Schelde, through East Flanders. The Antwerp city dialect could be something else, but I believe it skewed towards common Dutch features rather than specifically Flemish.

  27. David Marjanović says

    The former means “butterfly”, the latter means “moth” (and IIRC is cognate to it); those are subtly distinct concepts, though I suspect that many languages don’t have separate words for them.

    Moth covers all the nocturnal lepidopterans. German draws the line in a different place: Schmetterling covers the big ones, Motte the normal-insect-sized ones that eat your stuff when they’re larvae.

  28. Lars Mathiesen says

    Danish has sommerfugl (all of the lepidopterans, according to entomologists, but popularly only the big colourful ones), but also natsværmer (the big, drab nocturnal ones) and møl (the small ones that eat your stuff).

  29. I haven’t the slightest idea whether Russian draws lines anywhere, or how many. For me they overlap: when a large moth spreads wings and turns out to be coloured, the moth is a butterfly, obviously. There is also the common term “night butterfly”.

    – Way of life: Large day time butterflies. Nocturnal ones. Small ones.
    – Camouflage strategy
    – Size.

  30. January First-of-May says

    For me they overlap: when a large moth spreads wings and turns out to be coloured, the moth is a butterfly, obviously.

    Well, duh. That’s kind of obvious. OTOH, IIRC, it’s usually obvious that it was a butterfly in the first place.

    I think the Russian classification is much like the Danish one (except for the entomologist part); in particular, the small ones that eat stuff are neither butterflies (бабочки) nor moths (мотыльки) – they’re a separate category (моли, singular моль, like the unit of measurement except feminine; this name is probably cognate to møl).

  31. бражник

    бра́жник • (brážnik) m anim (genitive бра́жника, nominative plural бра́жники, genitive plural бра́жников)

    1. (obsolete) reveler, drunkard
    2. butterfly of Sphingidae family


  32. Danish has sommerfugl

    No fjæriler at all?

  33. Lars Mathiesen says

    We lost that somewhere, I’m afraid. This article mentions a Norwegian dialect fivreld(e) and guesses that Danish had something similar, but no instances have come down. Sommerfugl seems to have been it since 1500 or so.

    Hellquist (1922) sv. fjäril only mentions (Modern) Icel fiðrildi/fifrildi, cognate to G falter (OHG fifaltra) and L papilio. (fifaltra vel sim > fifrildi by metathesis > fiðrildi by contamination from the ending > the modern form, is the guess).

  34. To me, the proper phylogenetic distinction between butterflies and moths is the most salient. I think this is because when I was five, at the experimental preschool I attended (which had just moved from the church where it was founded to a 1920s mansion they got for cheap), one of the teachers found a moth cocoon in the yard and brought it inside in a glass-sided box, so we could see the moth up close after it emerged. The fibrous cocoon and then the beautifully intricate comb-shaped antennae made a deep impression on me, so those are really what I think of in connection with moths.

  35. Danmarks Dagsommerfugle på nettet


  36. Lars Mathiesen says

    @Brett, so what is the proper distinction? Moths are Lepidoptera minus Rhopalocera/Papilionoidea? (I only know what WP tells me, though, and the clades seem to be works in progress).

    And yes, dagsommerfugl is entomologist for Rhopalocera = butterfly.

  37. David Marjanović says

    German has the technical terms Tagfalter and Nachtfalter; I think Nachtschwärmer is a subset of that (apart from being used for people who don’t go home at night). Falter alone is also technical. It’s from falten, “fold”.

    (…But “folder” is Ordner, one who makes order, or Mappe, especially the ringless ones.)

  38. fifaltra vel sim > fifrildi by metathesis > fiðrildi

    Moroccan Arabic fartūt Tunisian/Algerian Arabic/Berber ferṭeṭṭu* Malta farfett Italian farfalla

    *with variants.

  39. @Lars Mathiesen: I guess what I mean is that I consider true butterflies and true moths to be highly derived groups. Butterflies would be Rhopalocera, and moths the rest of Obtectomera. There are a lot of lepidopterans that I wouldn’t look at and think of as either moths or butterflies, really. They might as well be caddisflies to me.

    However, upon thinking about it, I realize that my mental categorization really only applies to the imago stage (and even with imagoes, I would probably be apt to make many errors in practice). I am not adept enough to identify eggs or larvae except for a few very distinctive types (like monarch caterpillars). However, I would probably identify a much broader collection of pupa types as moths or butterflies than I would the adult forms.

  40. Lars Mathiesen says

    @DM, DWDS has Falter < fifaltra with the same reduplication as L papilio but cognate with Flattern, ultimately < *pel-. DWDS also has falten ultimately < (another) *pel-, with different senses. Are they wrong?

  41. David Marjanović says

    I have no idea, so they’re probably right and I’ve probably just repeated a reanalysis. I can only confirm that flattern “to flap” exists (and is in common use).

    BTW, that must have been the last vestige of reduplication in OHG.

  42. flattern “to flap”

    In some contexts – butterflies and flags – “flutter” is another word that is used in English. With Flattern of car wheels it would be “wobble”, I think.

  43. There is also флаттер in aviation:

    German Wiki: Flattern_(Luftfahrt),
    from more general:
    English Wiki: Aeroelasticity#Flutter

    I intended to post these two links without comments but the spam filter won’t let me.

  44. “Flutter” as a deformation instability of an object in a fluid flow is familiar to me.

  45. Lars Mathiesen says

    Audio tape systems had wow and flutter (cyclical playback speed variations below and above 4Hz). Digital systems have jitter instead.

  46. Stu Clayton says

    wow and flutter

    I heard these terms decades ago, but never knew exactly what was meant. From a few minutes of intersearching, I get the impression that German doesn’t have an equivalent fixed pair of contrasting words. The analog audio and video phenomena are both covered by Gleichlaufschwankung(en). For audio the effects are variously called Leiern, Jaulen, Wimmern. None of these is what I think this “wow” sounds like (Jaulen comes closest mebbe), so it appears I don’t know what this “wow” sounds like. I call on the older generation for enlightenment !

    # Gleichlaufstörungen werden ab 0,2…0,3 % vom durchschnittlichen Gehör bemerkt und führen ab einer bestimmten Stärke zu hörbaren Tonhöhenschwankungen, die als “Leiern”, “Jaulen” oder “Wimmern” wahrgenommen werden. #

  47. Re Falter / falten, there is this old piece of business humour (pun works only in German):
    Wer glaubt, dass ein Projektleiter Projekte leitet, der glaubt auch, dass ein Zitronenfalter Zitronen faltet. “People who believe that a project manager manages projects, also believe that Gonepteryx rhamni*) folds lemons.”
    *) Posting this was worth it just for learning that this butterfly is called “(Common) brimstone” in English. So some preachers preach fire and butterflies. 😉

  48. I thought I had coined “heckfire and darnation” in this connection, but it has a small but significant number of ghits.

  49. >(Common) brimstone

    Not to be confused with Colias philodice, the common sulphur. The connection between hell and soft yellow butterflies was apparently strong at some point.

    We could use a few Gonepteryx rhamni here, to nibble back the Rhamnus cathartica, among the worst of invasive species in the Midwest. Although on second thought, there was that woman who swallowed the spider to catch the fly …

    Also, there should be a term ghats for the number of times a word has been used in this forum.

  50. Made me chuckle!

  51. David Marjanović says

    Flutter – oh yes, I only had the flapping-as-opposed-to-gliding flight of birds on my mind.

    Jaulen I’d translate as “whelp” out of context.

  52. David Eddyshaw says
  53. If you’re modding more than eight,
    You’re gonna get wow on your top.
    You try to bring that down through your rumble filter to your woofer:
    What’ll you get?
    Flutter on your bottom!

  54. I did not explore https://clics.clld.org before. It is quite convenient:

    pilapila – Kazukuru (Austronesian)
    pili pala – Welsh (IE)
    pilipili – Nhirrpi (Pama-Nyungan)

  55. Stu Clayton says

    Jaulen I’d translate as “whelp” out of context.

    “Howl”. “Whelp” is werfen.

    Heidegger wrote scads about people being geworfen into the world.

  56. David Eddyshaw says

    Kusaal for “butterfly” is pisiŋpiʋŋ, which fits the reduplicative pattern at any rate.

    That’s actually a feature of quite a number of “insect” words in Kusaal, though, e.g. silinsiung “spider”, vulinvuunl “mason wasp”, nɛsinnɛog “centipede”, so it doesn’t seem to be quite the same kind of reduplication as the “butterfly” words in so many other languages. Nothing to do with flapping …

    The Japanese chōchō (as in “doomed lover of Pinkerton”) goes back to a reduplicated Chinese form something like *lep-lep, too.


    EDIT: Yes, we’ve done this before:


    Butterflies are phonaesthetic! (or something)

  57. Stu Clayton says

    “phonaesthetic convergence” is the dubious notion summoned there.

  58. Their word for butterfly is amhara kitab “Amharic book,” because to them the Amharic alphabet resembles the markings on a butterfly’s wing.

    Together with “the Hausa word is malam-bude-littafi (first and last a’s long),” it’s the second African kitab butterfly already.

  59. ktschwarz says

    CLICS lists 1,391 entries for BUTTERFLY (“A lepidopteran that is active at day”), none of which are colexified with anything else — so why is it in the colexification database in the first place? Odd.

    It also lists 34 entries for MOTH (“Group of insects related to butterflies”), all in northern India, and none colexified with anything either. Looks like somebody just entered words glossed “moth” from a couple of datasets, and there’s nobody in charge of checking the rest of the datasets for moth words. Oh well, 2.4% of a loaf is better than none.

    I wonder how many of those 1,391 BUTTERFLY words are actually broader than the English word, or narrower. Above, January FoM, Lars, and David M already mentioned that English “moth” doesn’t match up exactly to words in Russian, Danish, or German.

  60. David Eddyshaw says

    The Hausa literally means “Mister Open-the-Book” (should be malam buɗe littafi.)

    GT gives malam buɗe ido “Mister Open-the-Eye”, which seems rather to miss the metaphor. Or maybe not: I suppose blinking is another way of looking at it … *

    I suspect that the same metaphor underlies the Amharic too, rather than “markings on a butterfly’s wing.”

    * Now I think of it, Kusaal uses the same verb (lak) for “open a book” and “open the eye.” So it may well be an areal thing. Lak is etymologically “un-stick-together.” AFAIK Hausa buɗe itself has much the same range as English “open”, but in Kusaal you don’t lak a door, for example: you yɔ’ɔg it, using the reversive derivative of “close (e.g a door.)”

  61. “Howl”. “Whelp” is werfen.

    DM was probably thinking of “yelp” rather than “whelp”. It seems like “yowl” would be an appropriate translation for jaulen too.

  62. Meanwhile the original article by William O. Beeman, 2000 contains a valuable early attestation of Lameen Souag…

  63. ktschwarz says

    in Kusaal you don’t lak a door, for example: you yɔ’ɔg it, using the reversive derivative of yɔ “close (e.g a door.)”

    Wiktionary says the Welsh verb agor ‘to open’ is similarly formed from a negative prefix plus a verb descended from PIE *ǵʰer- ‘to enclose’ (ancestor of yard and garden). Do they have that right? They cite Morris Jones from 1913.

  64. David Eddyshaw says

    Do they have that right?

    In a word, no.

    Morris-Jones had a notoriously imaginative approach to Welsh etymology …

    The a- cannot possibly be a regular reflex of PIE *n̥- (which gives Welsh an/am/ang, cf cant “hundred”; the real reflex of PIE *n̥- remains as productive as the cognate English “un-“, e.g. amhosibl “impossible.”)

    The attribution of the -gor to PIE *ǵʰer is likewise impossible. If the prefix ended in a vowel, the initial should have been lenited (in which case the /g/ of agor would instead go back to PIE *k); if it ended in a nasal, the initial would have been nasalised (duh!) which it plainly wasn’t.

    GPC attributes agor to a root *kor, which turns up, for example, in esgor “giving birth, emergence.”

  65. disclose the door and unfold your heart…

  66. David Eddyshaw says

    similarly formed from a negative prefix

    Strictly speaking the suffix in question in WOV, *g, is reversive rather than negative.

    WOV reversive *g is actually a bit of a mystery (although reversive suffixes as such are a familiar pan-Niger-Congo thing.) Every other branch of Oti-Volta, even the closely related Buli-Konni, has a suffix *d/t in this role instead.

    Bantu actually has both *k and *t as reversive suffixes, depending on transitivity, so conceivably WOV just happened to select a different one of the pair from all the rest of Oti-Volta, but I must say that seems suspiciously convenient.

    It could be made less implausible by pointing out that the *t/d variant would collide with the *t/d which WOV, alone in Oti-Volta, has generalised as an imperfective-aspect suffix; but you would still need to explain away the absence of *k/g reversives everywhere else in the family somehow.

    I’d like to connect the phenomenon with another, phonological, distinctive thing about WOV, viz that it has lost the palatal stop series, but the data just aren’t there to support it. The only WOV language which hasn’t lost the palatals, Boulba, is so poorly documented that I don’t know what the form of the reversive suffix is there.

  67. “the Amharic too”

    @DE, the original comment referred to Muslim people of Harer, presumably using Arabic script (and speaking Harari). Presumably, markings explain why the kitab is “Amharic”…

    malam buɗe ido – thanks, I did not see it in dictionaries. It does make sense, though google images display tattoos on white backs, and in google there are results from sites like yoo.rs (.rs is Serbia..) translated from Dutch to Hausa… :/ But Hausa is likely to have multiple words for b., so despite the strange distribution, machine translators could have learned it from an actual Hausa source. It would make even more sense if they have peacock butterflies (the European one is Aglais io, it is ubiquitous and is called павлиний глаз “peacock eye” in Russian. As a child I also knew капустница “cabbageress” and лимонница “lemoness” called so for their colour).

    malam buɗa mana littafi, malam batata says Newman. ( “mālàm bātātā m [d.v.]” , dialectal variant, “mālàm bū̀ɗā manà littāfī̀ m Butterfly (= mālàm bū̀ɗe littāfī̀)”)

  68. Trond Engen says

    Hausa malam bude littafi “Mister Open-the-Book” looks like a contamination of malam buɗe ido “Mister Open-the-Eye” with Eng. butterfly.

  69. Another butterfly list (2658 disordered entries):

  70. Stu Clayton says

    DM was probably thinking of “yelp” rather than “whelp”.

    I merely corrected an incorrect translation. I do not speculate about what anyone thinks. In general, I try to avoid such unsolicited extravagance, instead preferring to deal with the words excreted.

  71. David Eddyshaw says

    most languages, even closely related ones, have entirely different words for ‘butterfly’

    The Western Oti-Volta languages come out as practically all the same language on the basis of Swadesh 100 comparisons, which, to be honest, tells you more about the limitations of Swadesh-list comparison than about the WOV languages. Still, they are undoubtedly all closely related, and they do illustrate the point:

    The Mampruli word for “butterfly” is pipibga, which cannot be cognate with the Kusaal pisiŋpiʋŋ, but the Dagbani word is pahimpiɛɣu, which matches the Kusaal pretty well (Dagbani intervocalic -h- comes from *s, and see on the ɣ versus ŋ correspondence below.) However, Mampruli and Dagbani are much closer to each other in general than either is to Kusaal, to the point of almost being dialects of one language; it’s Mampruli which is geographically closer to Kusaal.

    Mooré has pilimpiuku, which is the same kind of formation as the Kusaal and Dagbani but with a different reduplication prefix (the same one as in Kusaal silinsiung “spider.”) The -k- actually works for the stem to be cognate with that of the Kusaal pisiŋpiʋŋ: the noun class suffix in both cases is -gʊ-, and the Mooré form just shows the expected sandhi change *gg -> k, reflecting a stem *piig-, but Kusaal has a peculiar rule *gg -> ŋ after long vowels. In the Dagbani pahimpiɛɣu, the ɣ /g/ would be the expected outcome of *gg too: Dagbani is peculiar in not devoicing *bb *dd *gg like most of WOV but just simplifying them to b d g.

    So you can probably set up a Proto-WOV stem *piig- “butterfly”, though it won’t give you all the current forms without further ado, because the prefix strategies adopted by different languages vary.

    Farefare has kõmpilgo, just to be awkward.
    The Ali-Grimm-Bodomo Dagaare dictionary has no word for “butterfly”, alas …

  72. David Eddyshaw says

    I may have spoken too soon in saying that Mampruli pipibga could not be cognate with Kusaal pisiŋpiʋŋ; Mampruli seems to have its own peculiar strategy for dealing with *gg following a long stem vowel:

    The word for “false kapok” (Bombax buonopozense) in Mooré is voaaka, plural voogse, i.e. it has the stem voog-. This can be confirmed as the original stem by non-WOV cognates, e.g. Waama fɔkibu, Nawdm voogb (the -b(u) is the Oti-Volta “tree” noun-class sg suffix; in all of WOV except Boulba, trees have got transferred en masse to the ga/sɪ noun class instead.)

    Kusaal has done its *gg -> ŋ thing: vuoŋ, and then confused the issue by inventing a new plural by analogy with m-stems: vuomis. But Mampruli has voobga, plural voobsi

    However, the Mampruli for “butterfly” is pipibga, not *pipiibga

    Be that as it may, I seem to have demonstrated the opposite of what I said I was going to; most WOV words for “butterfly” probably are related to one another. At any rate, they are not “entirely different.”

  73. J.W. Brewer says

    @Stu: How a lot of Americans of my generation and a bit older were introduced to the Heideggerian notion of Geworfenheit while listening to the radio: https://en.wikipedia.org/wiki/Riders_on_the_Storm#Heidegger's_influence

  74. Stu Clayton says

    @JW: Only the notion of “being thrown” is chewed over there, even in the Critchley piece. But animals are not thrown into the world. I find all those commentaries pretty underwhelping.

    I don’t know what to make of this:

    # The connection between the thrownness into the world and a dog’s life was anticipated by the anti-Heideggerian author Ernst Bloch in his main work The Principle of Hope (1954–9). #

    Heidegger does not “connect” Geworfenheit with “a dog’s life”. He explicitly excludes non-human animals from the delights of Dasein. Being German himself, Bloch is quite naturally aware of Heidegger’s puns. But he could not anticipate them, since Das Prinzip Hoffnung was published over 20 years after Sein und Zeit (1927).

    At any rate, nowadays many dogs live the life of Riley, of whom H. would have disapproved. I could only speculate what Jim Morrison might have thought about dogs, but I don’t expect it would cause his fame to incur blame.

  75. J.W. Brewer says

    @Stu: I’m thinking the writer meant that Bloch “anticipated” Morrison’s lyrics, not that he anticipated Heidegger …

  76. Stu Clayton says

    Oh. But how clever of Bloch ! And yet his works are still a trial to read. He should have formed a band.

  77. Stu Clayton says

    @JW: My harmless reply has just gone into moderation [and a few minutes later resurfaced, as immoderate as before].

    If this is all about Morrison and dogs, why have these journalists dragged in Heididdlediddle and Blockhead ? But thanks for that blast from the past.

  78. ktschwarz says

    DE: agor: Thanks, another illusion shattered!

    reversive rather than negative: Yeah, I was fudging it as an excuse to ask about agor. Thanks for the elucidation.

  79. reduplication prefix
    @DE, is it K. M. -lin-/-lim- and K. D. -siŋ-/him? Does the stem jump over the prefix: CV-pref.-CVVC?

  80. David Eddyshaw says

    Cheating by cutting-and-pasting from my grammar, with much pruning, and transposed into the standard orthography:

    Prefixes precede many nominal-stem roots. Most have no identifiable meaning, though prefixes are commoner in certain semantic fields (e.g. insects.)

    Most prefixes are CV(n) CVsin or CVlin, where V is a or i; n adopts the position of root-initial C. Ci -> Cʋ before rounded root vowels unless C is t or s, and before all back root vowels when C is labial or labiovelar.

    CVsin/CVlin prefixes copy root-initial C:

    silinsiung “spider”
    vʋlinvuunl “mason wasp”
    zilinziog “unknown”
    tasintal “palm of hand”
    wasinwal “gall” (on tree)
    kpisiŋkpil “fist”
    nɛsinnɛog “centipede”

    Ci(n) copies root-initial C; with voiced obstruents, only Cin occurs:

    kikaŋ “fig tree”
    kʋkɔr “voice”
    kpʋkparig “palm tree”
    kpikpin “merchant”
    tita’ar “big”
    pipirig “desert”
    sisi’em “wind”
    fʋfʋm “envy; stye”
    lilaaliŋ “swallow”
    mimiilim “sweetness”

    kiŋkaŋ “fig”
    tintɔnrig “mole” (animal)
    (nɔb)pʋmpauŋ “foot”
    sinsaan “kind of tiny ant”
    dindɛog “chameleon”
    dunduug “cobra”
    bimbim “altar”
    bʋmbarig “ant”
    gʋŋgʋm “kapok material”
    zɩnzauŋ “bat”
    zʋnzɔŋ “blind”

    [Continued p94. There are other prefixes as well which don’t involve reduplication.]
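    A minimal sketch of the Ci(n) rules as worded above, for readers who want to try them on the listed forms. The segment classes (which letters count as rounded, back, labial) and the function names are assumptions made for illustration, not part of the grammar being quoted; whether a given root takes Ci or Cin is treated as lexical and passed in as a flag.

```python
# Sketch of the Ci(n) reduplication prefix, following the wording of the comment.
# All segment classes below are illustrative assumptions inferred from the examples.
DIGRAPHS = ("kp", "gb", "ŋm")            # labiovelars written with two letters
VOWELS = set("aeiouɛɔɩʋ")
ROUNDED = set("uoɔʋ")
BACK = ROUNDED | {"a"}                    # assumption: a patterns as a back vowel
LABIAL = {"p", "b", "m", "f", "v", "kp", "gb", "ŋm"}  # labials and labiovelars
NASAL_M = {"p", "b", "m"}                 # n is written m before these
NASAL_NG = {"k", "g", "kp", "gb", "ŋ", "ŋm"}  # n is written ŋ before these


def split_onset(root):
    """Return (root-initial consonant, first vowel letter) of a written root."""
    onset = next((d for d in DIGRAPHS if root.startswith(d)), root[0])
    vowel = next(ch for ch in root[len(onset):] if ch in VOWELS)
    return onset, vowel


def prefix_vowel(onset, vowel):
    """Ci -> Cʋ before rounded vowels (unless C is t or s), and before
    back vowels when C is labial or labiovelar; otherwise i."""
    if vowel in ROUNDED and onset not in {"t", "s"}:
        return "ʋ"
    if vowel in BACK and onset in LABIAL:
        return "ʋ"
    return "i"


def nasal_for(onset):
    """Written nasal sharing the place of articulation of the root-initial C."""
    if onset in NASAL_M:
        return "m"
    if onset in NASAL_NG:
        return "ŋ"
    return "n"


def ci_prefix(root, nasal=False):
    """Attach a Ci or Cin prefix to a written root."""
    onset, vowel = split_onset(root)
    prefix = onset + prefix_vowel(onset, vowel)
    if nasal:
        prefix += nasal_for(onset)
    return prefix + root


print(ci_prefix("kɔr"))                  # kʋkɔr
print(ci_prefix("barig", nasal=True))    # bʋmbarig
```

    The sketch always spells the prefix vowel as i or ʋ, so forms written with u or ɩ in the list (dunduug, zɩnzauŋ) would need an extra spelling step that is not modelled here.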

  81. @DE, thank you! “; n adopts the position of root-initial C.” what do you mean here?

    Meanwhile the messy (it has words for the swimming style and for flirting*) panlinx list has:

    Hausa balebale – bude littafi – buɗe littafi – littafin Allah – malam batata – malam bibi – malam bu’de littafi – malam buɗe littafi – malam-buɗa-littafi – malan didi – malan wutsiwutsi – mallam-buɗe-littafi – máalàm bùuɗè líttáafíi
    Pulaar lilldeh – mbéduː alla
    Fulfulde bedelallah – bidilallah – lilldeh – palapala – pucharlar

    *I do not understand why flirting.

  82. David Eddyshaw says

    Position of articulation: thus mb ŋk and so on.

  83. @DE, aha, thank you. It occurred to me, but then I saw vʋlinvuunl and decided to ask.

  84. David Eddyshaw says

    Yes, good catch: vʋlinvuunl is actually [vuliɱvũ:l].

    In fact, thank you, drasvi; copying that out made me realise that I’d missed a trick in my description. The actual standard orthography for that word is vulinvuunl, not vʋlinvuunl; I wrote it with ʋ in the orthography of my own grammar because prefixes actually don’t have ATR distinctions but just align with the root vowel (and non-root vowels in general just have a basic three-way /a ɪ ʊ/ distinction.) However, that entails explaining specifically in the phonology section that written ʋ is realised [u] in that context rather than [ʊ], and you can do the job much more straightforwardly by saying that reduplication-prefixes copy the initial CV of the root, and not just C (you can actually see this principle at work in the examples I gave of CVsin/CVlin prefixes.) I just realised that you can actually pull a similar trick with CV and CVn prefixes, just adding a rule about how low vowels get copied as high. I do like simplification without loss of any descriptive power …
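    The simplified statement (copy the root-initial CV, then add sin or lin with the nasal taking the root-initial consonant's position of articulation) can be sketched roughly as follows. This is only an illustration over written forms: the function names, the digraph list and the nasal classes are assumptions, and the choice between si and li is treated as lexical and passed in.

```python
# Sketch of CVsin/CVlin prefixes that copy the initial CV of the root.
# Segment classes are illustrative assumptions, not taken from the grammar.
DIGRAPHS = ("kp", "gb", "ŋm")            # consonants written with two letters
VOWELS = set("aeiouɛɔɩʋ")
NASAL_M = {"p", "b", "m"}                 # n is written m before these
NASAL_NG = {"k", "g", "kp", "gb", "ŋ", "ŋm"}  # n is written ŋ before these


def copy_cv(root):
    """Return the root-initial consonant and the vowel letter right after it."""
    onset = next((d for d in DIGRAPHS if root.startswith(d)), root[0])
    vowel = root[len(onset)]
    if vowel not in VOWELS:
        raise ValueError(f"expected a vowel after {onset!r} in {root!r}")
    return onset, vowel


def nasal_for(onset):
    """Written nasal sharing the place of articulation of the root-initial C."""
    if onset in NASAL_M:
        return "m"
    if onset in NASAL_NG:
        return "ŋ"
    return "n"


def cvsin_prefix(root, linker):
    """Build CV + linker + nasal + root, where linker is 'si' or 'li'."""
    onset, vowel = copy_cv(root)
    return onset + vowel + linker + nasal_for(onset) + root


print(cvsin_prefix("siung", "li"))   # silinsiung
print(cvsin_prefix("kpil", "si"))    # kpisiŋkpil
```

    Copying the vowel as written gives vulinvuunl directly, which is the standard-orthography spelling mentioned above, with no separate statement about how ʋ is realised in prefixes.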

  85. David Eddyshaw says

    As you’re collecting, drasvi: the Kasem word for “butterfly” is campulu.

    It’s a bit reminiscent of Farefare kõmpilgo (where -go is a class suffix), I suppose. The languages are neighbours, though by no means closely related. Kasem is a Gurunsi language; for reasons that are quite beyond me, the group seems to be named after the self-designation (Gurensi) of the Farefare, whose language is not a Gurunsi language …

    The Moba word is pinpilunŋ, which fits nicely with the Proto-Worldish pilipala etc at any rate; Nawdm, as often, goes its own sweet way, with kpaŋkpaanga. Waama has caŋampɛtɛma. No idea …

  86. David Eddyshaw says

    … and the Mbèlimè word is penpiehṵ (-hṵ being a class suffix), which looks straightforwardly cognate with the WOV *piig- forms (though with yet another prefix type.) The “false kapok” word in Mbèlimè is fɔɔbu (from *voogbu), with a similar lenition of *g between a long vowel and a consonant; likewise Mbèlimè naafɛ “cow”, plural naahḭ, cognate with Kusaal naaf *naagfʊ, plural niigi.

  87. Just for comparison purposes:

    Finnish sandhi is extremely frequent, appearing between many words and morphemes, in formal standard language and in everyday spoken language. In most registers, it is never written down; only dialectal transcriptions preserve it, the rest settling for a morphemic notation. There are two processes.


  88. David Marjanović says

    DM was probably thinking of “yelp” rather than “whelp”.

    Yes, but to the extent that I had actually learned whelp wrong.

  89. David Eddyshaw says

    Finnish sandhi is extremely frequent

    WOV has gemination of /b d g/ after short root vowels, though the patterns have been much altered by levelling within paradigms, and in at least some cases the historical origin of the process was really assimilation of a lost consonant following the root vowel to a following suffix-initial consonant.

    Internal sandhi is really the only thing that makes Kusaal flexional morphology at all complicated: the underlying flexional system of nominals and verbs is pretty simple (especially verbs, where there has been a lot of levelling of a system which was straightforward enough to start with.)

    On the other hand, internal sandhi is quite good at making things more difficult; all of these words are perfectly regular and belong to the same (ga/sɪ) noun class:

    bʋʋg “goat”, plural bʋʋs
    baa “dog”, plural baas
    sabua “girlfriend” plural sabuos
    nua “hen”, plural nɔɔs
    da’a “market”, plural da’as
    zak “compound”, plural za’as
    kʋk “chair”, plural kʋgʋs
    bʋŋ “donkey”, plural bʋmis
    tɛŋ “land”, plural tɛɛns

  90. Areal patterns and colexifications of colour terms in the languages of Africa by Segerer and Vanhove: https://halshs.archives-ouvertes.fr/halshs-03483348/ It (e.g. the table on p 22) illustrates what one wants to be able to see in DBs like Clics and RefLex. That is, click somewhere and make it show you a map of regions where language is tongue and regions where language is mouth.

  91. Must be not difficult to implement though.

    A script that draws a map for you, and a language letting you formulate requests in the form: “show me languages which…”. A language that can formulate isoglosses.
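    A rough sketch of the query half of that idea, assuming the data is just a mapping from language to concept to form. The lexicon below is a toy: the French and Russian entries are ordinary dictionary words, the Kusaal forms are the ones quoted in this thread, and "X-ian" is an invented placeholder so that the "mouth" group is not empty. The function names are made up for illustration and do not correspond to any real Clics or RefLex interface; drawing the map is left out.

```python
from collections import defaultdict

# Toy lexicon: language -> {concept: form}. "X-ian" is hypothetical filler data.
LEXICON = {
    "French": {"language": "langue", "tongue": "langue", "mouth": "bouche"},
    "Russian": {"language": "язык", "tongue": "язык", "mouth": "рот"},
    "Kusaal": {"language": "pian’ad", "mouth": "nɔɔr"},
    "X-ian (hypothetical)": {"language": "ka", "tongue": "mel", "mouth": "ka"},
}


def colexifies(entry, a, b):
    """True if concepts a and b are both recorded and share one form."""
    return a in entry and b in entry and entry[a] == entry[b]


def isogloss(lexicon, target, candidates):
    """Group languages by which candidate concept shares a form with target."""
    groups = defaultdict(list)
    for lang, entry in lexicon.items():
        matches = [c for c in candidates if colexifies(entry, target, c)]
        key = "+".join(matches) if matches else "neither"
        groups[key].append(lang)
    return dict(groups)


# "Show me languages where 'language' is 'tongue', and where it is 'mouth'."
for group, langs in isogloss(LEXICON, "language", ["tongue", "mouth"]).items():
    print(group, langs)
```

    Each group could then be handed to whatever mapping tool is available, one colour per group.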

  92. David Eddyshaw says

    In Kusaal, “language” is neither “tongue” nor “mouth”: it’s pian’ad, literally “words.”

    Kusaal nɔɔr “mouth”, apart from its literal sense, means “command”, as in nɔdi’es “command-receiver”, which is the Kusaal version of the Akan okyeame “linguist”:


    Linguists are important people in Ghana (a kind of minor nobility, in fact.) We have much to learn from these ancient cultures …

    Incidentally, the stem of nɔɔr “mouth” is reconstructable all the way back to Proto-Volta-Congo (cf Proto-Bantu -nùà.)

  93. So about how many roots are reconstructable for Proto-Volta-Congo?

  94. @DE, this is something I do not understand about peoples unaffected by the modern-type culture (because I never dealt with such people).

    Do they have a concept of word?

    If anyone can recommend some reading about this, I would be absolutely grateful.

  95. David Eddyshaw says

    So about how many roots are reconstructable for Proto-Volta-Congo?

    Not a huge number (so far), in my opinion, but I’m more into rigour than some. I will also have missed some that happen to be absent in Oti-Volta, or indeed happen to be absent in Western Oti-Volta specifically, because of where I’m starting from.

    Off the top of my head:

    All the personal pronouns except 2nd sg.
    Numbers “two”, “three”, “four”, and possibly “five” (but this is a bit iffy: numbers are very borrowable, and even Manding shares these, despite pretty certainly not being related to Volta-Congo.)
    The verbs “eat”, “drink”, “bite”, “work/send” (a consistent overlap; it occurs in Chadic too), “die”, possibly “sit” …
    The nouns “tree”, “mouth”, “child”, “bone”, “ear”, “dog”, “goat”, possibly “cow” (but that seems to be confined to West Africa) … probably “tooth”, and possibly “water” and “eye” …

    Others will occur to me as soon as the edit window closes …

    The noun class systems of Oti-Volta and Bantu show some definite cognates among the class affixes, but really only four or so of the ten or so “genders” found in typical languages look definitely related; there seems to have been a lot of innovation on both sides.

    The verb conjugation system defies reconstruction, but then Manessy basically gave up even on reconstructing the system for Proto-Oti-Volta alone (I’m more hopeful, but a lot of work needs to be done yet.)

  96. David Eddyshaw says

    Do they have a concept of word?

    Although nowadays piaunk (the singular of pian’ad) would be equated with English “word” by bilinguals, I think that fundamentally it just means “utterance, instance of speech.” It’s a (somewhat irregular) derivative of the verb pian’ “speak; praise.” It certainly doesn’t in itself have any of the fairly technical senses associated with English “word” (and with literacy, of course.)

    As it happens, word division in Kusaal is a real can of worms, because of (a) the effects of deletion of final short vowels in most contexts, which has left some (perfectly real) words with no segmental form at all, and others as just single consonants with no vowels of their own; and (b) the characteristic Oti-Volta pervasive use of noun compounding where lesser languages use phrases composed of free words. The word division conventions in the standard orthography are reasonably consistent in themselves but don’t actually reflect the real structure of the language very well at all.

  97. John Cowan says

    I had actually learned whelp wrong

    Wikt says the German equivalent is Welpe < 18C Low German (displacing native Welf), and gives this example: “Nun, eines guten Tages bringt die Hündin sechs oder acht Welpen auf die Welt”, from a 1978 translation of Halldór Laxness’s memoir Í túninu heima. We are also told that whelp applies not only to young canids (where it has mostly been displaced by puppy < Fr. poupée), but also to ursid, felid (excluding domestic cats) and pinniped offspring. The North Germanic forms are variously (west to east) hvolpur, hvølpur, kvelp, hvalp, valp, valp; note that despite the archaizing spellings of the Icelandic and Faroese words, they are pronounced with /kv-/, as you’d expect for West Scandinavian. The Elfdalian form is wep.

  98. David Eddyshaw says

    Others will occur to me as soon as the edit window closes …

    “Arm”, certainly.
    Possibly “meat”, but that one seems to be a real Wanderwort (even turns up in Hausa …)
    Probably “knee”, but the Oti-Volta root has a final m that doesn’t turn up in Bantu. Tones match, though, which always helps the plausibility along.
    “Tongue.” How could I forget “tongue”? (Of course, if I was right in suggesting it was phonaesthetic, it Doesn’t Count …)
    “Cook” (in Meeussen’s reconstructions of Proto-Bantu, though I can’t trace where he got it from. It doesn’t seem to be common in yer actual Bantu languages.)

  99. ktschwarz says

    piaunk … just means “utterance, instance of speech.”

    Which is also the oldest sense of “word” in English. Or at least, the first sense listed in the OED; a lot of the senses go back to Old English, so I don’t know if they can really be ordered.

    The concept of “word” in a pre-literate society is a plot point in Ted Chiang’s “The Truth of Fact, the Truth of Feeling” (a story that Chiang says was inspired by reading Walter Ong). The missionary admits he doesn’t know how to explain what a word is, but the student eventually forms his own understanding: “The sounds a person made while speaking were as smooth and unbroken as the hide of a goat’s leg, but the words were like the bones underneath the meat, and the space between them was the joint where you’d cut if you wanted to separate it into pieces.” I’ve been wondering what Hat thought of this story.

  100. David Eddyshaw says

    Others will occur to me as soon as the edit window closes …

    “Person”, probably, though it doesn’t seem to have made it to Bantu, and might just be Gur-Adamawa/Savannas/Whatever.

    It would help if I knew more about Benue-Congo apart from Bantu.

    The Fulfulde word does look pretty similar, an awkward fact I propose to ignore on account of being a card-carrying Atlantic-sceptic. Obviously a mere coincidence.

  101. @DE, one metaphor that I miss in English is that of a spherical horse in vacuum (cf. Spherical cow – I think in Russian we consider them more often…). So speaking about a spherical oral (or just different even if written) culture, at least there can be a concept of a possible answer to “how do you call this? What’s the word for this?”.

    Also a reasonable expectation would be that a different model of language can be different from ours but not necessarily “less precise/detailed”. Conversely, when a model makes distinctions that we do not notice, it is particularly interesting for us.

    Also to know that people do not parse anything in chunks smaller than “an instance of speaking” I need first to speak to different peoples… Else I am like those 19th century professors who, when proving that aborigines can’t count, would dismiss reports that mention “too large” aboriginal numbers and even refer to aborigines doing multiplication as shameless pseudo-science.

    After all, phonological words, syntactical words etc. exist and are meaningful for linguists.

  102. David Eddyshaw says

    What’s the word for this?

    Even the Latin verbum can mean “saying, sentence” rather than “(single) word”; I would guess that a “single word” meaning in just about any language derives ultimately from the sort of linguistic analysis that inevitably has to occur whenever you decide to write a language down in some way. In fact, mismatches between “linguistic” and “lay” concepts of “word” often go back to peculiarities of the writing system (think of Chinese, for example.)

    There are languages out there in which it’s actually ungrammatical to use (say) a noun by itself as a complete utterance: in reply to “What do you call this in X-ian?” you’d need to say the equivalent of “It’s Y.” There are even more languages in which citing a finite verb form by itself is simply ungrammatical (Mooré is one.) Even in many more familiar languages, it’s an abnormal, marked kind of discourse to do that. The sort of thing they have to teach in school …

    My own feeling is that the Construction Grammar people are basically right on a philosophical level (I’m not too sold on the viability of Construction Grammar as a practical tool, but that’s a different matter.)


    It’s “constructions all the way down”; they are the real building blocks of language. “Word” in the familiar modern English sense arises by abstraction across constructions, and the abstraction sometimes leads to counterintuitive results, as witness the perplexity perfectly competent speakers often experience when confronted with the apparently simple question: “Yes, but what does [this word] actually mean?” In fact the more familiar and common the “word”, the harder the question actually is to answer (because it occurs in a greater range of constructions.)

    phonological words, syntactical words etc. exist and are meaningful for linguists

    Meaningful, sure, but very problematic. There’s a big linguistic literature about this. No neat definitions ever seem to work.

    a different model of language can be different from ours but not necessarily “less precise/detailed”.

    You betcha. No two languages divide up the world the same way. (This is not Sapir-Whorfism: it doesn’t at all mean that you can’t find effectively equivalent ways of saying the same thing, with enough effort.)

  103. “Yes, but what does [this word] actually mean?”

    What does the word “information” mean?

  104. David Eddyshaw says


    The reason lexicography is difficult is because it’s a fundamentally unnatural activity (at least, if you’re doing it right.)

  105. PlasticPaddy says

    @dm 15.05:8.28
    I would guess that you had not forgot Welpe 😚. The use of whelp as a verb in English is untypical (there is also calve, but nothing analogous for other nouns for newborn or young animals that differ markedly from the adult noun, except maybe horses also foal…). So my immediate guess was that you have analysed whelp as whine + yelp.

  106. PlasticPaddy says

    Actually small newborn animal > give birth to small…is more typical in English, i.e., also lamb. I suppose these verbs seem a bit exotic or unnecessary (i.e., why is it necessary to specify the type of offspring; I think German uses a generic “werfen”?) to me.

  107. Now I can translate Russian “neither moos nor calves” to English!

  108. The use of whelp as a verb in English is untypical (there is also calve, but nothing analogous for other nouns for newborn or young animals that differ markedly from the adult noun


    In Kusaal, “language” is neither “tongue” nor “mouth”: it’s pian’ad, literally “words.”



    Compound of 言 (koto, “word”) + 端 (ha, “edge; beginning, end”). The ha changes to ba as an instance of rendaku (連濁).

    The term 言 (koto) was formerly the main term for word in Japanese. This term is also cognate with 事 (koto, “thing, fact, event”). The ha may have been added to differentiate from 事 (koto).

    The 葉 spelling is an example of ateji (当て字).

    (Tokyo) ことば​ [kòtóbáꜜ] (Odaka – [3])[2]
    IPA(key): [ko̞to̞ba̠]


    言葉 • (kotoba)

    1. a word, a term

    2. language, speech


    Welcome to the JLect Japonic Language and Dialect Database and Dictionary. Use the search bar above to look up a term in the various languages and dialects of Japan. From Okinawa, to Kansai, to Hokkaido: discover a world of words, meanings, and etymologies.

  109. John Cowan says

    In Kusaal, “language” is neither “tongue” nor “mouth”: it’s pian’ad, literally “words.”

    So the Kusaasi believe in the big-bag-of-words theory, then?

  110. David Eddyshaw says

    Piaunk is really “utterance.”

    So they believe in parole rather than langue, and quite right too. Langue is imaginary. (Sorry, Noam. But you know it’s true …)

  111. ktschwarz says

    DE: I would guess that a “single word” meaning in just about any language derives ultimately from the sort of linguistic analysis that inevitably has to occur whenever you decide to write a language down in some way.

    How inevitable is it? Alphabetic writing doesn’t necessarily require any marking of word divisions; Latin-alphabet languages took centuries to get around to putting spaces between words, and Thai and Burmese still don’t. So I don’t see how the alphabet itself produced the concept of words.

    And (per Bathrobe’s comment on that post) different writing systems led to different analyses, as in Chinese, where zì doesn’t correspond to our “word”, and they didn’t have a word for our “word” until Western linguists told them they needed one.

    On the other hand, before there is writing there is poetry, and poetry is organized into lines, which (it seems to me) implies a concept of some unit of language that can’t break across lines.

  112. David Eddyshaw says

    Fair point about spaces, but I did think about alphabetic writing: even there, the effort of writing leads you to frequently pause unnaturally in a way that you wouldn’t do in normal speech. It breaks the flow.

    And you end up, for example, not making external sandhi changes (unless you’re an ancient Indian linguistic genius who has finally decided to write down all the long Sanskrit sutras he’s got memorised.)

    I certainly wouldn’t claim that our technical concepts of “wordhood” are completely alien to all natural preliterate ways of thinking about language, though. And some languages lend themselves to being diced up into “words” more readily than others, depending on things like the degree to which (in linguistic terms) “phonological words” happen to coincide with “syntactic words” in the language in question (which varies a lot, as all Hatters know.)

  113. Stu Clayton says

    I can imagine that contradiction stood midwife at the birth of words.

    A says something, B says it ain’t so. Either A knocks B over the head, or tries to find out where the resistance is coming from. A thinks more closely about what she said, tries leaving out parts until she finds a residue that B accepts. What she leaves out is dubbed a phrase or word – the smallest unit of meaning until (much later) morphemes were invented.

    Words are the parts of speech that can be replaced in order to conquer resistance, or offer it.

  114. David Marjanović says

    why is it necessary to specify the type of offspring; I think German uses a generic “werfen”?

    No idea if they’re in actual use by farmers, but kalben (though mostly done by glaciers nowadays) and lammen exist. I think that’s the complete list, though; I’m pretty sure I knew the noun whelp, but wouldn’t have guessed that the verb meant “giving birth to puppies”.

  115. Stu Clayton says

    Einmal gebockt ist nicht gelammt. As I have had occasion to remark here before now.

  116. Lars Mathiesen says

    Søer still farer. Cp farrow and Ferkel. Also kælve (including gletsjere), fole and læmme. I found a posting on a Danish peever site and those seem to be the ones there are, other domestic animals får/føder unger without unduly stirring their humours. (Or kid, killinger, hvalpe when there is a specific (ha!) word).

  117. @DM: Lars’s comment reminds me that ferkeln also exists (“to farrow”), although most speakers rather use it in its figurative meaning of “making a mess, making things dirty”.

  118. cub


    cub (third-person singular simple present cubs, present participle cubbing, simple past and past participle cubbed)

    1. To give birth to cubs
    2. To hunt fox cubs
    3. (obsolete) To shut up or confine.

  119. From The Moral Statistics of Glasgow in 1863, Practically Applied, by A Sabbath School Teacher (Porteous & Hislop, 1864), p. 299:

    When the tigress cubs a lamb, when the vulture breeds a dove, then we’ll dream that man can love, his brother man of every clime.

  120. Trond Engen says

    Norw. (h)valpe, lamme, kalve, følle “give birth” (of dogs, sheep, cows, horses resp.). I’m not aware of a specific word for having piglets or (goat) kids. I’ve met felle “give birth” (of animals in general) but have no idea if it’s widespread.

  121. Same in Russian, but for animals whose cubs/kids do not have a dedicated name we have generalized the root кот- “cat”…

  122. Thoughts on the Universal Dependencies proposal for Japanese
    The problem of the word as a linguistic unit


  123. Stu Clayton says

    Link bug.

  124. Link bugs and butter flies keep company on the breakfast plate.

    Anyway this is what juha was aiming at.

  125. I had in my “PYRALIDID” entry (appendix A) the PR Greek, the PPL Latin, the Malay PPL and the Nahuatl PPL terms for butterfly. All should be influenced by Hebrew PaR PaR

    This page made me remember the word PYRALIDID.

    WP says: “The Pyralidae, commonly called pyralid moths,[2] snout moths or grass moths,[3] are a family of Lepidoptera in the ditrysian superfamily Pyraloidea.”
    WP says: “The Ditrysia are a natural group or clade of insects in the lepidopteran order containing both butterflies and moths. They are so named because the female has two distinct sexual openings: one for mating, and the other for laying eggs (in contrast to the Monotrysia).”

  126. His method makes sense…Semitic is conservative, right? Arabic, Akkadian, same thing. Hebrew too. Nothing changed. And all those Chadics Berbers, Egyptians… they are just deviants. The Tower of Babel must have happened within this period of time…. So we can explain Tai-Kadai!

    Is the professor’s method better or worse at explaining Hebrew / Tai-Kadai? Than AA etymologies at least? It is all par par….

  127. David Eddyshaw says

    Which professor, drasvi? I’ve lost track …

    The idea that language complexity (particularly in morphology) must represent historical conservativism is far from dead. It presumably arose in the context of Indo-European, with the notion that our modern languages have all degenerated from wonderful Sanskritoid predecessors (which, as a historical accident, happens to be fairly true, thus compounding the confusion.) As I complain at every opportunity, the idea still distorts thinking about Niger-Congo.

    Of course, by this logic, (Classical) Arabic (certainly not Hebrew) just has to be effectively identical with the primaeval AA language, if not the original language of all humanity. (An argument that I believe has actually been made, and not just on evident religious grounds.)

  128. Isaac Mozeson🙂

    Perhaps he is not a professor…

  129. Of course, by this logic, (Classical) Arabic (certainly not Hebrew) just has to be effectively identical with the primaeval AA language, if not the original language of all humanity.

    I think Arabic won’t work for religious reasons (though Semantic Scholar just offered me: The Characteristics of the Letter of Dād and the Miracle of Al-Qur’an: “…This paper elaborates some specificities and uniqueness of Arabic letter Dād that no other languages in this world bear similarities. …”).

  130. David Eddyshaw says

    Isaac Mozeson

    Ah. I should read the OP. The mind-boggling lunatic …

    Arabic won’t work for religious reasons

    On the contrary, I believe that the orthodox Sunni view of the Qur’an is that it is uncreated and eternal, and it is of course in Arabic. So that seems to establish a prior claim …


    I seem to recall that there were also mediaeval (Arabic-speaking) Jewish scholars who agreed that Arabic was closer to the original speech, though.

  131. But the title: “The Origin Of Speeches: Intelligent Design in Language”

    Всем сёстрам по серьгам (“earrings for each sister”). It is when you say what you think about everyone.

  132. “To all sisters by/across earrings” actually.

    po has a distributive meaning.
    to give each-DAT po apple-DAT means there are 5 people in the room and you extracted 5 apples from your pocket and gave one apple to every person.

  133. David Eddyshaw says

    Intelligent Design in Language

    He should get together with Noam Chomsky. They seem to share a fundamental misunderstanding about the nature of human languages (though ANC merely supposes that Language is Optimal.)

  134. @DE, I was browsing Stolbova’s Chadic database (vol. 1 on archive). I believe it is meant to be used as a database of comparanda, she grouped them by reconstructed roots and presumably she reconstructed those roots the best she can.

    I have very vague understanding of AA reconstructions and mostly my understanding is that not much is understood, but it seems many do believe that Semitic (or else Semitic-Egyptian-Berber) preserves lots of stuff better (lexical material and not only). The method “Take some Arabic root and compare it to a Chadic word that starts from the same letter” is not uncommon.

    But that’s quite similar to Mozeson’s approach….

  135. David Eddyshaw says

    The reconstruction of the Proto-Afroasiatic lexicon is in a terrible state, with the two major published dictionaries agreeing on very little and each riddled with serious methodological errors.

    The idea that Semitic preserves the original state of affairs best is an artefact of Eurocentrism, the long documented history of the group, and the much less extensive documentation of the Chadic and Cushitic languages. In fact, there is good reason (for example) even from Semitic alone to suggest that its wonderful algebra-neat triliteral root system is a secondary development.

    The basic problem is a systematic failure to accept that with such a time depth it is actually not possible to reconstruct much any more. Too much entropy … Instead, the annoying gaps are repeatedly filled with poorly evidenced conjecture and special pleading.

    However, all this is not similar to Mozeson’s approach. You’re talking about the difference between (admittedly very bad) astronomy and astrology.

  136. David Eddyshaw says

    In fact, reconstructions of Proto-AA typically suffer from all the faults you see in speculative long-range comparative work, and for much the same reasons; the only real difference is that there is enough there that AA seems at least to be actually real, despite it all.

    It amazes me that apparently authoritative figures in comparative work in Africa have seemed so often to have so little grasp of the basics. A case in point is Omotic, over and over found in lists of branches of AA as if there were no doubt on the matter. In fact, there’s no worthwhile evidence that it belongs to AA at all: Rolf Theil shows this pretty conclusively in the paper that Hat linked here:


    Large-scale comparative work in Africa is in much the same state as Indo-European studies would be if the major authorities in the field listed Hungarian and Turkish as Indo-European.

  137. You’re talking about the difference between (admittedly very bad) astronomy and astrology.

    I would not call it “bad”, I understand why it happens. The criticism is deserved, but they are doing their best and it is good that they are doing it.

    “The idea that Semitic preserves the original state of affairs best is an artefact of Eurocentrism, the long documented history of the group, and the much less extensive documentation of the Chadic and Cushitic languages” – I am not sure if it is an idea. Arabic comparisons appear because Arabic is actually conservative and because we do not have proto-Cushitic. But we end up in a place where Adam and Eve speak more or less what Mozeson believes they do:/

  138. “In fact, there is good reason (for example) even from Semitic alone to suggest that its wonderful algebra-neat triliteral root system is a secondary development.” – which makes reconstruction harder.

  139. However, all this is not similar to Mozeson’s approach.

    If a friend of mine wanted to do it in Mozeson’s way, I would not say that “a belief that there was the Babel Tower is absurd” – beliefs can’t make anything “scientific” or not.

    If someone believes in that, and that Adam and Eve spoke Hebrew, she of course can start from there, why not. I would look at her methods.

  140. David Eddyshaw says

    which makes reconstruction harder

    Indeed it does. But just because something is easy, that doesn’t make it true.

    It may in fact make reconstruction quite impossible sometimes. But then, nobody can really say they’re a scientist unless they have the ability to say: “You know what? I just don’t know.”

    Arabic is actually conservative

    Aren’t you assuming your conclusion? I mean, it’s conservative – in some respects, not all – compared with other Semitic languages; but that’s all we really know.

    beliefs can’t make anything “scientific” or not

    Trouble is, if your no doubt impeccably logical system depends on believing in a literal Tower of Babel as a premise, there is no reason for anybody who does not accept your premise to accept any of your conclusions. Most scientists aspire to a wider audience.

    More fundamentally, some beliefs are surely inconsistent with anything recognisable as modern scientific method, because they regard the method itself (involving, as it does, always being prepared to accept that you might be wrong) as unacceptable.

  141. Trond Engen says

    Not relevant to anything, but I got to see Rolf Theil live on stage just this Friday.

  142. David Eddyshaw says

    “Lena er en selverklært språknerd.”

    Hey, Norwegian is easy

  143. John Cowan says

    “Rolf speaks 50 languages, is originally from Eidanger, and after 50 years “inside” he has made a bambling of himself.”

    Do what?

  144. Presumably an inhabitant of Bamble (the second b is silent).

  145. Trond Engen says

    That’s right. After 50 years innafor, he has moved to Bamble.

    Innafor is a local geographic term denoting the area around Oslo. It contrasts with oppafor, bortafor, utafor, nerafor, and — obviously — hjemmafor. The exact referents of the terms were discussed in the show.

  146. John Cowan says

    If GT had said “made a Bambling of himself”, I would have some chance of understanding it; cf. “made a New Yorker of himself”, though the expression is a bit peculiar. But in lower case I took it to be some English-language noun I didn’t know, perhaps meaning ‘idiot’.

    As for “inside”, a 50-year prison sentence would be pretty surprising.

  147. Trond Engen⁹ says

    Yes. Norwegian ‘innafor’ can mean “in prison”, and ‘utafor’ “out of prison” (or at least it could in the presumably dated prisoner’s slang I was exposed to in the formative popular culture of my youth). ‘Utafor’ can also mean “out of it” as in “not functioning at full capacity because of sickness, distress, or whatever”. Lots of fun can be had, and some was in the show.

  148. Trond Engen says

    * Oh, so that’s where the asterisk went when I removed it from the second paragraph of my previous comment.

  149. Trond Engen says

    I forgot to add that innafor can also mean “informally acceptable”. Er det innafor å gå uten å si ha det? “Is it OK to leave without saying good bye?”

  150. David Marjanović says

    On Lameen’s post that started it all, all currently anonymous comments *shaking fist in general direction of Google* except the first are by me, and I still think they’re accurate almost without modifications.

    The basic problem is a systematic failure to accept that with such a time depth it is actually not possible to reconstruct much any more. Too much entropy …

    I don’t mean to defend the existing “reconstructions” – at least one of them really was made by going through a dictionary of Arabic and looking for cognates to every entry! – but… I see no evidence for this conclusion. What is obvious is a stark lack of effort on Chadic – it’s only been a few years that a reconstruction of Proto-Central-Chadic was published, let alone a Proto-Chadic one! – and even more so on Cushitic.

    Uralic showed a century ago that it’s entirely possible to reconstruct through 5000 years in the complete absence of written records older than a few centuries. That kind of work just hasn’t been done on most potential subbranches of AA yet, let alone on the whole thing.

  151. David Eddyshaw says

    Fair enough. Comparative work is a niche sport with African languages, I must admit, even more so than elsewhere.

    However: AA can scarcely be less than 8000 years old, quite probably a fair bit more. It surely isn’t reasonable to expect the kind of plentiful rigorous results that we see in mere youngster groups like Indo-European or Uralic.

    Same applies to Niger-Congo, especially in its more imperial incarnations, including all of “Atlantic” and even Mande. Even its Bantu twiglet (subgroup of a subgroup) can hardly be less than three thousand years old.

    And the lack of progress (it seems to me) is not simply from the lack of workers in the vineyard; it’s that the languages being compared are so very disparate that plausible potential cognates are thin on the ground. You can’t make bricks without straw.

    There’s also been a self-defeating tendency in Niger-Congo work to leap to high-level comparisons before doing the work on lower-level groups adequately (for example, John Stewart’s work comparing Proto-Potou-Akanic with Proto-Bantu and Fulfulde; Stewart, who had a lot of the right ideas and set about his comparative work in a basically reasonable way, evidently had a sore conscience on this point, and spends some time defending his position on it. In vain, if you ask me …)

    [Of course, it doesn’t help that the actual division into low-level groups has itself been quite impressionistic and unrigorous: exhibit A: “Gur.” Stick your noun class affixes at the end instead of the beginning, and be spoken somewhere from western Nigeria to eastern Côte d’Ivoire, and – hey presto! – you’re a Gur language.]

  152. There’s also been a self-defeating tendency in Niger-Congo work to leap to high-level comparisons before doing the work on lower-level groups adequately

    I get the impression this is the case for a lot of high-level comparisons. It’s so tedious to sit around niggling at local comparisons when you want to be up on top, with a view to the horizon…

  153. at least one of them really was made by going through a dictionary of Arabic and looking for cognates to every entry!

    Arabic dictionaries are thick.

  154. “is not simply from the lack of workers in the vineyard;”

    @DE, but people are a necessary condition. To have good reconstructions for minor branches you need people. To have good reconstructions for larger branches you need lots of work (including documentation) done with lesser branches.
    Without those how do you know what we can learn about proto-AA?

  155. David Eddyshaw says

    Uralic showed a century ago that

    In fairness to Chadicists, one might also point out that the Uralicists had a pretty significant start on them in terms of available data. The only Chadic language there were any but meagre data for in those days was Hausa (and people hadn’t even worked out that it was a tone language yet.)

    but people are a necessary condition

    Sure. Lack of interested scholars is a big problem (exacerbated, no doubt, by the Chomskyan devaluing of such work and undercutting of its sheer financial viability by diverting funds and employment into their own academically sterile fantasy world.)

    I’m just saying it’s not the only problem. Some comparative work just is less tractable of itself than other comparative work (though DM might reasonably claim that we don’t actually know yet in this particular instance whether it will turn out to be so.)

  156. “though DM might reasonably claim that we don’t actually know yet in this particular instance whether it will turn out to be so” – Yes, that’s what I meant.

    “Lack of interested scholars is a big problem” – Organize a linguistic olympiad or something for schoolchildren in the region. Identify 200 interested children. Organize a summer school. Call it a university. Convince everyone to work for free… (the imperative does not imply that it is DE who must do this:))

  157. David Eddyshaw says

    Convince everyone to work for free

    I’m glad that the imperative does not imply that it is I who must do this …

    (If I actually possessed this extraordinarily useful persuasive skill … reminds me of the old saying “If you’re so smart, why ain’t you rich?”)

  158. You are doing this (by your personal example).

    Though, seriously, I hope there are career opportunities for a skilled linguist in Cameroon.

  159. David Eddyshaw says

    The main employer of skilled field linguists in Cameroon is almost certainly SIL, and/or local affiliates.

    My own direct experience with SIL is limited to the staff of GILLBT (their indigenised Ghanaian offshoot) in Tamale. They were extremely helpful, so far as they were able (though I think, a bit bemused by me.)

    My impression of SIL’s purely linguistic work is that they have (to generalise wildly) got miles better at it since the dark days when all was Tagmemics* (ugh.) SIL also seem to have learnt from mistakes made in the past regarding cultural sensitivities and suchlike.


    And at least they’re doing fieldwork, as opposed to sitting in comfortable offices meditating on the beauty of Merge.

    * Frankly, everybody has got miles better at it. Modern grammars of out-of-the-way languages tend to be several cuts above similar works of a generation ago.

  160. Am I the only one who has always found,

    You can’t make bricks without straw,

    to be a profoundly weird aphorism? Because, of course, you can make bricks without straw, and people have been doing it since the Stone Age. It is only one particular type of sunbaked mud bricks that includes straw as a major binding ingredient. Other unbaked bricks have used different binders, including many different kinds of plant matter (although straw may have always been the most common). Moreover, modern kiln-fired bricks* generally use no non-mineral binders at all.

    * Neither “kiln bricks,” nor “fire bricks” would be correct here. Those terms refer to the grades of kiln-fired bricks that are used to construct high-temperature enclosures like fireplaces, wood-fired ovens, and kilns themselves.

  161. It works well with any obscure obsolete/rural art. “You can’t shoe a mule without charcoal.” “You can’t make barley malt without dandelions.” “You can’t thatch a roof with only three hands.” Who am I to say otherwise?

  162. i really like that it’s so local and specific! just wonderful how precisely you can place a phrase…

    in yiddish, one of the ways to say “since time immemorial” – a way that i’ve seen at large in the pages of the Morgn Frayhayt, the long-running new york communist daily paper, mind you – is “since the time of jan sobieski”. which means: since the year 1696 of the christian reckoning. it’s delightful!

  163. David Eddyshaw says

  164. @Y: The adage might actually seem more reasonable to me if it were more obscure, like the other jocular ones you listed. However, the scriptural origins of “bricks without straw” are too much in the foreground for the expression to sound like vaguely incomprehensible folk wisdom.

  165. John Cowan says

    one of the ways to say “since time immemorial” […] is “since the time of jan sobieski”. which means: since the year 1696 of the christian reckoning.

    “Time immemorial” actually means “before September 3, 1189 C.E.”, as that was the original beginning of legal memory: if something had been in your family since that date (the coronation of Richard the Lion-hearted) you did not have to prove how it originally came to be there. Per contra, if your family had not been in possession since that date, it didn’t help to prove they had held it earlier. There were certain exceptions: whether land was part of the ancient demesne of the crown (an estate in villeinage held directly of the King/Queen, which had special rights attached to it) could only be established by reference to Domesday Book (1086).

  166. David Marjanović says

    Any chance that “the time of Jan Sobieski” is a reference to the Second Turkish Siege of Vienna, which he ended in 1683?

    In fairness to Chadicists, one might also point out that the Uralicists had a pretty significant start on them in terms of available data.

    That’s a part of my point that I failed to make explicit. The Uralicists have long emphasized data collection to a degree barely dreamt of among IEists – check this out and weep.

  167. David Eddyshaw says

    That part of your point I am in total agreement with. As a well-known linguist* remarked “It is a capital mistake to theorise before one has data.”

    * Conan Doyle curiously neglects Holmes’ ground-breaking monograph on the languages of the Andaman Islands.

  168. J.W. Brewer says

    I find it highly unsurprising that a Yiddish phrase reflecting historical memory in the Pale of Settlement that might be loosely glossed into English as “since time immemorial” does not accurately track the technical meaning of “time immemorial” embedded in the history of England-specific legal jargon. And I do mean England-specific rather than English-specific, because although the U.S. and other Anglophone polities mostly inherited the same legal tradition, the “since 9/3/1189” sense was useless for adjudicating actual real-estate disputes in the New World and thus functionally meaningless in AmEng.

  169. i glossed “biz yan sobyeskis tsaytn” the way i did to invoke english law’s “biz rishard eyntss tsaytn”.

    i don’t know when the phrase began to be used, but i assume it would’ve been sometime after the first partition of poland, when sobieski’s reign would’ve seemed like a lost age of stability. i doubt it had much (if anything) to do with his exploits rescuing vienna – the rzeczpospolita got its good reputation among jews because of its religious tolerance, but the ottomans were a lot more welcoming than the hapsburgs during that whole period (and in retrospect probably a better overall bet than poland-lithuania for avoiding pogroms and blood libels, which the polish clergy kept alive as a bloodsport practically* singlehandedly after pope paul iii’s condemnation).

    * in the late 1700s they did get some help, depressingly enough, from jakob frank and his supporters. that was one of the more sordid parts of the whole frankist saga, which i now know a little too much about thanks to olga tokarczuk’s The Books of Jacob, immediately followed by paweł maciejko’s The Mixed Multitude. i recommend both, though neither will make you happier about the world. (the cameos from casanova and the illuminati might be worth it)

  170. Aren’t you assuming your conclusion? – I see what Akkadian looks like to me… But no, I do not mean that every Arabic root must have proto-Semitic antiquity. I just mean, it is understandable.

    “Trouble is, if your no doubt impeccably logical system depends on believing in a literal Tower of Babel as a premise” – if a system has no descriptive value, it is not science, of course.

    But the same is true for systems based on other beliefs. If your belief affects nothing, it affects nothing. If it affects the direction of search… you risk wasting your time if your premise is false (cf. trying to prove a false conjecture), but if you possess enough curiosity for the data as such, there is a chance you will discover something.

  171. PlasticPaddy says

    Certain inquiries are more compromised by investigator bias than others. In the case of AA protolanguage, it would appear that (1) there is insufficient data for reliable reconstruction of at least some features using the comparative method (2) bias of past (and present) investigators has affected the process of data collection and publication. So it is possible that some consolidation is needed, so that future investigations (by investigators with various and more or less pronounced biases) can proceed on a sound basis.

  172. @PP, I agree. What I meant is that the authors of AA reconstructions are not exactly interested in promoting Arabic.

    We have Bronze Age Semitic and Egyptian. The next level is extrapolation from Arabic specifically… In lexicon, Arabic/Hebrew/Ge’ez are somewhat conservative in form compared to IE. Which does not make forms conservative in meaning or resistant to replacement (Arabic and Akkadian Swadesh lists won’t show many matches).

    Arabic examples are understandable, but yes, there is a problem.

  173. @PP, actually I began from comparing Mozeson (a guy who takes the Tower of Babel very seriously, believes that Hebrew “influenced” Malay and judging by these two facts and the name “Edenics” apparently thinks that Hebrew was spoken in Eden…) to AA reconstruction, so I must be the attacking side.

    But I did not mean to attack, I just was reading Stolbova’s Chadic database and dictionary*, seeing Arabic, remembering AA reconstructions that I read before and thinking: “Now Mozeson makes sense…”.
    Arabophone Eve and Hebrew-speaking Eve are not too different, two girls would be certainly able to agree on something. It was just a humorous observation.

    * https://book.ivran.ru/f/ilovepdfmerged.pdf
    I guess it is obtained from her Chadic database by excluding certain entries.

  174. I wonder, though, what the chances are for a random Arabic root to be proto-Semitic. After all, if you have a word, it is either retention or borrowing or derivation. Now, when a borrowing is from another Semitic language it still can be a Semitic word. When it is derived from another word, but the consonantal root is intact, it is intact.

  175. the Uralicists had a pretty significant start on them in terms of available data. The only Chadic language there were any but meagre data for in those days was Hausa

    As past me in the post linked by DM has kind of pointed out already, it’s not that we “had” data from the get-go; but that the Comparative Finno-Ugric enterprise got a first attempt at a start already in the mid-18th century. And yes, then spent 100 years not going anywhere much due to a dearth of data (though well, also, a dearth of methodology). In some ways it looks similar to what I see reported as being known for PAA: “yeah we know about 50 good cognates and then 50 more speculative, enough to tell it’s a family but not enough to do much more”.

    Then again, why cast out all sights and lures for only the absolute biggest fish? A decent reconstruction of e.g. Proto-Cushitic, if that were to fare better, would already have a lot to tell about the prehistory of the region. (Though then I did take a course Comparative Cushitics a few years ago and it left me with the impression there is no solid reason to think of it as anything else than an areal unit of 4–5 Afrasian subfamilies, or indeed, some non-Afrasian with long-standing contacts, as in the case of Omotic … maybe substitute Proto-East Cushitic as your target instead if you’re of a particularly skeptical bent.)

    As for Semitic, I get the impression that what esp. Arabic sorely needs is rather loanword research — dictionaries thousands of pages thick filled only with allegedly inherited Proto-Semiticisms do not pass a smell test, especially for such an areally major but relatively newcomer language. But the productivity of the triliteral morphology can, surely, easily disguise even recent loans.

    On Omotic, did I mention some of my impressions from Theil’s newer paper yet? … let me go check the Omotic thread and get back to one of these then.

  176. It is the fascinating fact about Arabic that I learned immediately when I became interested in the language: it does not have an etymological dictionary.

    The percentage of loans can be high or low depending on what your neighbours are. Obviously if your only neighbours are related languages, then all your borrowings will be from related languages. English and French have been filled with mostly Indo-European loanwords….

    – intrasemitic loans: this is where Semitic is quite unique: it is hard to recognize a borrowing within the family.

    – borrowings from other languages: I guess, the geographical calculation for a word attested only in one branch of Semitic is different from that for a word attested only in Indic languages, especially if this branch is Arabic rather than Ethiopian. I do not know what this calculation should be. Just different.

    Technically recent loans from known languages are often easy to recognize. Old loans from unknown sources… Are those easy in any language?

  177. David Eddyshaw says

    intrasemitic loans

    There are actually quite a few in Classical Arabic, mostly from Aramaic.

  178. David Eddyshaw says

    why cast out all sights and lures for only the absolute biggest fish?

    Why indeed …

    This question should be embroidered in letters of gold and hung prominently on the wall of the study of everyone doing comparative linguistics of African languages. Or of any languages, I guess …

  179. “There are actually quite a few in Classical Arabic, mostly from Aramaic.”

    Absolutely. It is recent loans from a known language. And it is recognized loans. (It would be interesting to compare lists of Aramaic Aramaic and Aramaic Greek loans …).

    But what are the implications for reconstruction? Detecting words that come from outside the family. Tracing their evolution within Semitic, semantical and formal (for words derived from Semitic material). And then for inherited roots there is a handful of consonants whose reflexes are different.

  180. Identifying even old loans is reasonably easy when your old source happens to be attested, much as Old Norse and Gothic helped kickstart a lot of the Germanic loanword research in Finnic. Topics like Samic loanwords only really came round much later. Which is to say, the harder part is drawing lessons from your known loanword layers to then locate loans from unattested-but-reconstructible sources … Anything from Proto-Modern-South-Arabian, for example? (Does anyone even know anything nontrivial about PMSA yet?)

    Some of the alleged evidence for Nostratic is relatively often centered on South-ish Dravidian + Arabic-and-neighboring Semitic. If I knew the involved languages better, I’d probably want to try if a bunch of those can be recast as a loanword layer in some direction instead.

  181. Proto-Modern sounds like the name of an artistic style. Probably because Art Nouveau/Jugendstil is called “modérn” in Russian.

  182. For proto-MSA, the work of Julien Dufour comes to mind – and Kogan et al of course. But it’s not something I’ve delved into deeply.

  183. David Eddyshaw says

    Identifying even old loans

    There’s a similar problem in Kusaal. Most loans are from Hausa, which is (a) phonotactically very different from Kusaal and (b) extremely well documented itself. So they are easy to spot.

    It’s another matter when it comes to detecting loanwords from other Western Oti-Volta languages. There are some dead giveaways, as with loans that haven’t undergone the characteristic Kusaal loss of final short vowels, or violate Kusaal phonotactics in some way; for example, one of my informants used kiibu “soap” instead of the “real” Kusaal word ki’ib /kɪ̰̃:b/, and the vowel quality and length make it certain that that form is from Mampruli kyiibu; and the word for “Saviour” in the Christian sense, faangid /fã:gɪd/, violates a rule which deletes *g after long low vowels, which seems to have exactly one (itself mysterious) exception in the whole of the rest of the language: it must be a loan from Toende fãagɩt, and/or Mooré fãagda. But most such loans will have simply blended seamlessly into the general population and will be forever undetectable …

  184. Kogan’s chapter Proto-Semitic Lexicon in a Semitic handbook has several pages titled How to detect inter-Semitic loanwords (disappointingly, only a short paragraph for borrowings from elsewhere). “What follows is an attempt at a critical synthesis of Kaufman 1974, 19-22, Leslau 1990, XI-XIV and SED I L-LVII where this very important question has been dealt with in some depth,” it says. Kaufman 1974 is The Akkadian Influences on Aramaic, Leslau 1990 is Arabic Loanwords in Ethiopian Semitic, and SED I is Semitic Etymological Dictionary by Kogan and Militarev. The handbook is on libgen and numerous pdf-sharing sites, just like SED. It is just a brief textbook overview, but as my own understanding of the situation is well below that…

  185. David Eddyshaw says

    ki’ib should be /kɪ̰:b/, sorry.

    You know how it is: you add one diacritic, promising yourself that you know when to stop, that you’re in control – and then, all of a sudden, you find that you’ve added another ….

  186. It is the fascinating fact about Arabic that I learned immediately when I became interested in the language: it does not have an etymological dictionary.

    I complained about this back in 2004; we discussed an abortive project in 2019.

  187. /kɪ̰̃:b/ unexpectedly looks good.

    More balanced than your /kɪ̃:b/ or /kɪ̰:b/…

  188. David Eddyshaw says

    more balanced

    Unfortunately, Kusaal lacks nasal /ɪ̃/, except as a result of loss of *n before /s/ and /f/, and there are no cases with glottal vowels: piif /pɪ̃:f/ “genet” (from *pɪ:nfʊ) is the best I can do.

    On the other hand, if you’re prepared to settle for /i/, I can offer you zin’ig /zḭ̃:g/ “place”?

  189. @LH I recently noticed this:

    Great, thanks! I’ll add it to the earlier post.

  190. David Eddyshaw says

    Piinf /pɪ̃:f/ “genet.” (Of course.) More of this and I shall have to give my Kusaal Spelling Certificate back.

  191. /zḭ̃:g/
    I think it is fine (of course dotless ɪ is more balanced, but…).

    I am now seriously considering the idea that the air of sophistication (or complication…) associated with diacritics is a product of ruined balance. Spaniards balance their question marks (and not diacritics). But exactly ñ is fine because n itself is tildoid…

  192. Stu Clayton says

    Piinf /pɪ̃:f/ “genet.” (Of course.)

    I hope it is not genêt to which you refer. In which case the French Spelling Certificate could be in peril.

  193. David Eddyshaw says

    Oops. Piif “genet”* actually has an oral vowel, despite the loss of *n: /pɪ:f/.

    I have failed you, drasvi. I can only say that I am truly sorry. And ashamed.

    * Yes, “genet”, Stu. Things have come to a pretty pass in England when a man in his own home is expected to put fancy foreign circumflexes on his own words … but actually, it’s this


    (which is genette in cross-Channellese.)

    I wonder where the Arabic زريقاء comes from?

  194. Stu Clayton says

    the air of sophistication (or complication…) associated with diacritics is a product of ruined balance.

    I think you’re onto something ! In Czech, for example, diacritical parsley is sprinkled on lavishly without regard to the underlying glyphs. Don’t get me going on IPA.

  195. Stu Clayton says

    truly sorry. And ashamed.

    That would be gêne, I believe. Another unbalanced circumflex. This is supposed to be the ghost of an “s” that fell by the wayside, as I read many years ago. But it hardly ever is.

    Either my source failed me, or someone was taking the mickey.

    On the bright side, genêt is German Ginster, so it looks like there was an “s” there in previous ages. Not in “broom”, though. No, wait: “besom” !!

  196. David Eddyshaw says

    How about zʋn’ʋf /zʊ̰̃:f/ “dawadawa seed”? I know it’s not the same, but …?

  197. David Eddyshaw says

    genêt is German Ginster

    and genet-the-animal is Ginsterkatzen


  198. Stu Clayton says

    I am reconciled.

  199. /zʊ̰̃:f/

    Wonderful! Is ʊ a pot? A fire beneath the pot and… no, waves (rather than steam) above.

  200. David Eddyshaw says

    The word has high tone, so I suppose you could put on a third diacritic:


    Not very symmetrical, unfortunately. Pity it doesn’t have mid tone.

  201. David Eddyshaw says

    Come to think of it, though, in e.g. pu’a la zʋn’ʋf “the woman’s dawadawa seed” the tone changes to high-low falling:


    and symmetry is restored …

  202. That is a thing of beauty.

  203. David Marjanović says

    But the productivity of the triliteral morphology can, surely, easily disguise even recent loans.

    It must have been Lameen’s blog where I learned that the plural of film is aflām and that of bank is bunūk

    Wonderful! Is ʊ a pot? A fire beneath the pot and… no, waves (rather than steam) above.

    It’s almost a samovar.

  204. For a samovar we need a сапог diacritic.

  205. (not sure if samovar users use a high boot very often, but as the whole construction looks comical, caricature samovar is often depicted with a sapog on it)

  206. I wonder where the Arabic زريقاء comes from?

    A hyperlink to this thread on another thread just brought me here. I will attempt an answer to this question.

    In form, زريقاء zurayqāʾ is the regularly formed diminutive of zarqāʾ, feminine of ʾazraq “blue, grey, greenish (of eyes); bright, shining (as the tips of spears or arrows); clear (of water)”, built to the usual pattern ʾafʿal for adjectives of color and physical characteristics. As a substantive, according to the dictionaries, ʾazraq can mean “leopard” (as “the grey one, the pale one”). On the formation of the diminutive of feminines, see for example W. Wright, A Grammar of the Arabic Language, p. 169, §271 here. So this designation of the genet possibly refers to the flashing of its eyes in the dark—or perhaps it simply means “the little leopard”?

  207. David Eddyshaw says

    “Little leopard” is not a bad name for the genet. That seems very plausible. Thanks again, Xerîb!

  208. A comment by Alon Lischinsky gives us the Guaraní word panambi ‘butterfly.’

  209. David Eddyshaw says

    Looking back over this thread, I realise that I must put things straight regarding the Kusaal word piif “genet.” It hasn’t lost an n in the singular. There never was any n in the singular, Comrade. The plural piini contains a relic of the obsolete plural ending ni (from *ɲi.)

    Apologies for any confusion.

  210. Since I happened to see this thread come up while reading Richard K. Nelson’s 1983 Make Prayers to the Raven: A Koyukon View of the Northern Forest:

    Butterflies and Moths
    The Koyukon name these lovely insects collectively nidinlibidza, “it flutters here and there.”

  211. Lars Mathiesen (he/him/his) says

    Well, that’s Lepidoptera which is an actual clade, so well spotted, that language! (Or maybe Ditrysia, or maybe the point is moot if none of the basal clades are represented in Alaska).

    (Recently I found out that in Danish entomological usage, sommerfugl is Lepidoptera, dagsommerfugl is Rhopalocera, the paraphyletic rest is natsværmere/natsommerfugle (quondam Heterocera) and møl ~ ‘moth’ seems to be anything with a vernacular name ending in -møl, [so extremely paraphyletic–not overlapping with Rhopalocera, it seems, but that’s probably just a coincidence]. FWIW, WP.da møl links to WP.sv Microlepidoptera [not-a-clade], but maybe it should link to Tineoidea [Malfjärilar] — and not to Tineidae [Äkta malar] which is Egentlige møl in Danish but a redlink in WP.da. But in general, what different languages/cultures define as ‘a moth’ seems to vary a lot).

    EDIT: The semantic equation between Da møl and E moth is taught in school here, but the larger moths (the proverbial candle flame ones) would actually be called natsværmere in Danish.
