Harvard Sentences.

The Harvard Sentences are a set of phonetically balanced sentences used for testing audio circuits. If you’ve ever wanted to hear them spoken aloud, the Open Speech Repository has you covered: American English, British English. They also have files in Mandarin, French, and Hindi. (Thanks, Trevor!)


My wife asks interesting questions about words, and the most recent was “Why do we say ‘bedridden’?” I opened my mouth, realized I didn’t have anything useful to say, and turned to the dictionary. The answer is simple but unpredictable, and since others may well be interested, I’m sharing it here. The Online Etymology Dictionary has a good summary:

bedridden (adj.)
also bed-ridden, mid-14c., from adjectival use of late Old English bæddrædæn “bedridden (man),” from bedrid, from Old English bedreda, literally “bedrider, bedridden (man),” from bed + rida “rider” (see ride (v.)). Originally a noun, it became an adjective in Middle English and acquired an -en on the analogy of past participle adjectives from strong verbs such as ride.

So it was originally ‘bed-rider,’ which makes sense, and due to the sort of morphological scrambling languages are subject to, it looks like it means ‘ridden by a bed,’ which doesn’t.

The Barbarian Beard.

Victor Mair has a post at the Log full of the kind of detailed historical/philological investigation I love. A correspondent wrote that “in Chinese the word for beard (胡子) has an archaic root meaning ‘foreign,’” and Mair, who had long “wondered if all of these expressions [húshuō(bādào) 胡说(八道) 'nonsense; ridiculous; bullshit,' húrén 胡人 'barbarian,' húzi 胡子 'beard,' húnào 胡闹 'act wild; be mischievous,' etc.] … had something to do with wild, bearded barbarians from the west,” decided to look into it. He says:

… it begins to get really tricky, because it is possible that certain non-Sinitic peoples to the north and northwest were thought of as hú 胡 because they had hú 胡 (“beards”) and that these hú 胡 folk behaved in a very hú 胡 (“wild; uncontrolled; unruly”) fashion. But this is a semantic and etymological minefield upon which we must tread cautiously.

Some points to consider:

1. The earliest meaning of hú 胡 is generally considered to be “tissue drooping down under the chin of an animal (e.g., dewlap)” — note that the character has a “flesh” radical.

2. By extension, it came to mean “part of a weapon that hangs down”, and this is probably also how the meaning “beard” arose (“the pendulous mass of hair under a man’s chin”).

3. Hú 胡 also developed the meaning of “neck” (the part of an animal behind the thing hanging down) and “broad; large”, which I’ve written about extensively in Victor H. Mair, “Was There a Xià Dynasty?”, Sino-Platonic Papers, 238 (May, 2013), 1-39. See esp. p. 9 where the Old Sinitic reconstruction of hú 胡/鬍 (“beard; bearded person”) is given as *’ga (in Jerry Norman’s spelling system according to David Branner), together with cognates in Tibetan.

There are a bunch more points, some speculation, and an image of “a band of musicians with a dancer on top of a camel’s back.” Check it out.

Jabotinsky’s Hebrew.

I’ve started Halkin’s Jabotinsky: A Life, which is excellent (thanks, Paul!), and I thought this passage on language was worth posting:

Jabotinsky also covered the congress for Odesskaya Novosti, in which he published four long dispatches. The first two dealt with caucuses he attended. One was held by the Mizrachi, the religiously Orthodox Zionist party; struck by its moderateness, he deemed it capable of collaborating with secular Zionists. The other was convened by a Hebraist faction that demanded Hebrew’s adoption as the official language of the Zionist movement and of a future Jewish state. (The congress itself was conducted in German, with delegates free to use Yiddish, Russian, or Hebrew if they wished.) While confessing that he did not understand spoken Hebrew well enough to follow the proceedings, Jabotinsky was impressed by the speakers’ fluency and predicted that their goal would be accomplished in Palestine because Hebrew alone could serve as a lingua franca there; he was also struck by the Sephardic diction used by some of them, which he judged more exact and pleasing than the Ashkenazi pronunciation he was familiar with. The experience spurred him to take up the study of Hebrew again.

A quibble: while the book is in general very well proofread and copyedited, it consistently uses “Odesskaya Novosti” (‘Odessa News’) for what should either be Odesskiya Novosti (representing the prerevolutionary spelling) or Odesskiye Novosti (the modern version); as it is, it matches a feminine singular adjective with a plural noun. Tsk, I say, tsk.

Icelandic: On the Brink?

Patrick Cox has a “World in Words” segment called “Will Icelanders one day ditch their language for English?” Needless to say, Betteridge’s law of headlines applies, but it’s a fun read:

“When I was growing up, very few people spoke English,” says Gnarr. “With my generation, through TV and music it became necessary to understand English.”

Gnarr’s children speak much better English than he does. They have friends all over the world who they converse with on social media.

“But they don’t speak as good Icelandic as I do,” says Gnarr. “It’s a drastic change in a very short time.”

The conclusion is clear: Icelandic, like everything else, is going to hell in a handbasket. And of course there are the purists who “believe that the best chance for survival would be to resist importing words from English, and to hang on to the language’s archaic and complicated grammar.” Good plan, purists! (Hat tip for the link goes to Trevor.)

The Indo-European Controversy: An Interview.

George Walkden at New Books in Language:

Who were the Indo-Europeans? Were they all-conquering heroes? Aggressive patriarchal Kurgan horsemen, sweeping aside the peaceful civilizations of Old Europe? Weed-smoking drug dealers rolling across Eurasia in a cannabis-induced haze? Or slow-moving but inexorable farmers from Anatolia?

These are just some of the many possibilities discussed in the scholarly literature. But in 2012, a New York Times article announced that the problem had been solved, by a team of innovative biologists applying computational tools to language change. In an article published in Science, they claimed to have found decisive support for the Anatolian hypothesis.

In their book, The Indo-European Controversy: Facts and Fallacies in Historical Linguistics (Cambridge University Press, 2015), Asya Pereltsvaig and Martin Lewis make the case that this conclusion is premature, and based on unwarranted assumptions. In this interview, Asya and Martin talk to me about the history of the Indo-European homeland question, the problems they see in the Science article, and the form that a good theory of Indo-European origins needs to take.

At the site you can hear the hour-long interview. Thanks, Trevor!


Back in 2007 I posted about an old Russian epithet for Greeks, пиндос [pindós], that has come to be directed at Americans; in reading Serafimovich (see this post) I’ve run across another one, грекос [grekós], which is obviously straight from Greek γραικός [γrekós]. The ragged elements of the Red Army (with associated sailors, families, and livestock) are making their hungry way south along the Black Sea coast, and when they run across potential supplies they’re not shy about availing themselves of them. They happen on a colony of Greeks: “За то, что это не свои, а грекосы, позабрали всех коз, как ни кричали черноглазые гречанки [Since they weren’t their own kind but grekósy, they grabbed all the goats, however much the dark-eyed Greek women hollered].” The kicker comes a couple of paragraphs later, when they enter a Russian village: “и хоть и жалко было, ну, да ведь свои – и позабрали всех кур, гусей, уток под вой и причитанье баб [and even though they felt sorry for them – after all, they were their own kind – they grabbed all the chickens, geese, and ducks amid the howling and lamentation of the women].” Serafimovich knew humankind pretty well.

He also had my attitude toward landscape. A couple of pages earlier he mentions that the straggling column was passing the remnants of old Circassian villages, and says:

Лет семьдесят назад царское правительство выгнало черкесов в Турцию. С тех пор дремуче заросли тропинки, одичали черкесские сады, на сотни верст распростерлась голодная горная пустыня, жилье зверя.

Seventy years earlier the tsarist government had expelled the Circassians to Turkey. Since then, the backwoods paths had been overgrown, the Circassian gardens had gone wild, for hundreds of versts there spread a hungry mountain wilderness, the abode of beasts.

But when the column relaxes by the shore:

И взрывы такого же солнечно-искрящегося смеха, визг, крики, восклицания, живой человеческий гомон, – берег осмыслился.

And the bursts of such sunny-sparkling laughter, yelping, shouts, exclamations, living human hubbub — the shore was given meaning.

I like scenery as much as the next person, but it is indeed humanity that gives it meaning as far as I’m concerned.

How Not to Use Ngrams.

A good piece by Ted Underwood from his blog The Stone and the Shell (“Using large digital libraries to advance literary history”), How not to do things with words:

In recent weeks, journals published two papers purporting to draw broad cultural inferences from Google’s ngram corpus. [...]

I’m writing this post because systems of academic review and communication are failing us in cases like this, and we need to step up our game. Tools like Google’s ngram viewer have created new opportunities, but also new methodological pitfalls. Humanists are aware of those pitfalls, but I think we need to work a bit harder to get the word out to journalists, and to disciplines like psychology.

The basic methodological problem in both articles is that researchers have used present-day patterns of association to define a wordlist that they then take as an index of the fortunes of some concept (morality, individualism, etc) over historical time. [...]

The fallacy involved here has little to do with hot-button issues of quantification. A basic premise of historicism is that human experience gets divided up in different ways in different eras. [...]

The authors of both articles are dimly aware of this problem, but they imagine that it’s something they can dismiss if they’re just conscientious and careful to choose a good list of words. I don’t blame them; they’re not coming from historical disciplines. But one of the things you learn by working in a historical discipline is that our perspective is often limited by history in ways we are unable to anticipate. So if you want to understand what morality meant in 1900, you have to work to reconstruct that concept; it is not going to be intuitively accessible to you, and it cannot be crowdsourced.

There’s much more at the link, and attention must be paid.

Multistory Profanity.

Many years ago I learned from Edward Topol about the Russian system of classifying mat, or profanity, according to the number of layers, or stories/storeys, it contains, the more elaborate having three or even seven levels; I don’t think I’d ever encountered this system in literary use before, but reading Alexander Serafimovich’s classic of Soviet Civil War literature, «Железный поток» (The Iron Flood, 1924), recommended to me by Sashura back in 2010, I’ve just come across it: “Кожух перестал стрелять и, надсаживаясь, стал выкрикивать трехэтажные матерные ругательства [Kozhukh stopped firing and, straining his voice, began to yell three-story obscene curses].” Reading that phrase was as satisfying to me as I imagine the cursing was to him. (I should add that the sentence I quote is followed by “Это сразу успокоило [That quieted (the mob) at once].”)

Incidentally, the story is an account of the actual march [Russian link] in August-September 1918 of the Taman Army (a branch of the earliest version of the Red Army) south from the Taman Peninsula to escape destruction by White forces, and the dialogue is full of Ukrainian and Ukrainianisms, which makes me glad I studied a bit of the language a while back. The closest analogue I can think of in English would be a story set in the Border region of England with lots of Scots in the dialect.

Peevers in Paradise.

Matt of No-sword has a (cleverly titled) post about some linguistic descriptions he noticed in Margaret Mead’s Coming of Age in Samoa; first he points out that when she says the “immaturity” in use of language of a group of girls between ten and twenty years old “was chiefly evidenced by a lack of familiarity with the courtesy language, and by much confusion in the use of the dual and of the inclusive and exclusive pronouns,” what she observed may have been “just conflict between actual spoken Samoan versus some idealized form of the language that she had been taught was correct” — a very acute point. Then he quotes this footnote:

The children of this age already show a very curious example of a phonetic self-consciousness in which they are almost as acute and discriminating as their elders. When the missionaries reduced the language to writing, there was no k in the language, the k positions in other Polynesian dialects being filled in Samoan either with a t or a glottal stop. Soon after the printing of the Bible, and the standardisation of Samoan spelling, greater contact with Tonga introduced the k into the spoken language of Savai’i and Upolu, displacing the t but not replacing the glottal stop. Slowly this intrusive usage spread eastward over Samoa, the missionaries who controlled the schools and the printing press fighting a dogged and losing battle with the less musical k. To-day the t is the sound used in the speech of the educated and in the church, still conventionally retained in all spelling and used in speeches and on occasions demanding formality. The Manu’a children who had never been to the missionary boarding schools, used the k entirely. But they had heard the t in church and at school and were sufficiently conscious of the difference to rebuke me immediately if I slipped into the colloquial k which was their only speech habit, uttering the t sound for perhaps the first time in their lives to illustrate the correct pronunciation from which I, who was ostensibly learning to speak correctly, must not deviate. Such an ability to disassociate the sound used from the sound heard is remarkable in such very young children and indeed remarkable in any person who is not linguistically sophisticated.

Matt says, “I love this. Even in Mead’s tropical idyll, there are peevers.” I would also point out the absurdity of Mead’s “less musical k,” which she seems to take as a self-evident description.