Bezoar.

Victor Mair posts at the Log about a very interesting word I hadn’t given much thought to. A bezoar is “A hard indigestible mass of material, such as hair, plant fibers, or seeds, found in the stomach or intestine of animals, especially ruminants and sometimes humans”; when I first ran across the word decades ago, I had the same experience as Mair: “Inasmuch as I had never heard anyone speak the word, I just made up my own pronunciation, and it consisted of three syllables: beh-zoe-are.” Being me, I probably checked the etymology, saw it was from a French word spelled variously bezahar, bezoar, bezoard, etc., and thought “yup, three syllables, case closed.” But it turns out in English it’s generally pronounced with two: Brit. /ˈbiːzɔː/, U.S. /ˈbiˌzɔr/ (though the OED also gives Brit. /ˈbɛzəʊɑː/, so the trisyllabic pronunciation does exist). Mair links to videos of people who deal with the things professionally saying the word, and it’s definitely /ˈbiˌzɔr/. So that’s one thing I’ve learned today. The OED (an old entry, not updated since 1887) explains:

In 17th cent. English, as in French and Spanish, bezahar, bezaar was reduced to two syllables, bezar, beazar, beazer /ˈbeːzər/, of which the modern pronunciation would be regularly /ˈbiːzə(r)/. The spelling bezoar (for bezaär) appears to be of modern Latin origin; it has influenced the pronunciation given in dictionaries since the end of last century.

The other interesting feature is the etymology. AHD says:

[Middle English bezear, stone used as antidote to poison, probably from Old French bezahar, gastric or intestinal mass used as antidote to poison, from Arabic bāzahr, from Persian pādzahr : pād-, protector (from Avestan pātar-; see pā- in the Appendix of Indo-European roots) + zahr, poison (from Middle Persian; see gwhen- in the Appendix of Indo-European roots).]

In the comments at the Log, martin schwartz says (I’ve taken the liberty of replacing his Ø with an actual theta):

Let’s delve more fully into the etymology of bezoar: … from Arabic … from Middle Persian pādzahr < *pātijanθra- [...] nominalization with lengthening ("vrddhi") of initial vowel from adjective *pati-janθra- 'counter[i]ng poison', from preverb *pati- 'counter to, anti-' (nothing to do with √pā 'protect' etc.) and *janθra- 'poison' < √jan 'to kill' … The word is attested in Sogdian as */pātžār/; it is attested in a Buddhist text edited by Benveniste in his collection of Sogdian texts.

And of course Mair discusses “its odd-sounding Chinese name: niúhuáng 牛黃 (‘cow yellow’).”

Holloway and Hoelun.

Two items that have nothing to do with each other except the nicely chiming names:

1) I learn from Lev Oborin’s roundup (in Russian) of literary news that Julia Bolton Holloway claims to have discovered a manuscript in Dante’s hand. This seems like it would be big news, but Oborin links to a Daily Fail story that I didn’t even bother to click on, and when I googled [holloway dante manuscript] I got only a few hits, all from almost a month ago and with almost no details — this LitHub piece shows an image of the MS but has only three paragraphs of text; it links to a Times (UK) story that I can only read a few paragraphs of, and none of it seems very substantive. Anybody know anything about this?

2) Oborin quotes some lines by the poet Irina Kotova that refer to “великая оэлун” [great Oelun] (the poem uses no capital letters). I looked up Оэлун and discovered she was the mother of Temujin (Genghis Khan); the Russian represents Mongolian Өэлүн. The thing is that in English, and in almost all other languages shown in the Wikipedia sidebar, the name starts with H: Hoelun, Hö’elün, Höelün, Höelin, etc. What’s going on here?

Wag the Dog.

I’ve known the expression “the tail wagging the dog” for as long as I can remember, but I had no idea it had a specific origin; Dave Wilton explains it at Wordorigins.org:

The tail wagging the dog is a metaphorical expression for a minor part directing the actions of the whole. The metaphor is rather obvious, but unlike many such expressions, this one has a definitive origin. It comes from Tom Taylor’s play Our American Cousin, which was first performed in New York on 15 October 1858. The play was enormously popular in its day. So that it gave birth to a popular expression should be no surprise. But today the play is chiefly remembered for being the one that Abraham Lincoln was watching at Ford’s Theater when he was assassinated on 14 April 1865.

The relevant scene in the play goes as follows, a conversation between the characters of Lord Dundreary and Florence:
[I skip the first part of the exchange — LH.]

Dun Now I’ve got another. Why does a dog waggle his tail!

Flo Upon my word, I never inquired.

Dun Because the tail can’t waggle the dog. Ha! ha!

The metaphorical expression, with waggle shortened to wag, appears in print within five years of the play’s premiere. From the Milwaukee Daily Sentinel of 15 August 1863:

[Read more…]

Nineteen Years of Languagehat.

I’m shaking my head in disbelief as I write this: nineteen years! It started out as a goofy experiment — I was leaving comments on blogs, and bloggers were saying hey, why not blog, so I blogged — and I’ve kept soldiering on, even as almost all those early compadres have vanished into the mist, some to Facebook and some into the unknown (though I believe Squiffy-Marie “Des” von Bladet is still piginawigging away at Diaryland, wherever that is). As I do every year, I thank all those who drop by and participate in the ongoing conversation, without which the blog would be pointless (I have no desire to pontificate into thin air). I may well have learned more here than in grad school (I certainly remember more of what I’ve learned here), and I have no intention of stopping: next year (inshallah) the vicennial post!

Since I’m not planning to write separately about Trifonov’s posthumous novel Время и место [Time and place], I’ll just mention here that it was a great disappointment: considerably longer than the short novels that made his name, it is told by a self-centered narrator who talks about a self-centered writer who tries to write a novel about a writer writing a novel about a writer… Trifonov has fun mocking the self-reflexivity of it all, but the fact is that it doesn’t work, and in a way that makes me wonder about his place in the literary scheme of things. I get the impression he doesn’t care about people themselves so much as the moral quandaries people find themselves in, so he keeps introducing batches of new characters who get themselves into scrapes involving parents, lovers, teachers, and publishers; the situations might be interesting, but he gives us no reason to care about the people involved, so it becomes more and more tedious as the novel goes on. I was glad when it was over.

And for lagniappe, here’s a paragraph that tickled me from Lauren Collins’ New Yorker “Paris Postcard” interview with the French actress Camille Cottin:

Cottin spoke with a light British accent, a legacy of living in London as a teen-ager. After high school, she studied American and English literature at the Sorbonne; her thesis was on “Harry Potter.” She also taught English to teen-agers. “I was terrible,” she said. “I had all the seventeen-year-olds who were completely high on pot, so no one would ever answer any of my questions. It was like forty red-eyed rabbits just staring at me.” She added, “I didn’t want to say if I didn’t know something, because I would lose my credibility, so I started inventing words. One day, a girl says, ‘How do we say chirurgie esthétique?’” Cottin was stumped. “So I go, ‘Surgical aestheticism.’” She went home and looked it up in the dictionary, and the next day said to the student, “What I told you is the American way, but the English way is ‘plastic surgery.’”

C’est ça l’astuce! OK, time to go, à demain, same time same station, pip pip cheerio, don’t take any wooden nickels, and always look on the bright side of life!

Digital Dostoevsky.

At Bloggers Karamazov (The Official Blog of The North American Dostoevsky Society), Kate Holland posts about a promising project:

Digital Dostoevsky is a computational text analysis project on a corpus of five novels and two novellas by Fyodor Dostoevsky. It is a digital humanities project which emerges out of our long-standing interest in traditional philological analysis. We are excited by how digital approaches such as TEI encoding, machine reading, and natural language processing can help to answer questions about the deep structure of Dostoevsky’s novels, questions about speech, character, space, temporality, affect, and fictionality, among other areas. […]

Computational text analysis has flourished in the last few years and many 19th-century writers now have their own digital editions and digital archives. In the Russian context, computational text analysis seems like a natural fit, since Russian scholarship has a long tradition of textology; academic editions of canonical Russian works were produced with painstaking care by teams of editors throughout the Soviet period and beyond. Russia also has a strong tradition of computational methods in linguistics. The research questions which motivate our project are the same ones which scholars have been asking about Dostoevsky’s works for decades. Machine reading opens up possibilities for examining Dostoevsky’s corpus using technologies which neither the Formalists nor Bakhtin had at their disposal. Dostoevsky’s works are already available online. There is a wonderful digital edition of Dostoevsky’s Complete Works based at Petrozavodsk State University in Karelia. This edition includes a digital concordance that can be used to parse the corpus. […]

Our plain text corpus documents are taken from the canonical Soviet Academy of Sciences 30-volume edition of the Complete Works of Dostoevsky. We stripped the texts of their commentary and converted them to plain text files. So far, our corpus consists of five novels and two novellas: The Double, Notes from Underground, Crime and Punishment, The Idiot, Demons, The Adolescent, and The Brothers Karamazov. We may eventually add to them with the rest of Dostoevsky’s works, as well as adding translations in English and possibly even French.

We are in the process of XML tagging our corpus using TEI. So far, we’ve manually tagged The Double (Dvoinik). We started with basic TEI tagging (paragraphs, speech, named entities), and have moved on to places, direct and indirect speech, addresser and addressees, and liminal spaces and states. […]

The Digital Dostoevsky project can be found on the website digitaldostoevsky.com. We will be blogging as we go along, so check out our website and subscribe to get our updates!

I know that simply being able to search online texts has deepened my understanding of many works and authors, and I imagine this sort of tagging will enable a lot of useful research.
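
For anyone wondering what tagged text makes possible that plain search doesn't, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the fragment is invented placeholder text (not from The Double), the speaker ids are hypothetical, and I have no idea what the project's actual schema looks like. The only things borrowed from real life are the TEI namespace and the TEI `said` element with its `who` and `direct` attributes.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Invented placeholder fragment; the speaker ids are hypothetical,
# and this is NOT the Digital Dostoevsky project's actual markup.
SAMPLE = """
<p xmlns="http://www.tei-c.org/ns/1.0">
  <said who="#character_a" direct="true">Placeholder line one.</said>
  A stretch of narration with no speech in it.
  <said who="#character_a" direct="false">Placeholder line two.</said>
  <said who="#character_b" direct="true">Placeholder line three.</said>
</p>
"""

# ElementTree spells namespaced tags as "{namespace}localname".
TEI = "{http://www.tei-c.org/ns/1.0}"


def speech_counts(xml_text, direct_only=False):
    """Count tagged speeches per speaker id in a TEI-style fragment."""
    root = ET.fromstring(xml_text.strip())
    counts = Counter()
    for said in root.iter(TEI + "said"):
        # Optionally skip speeches not marked as direct.
        if direct_only and said.get("direct") != "true":
            continue
        counts[said.get("who")] += 1
    return counts


if __name__ == "__main__":
    print(speech_counts(SAMPLE))
    print(speech_counts(SAMPLE, direct_only=True))
```

A plain-text search can find a word, but it can't tell you who is speaking or whether the speech is direct; once that information is in the markup, a question like "how much of this character's speech is reported indirectly?" becomes a few lines of counting.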

The Origin and Evolution of Word Order.

OP Tipping at Wordorigins.org posted Murray Gell-Mann and Merritt Ruhlen’s 2011 PNAS article The origin and evolution of word order, whose abstract reads:

Recent work in comparative linguistics suggests that all, or almost all, attested human languages may derive from a single earlier language. If that is so, then this language—like nearly all extant languages—most likely had a basic ordering of the subject (S), verb (V), and object (O) in a declarative sentence of the type “the man (S) killed (V) the bear (O).” When one compares the distribution of the existing structural types with the putative phylogenetic tree of human languages, four conclusions may be drawn. (i) The word order in the ancestral language was SOV. (ii) Except for cases of diffusion, the direction of syntactic change, when it occurs, has been for the most part SOV > SVO and, beyond that, SVO > VSO/VOS with a subsequent reversion to SVO occurring occasionally. Reversion to SOV occurs only through diffusion. (iii) Diffusion, although important, is not the dominant process in the evolution of word order. (iv) The two extremely rare word orders (OVS and OSV) derive directly from SOV.

OP says “my question is, is the basic premise correct? Are there no known cases of SOV evolving from other languages?” I responded:

I’m morally certain it’s bullshit, like all statements about “the first human language,” but I’ll post it at LH and see what people have to say.

So: what say you?

Katakana Parade of Nations.

A timely post by Joel at Far Outliers:

I don’t remember how Japan ordered the Parade of Nations when it hosted the Olympics in 1964 (when I was in high school there), but this year the nations were ordered according to how their Japanese names sounded in katakana, the Japanese syllabary used to render foreign names. A full list of the nations in Japanese order can be found in the NPR report about the parade.

Katakana order was used even when names contained kanji (Chinese characters). So Equatorial Guinea (赤道ギニア Sekidou Ginia, lit. ‘Redroad [=equator] Guinea’) appeared between Seychelles (セーシェル) and Senegal (セネガル) because they all start with the sound SE, written セ in katakana.

I wondered about that while watching the coverage; of course I knew the general principle, but I wanted to know the details without going to any trouble, so this is perfect. The whole thing is interesting, but perhaps especially so are these paragraphs about consonants with diacritics:
[Read more…]
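
The ordering Joel describes can be roughly sketched in Python. This is a toy under stated assumptions, not whatever the organizers actually did: the katakana readings are typed in by hand (セキドウギニア for 赤道ギニア is my transcription of "Sekidou Ginia"), the function names are made up, and sorting by Unicode code point only approximates proper kana order, since voiced and small kana are interleaved with the plain ones. The one real wrinkle it handles is the long-vowel mark ー, which has to be read as the vowel of the preceding kana or Seychelles lands in the wrong place.

```python
# Kana grouped by vowel, used only to resolve the long-vowel mark "ー".
VOWEL_COLUMNS = {
    "ア": "アカサタナハマヤラワガザダバパァャ",
    "イ": "イキシチニヒミリギジヂビピィ",
    "ウ": "ウクスツヌフムユルグズヅブプゥュ",
    "エ": "エケセテネヘメレゲゼデベペェ",
    "オ": "オコソトノホモヨロヲゴゾドボポォョ",
}
VOWEL_OF = {
    kana: vowel
    for vowel, column in VOWEL_COLUMNS.items()
    for kana in column
}


def sort_key(reading):
    """Replace each long-vowel mark with the vowel of the kana before it."""
    out = []
    for ch in reading:
        if ch == "ー" and out:
            # Fall back to the mark itself if the previous kana is unknown.
            out.append(VOWEL_OF.get(out[-1], ch))
        else:
            out.append(ch)
    return "".join(out)


def parade_order(entries):
    """Sort (name, katakana reading) pairs by the normalized reading."""
    return sorted(entries, key=lambda entry: sort_key(entry[1]))


# Hand-entered readings for the three nations mentioned in the quoted post.
NATIONS = [
    ("Senegal", "セネガル"),
    ("Equatorial Guinea", "セキドウギニア"),
    ("Seychelles", "セーシェル"),
]

if __name__ == "__main__":
    for name, reading in parade_order(NATIONS):
        print(name, reading)
```

With the long-vowel mark expanded, セーシェル sorts as セエシェル, so the three come out as Seychelles, Equatorial Guinea, Senegal, matching the order in the quoted post. A naive code-point sort without that step would put Seychelles last, because ー sits after all the ordinary katakana in Unicode.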

Another Language Challenge.

Norbert Wierzbicki has posted another Guess The Language Challenge video; this one features Julie Maksimova, a Latvian language lover. She was very good (if perhaps excessively tentative) and got all the answers right, and this time he shows the texts in written form afterwards, which is great. I got all the answers right (large Cup of Satisfaction!); I got 1, 2, 3, and 6 easily, 4 with the help of the map Julie requested, and 5 only by the same kind of desperate guessing she used. These things are tremendous fun, and if you enjoyed the last one you will certainly like this. After people have had a chance to watch it and make their own guesses, I will add an interesting fact about one of the words he mentions from the first language.

Paisa.

Alexander Jabbari, an assistant professor of Persian language and literature, examines the spread of a word I personally hadn’t given much thought to:

[…] But one currency that expresses the shared past of an entire continent is the Omani rial, which is divided into baisa, a Hindi/Urdu loanword (paisa) with roots in Sanskrit. The word paisa, in fact, is spread all over the Indian Ocean world, from Myanmar to Mauritius and nearly everywhere in between.

I first noticed this word while reading the Egyptian author Sonallah Ibrahim’s novel Warda, about the Dhofar Rebellion in Oman. It’s more common to find Arabic loanwords in Hindi and Urdu than the reverse; Hindustani – the umbrella term linguists use to cover both Hindi and Urdu – uses thousands of Arabic words, which entered the language through Persian. These include everyday vocabulary like insan (person) or dunya (world).

But a word like baisa in Omani Arabic is no surprise. Like many Arabic dialects of the Gulf, it features loanwords from Hindustani, Persian, English, Portuguese, Swahili and other languages, revealing the linguistic routes of the Indian Ocean. In this case, paisa became baisa because Arabic generally lacks the “p” sound. […]

[Read more…]

Guess The Language.

Another language quiz, this time thanks to Norbert Wierzbicki, who posts Guess The Language Challenge videos to YouTube; this one is twenty minutes long and features Raphael Turrigiano, an American studying linguistics in Scotland. Raphael was great, guessing all six (twice with the help of clues) and winning the large Cup of Satisfaction. Me, I got four of the six (thus winning the small Cup of Satisfaction) — 1 and 4 instantly, 6 by the end of the sample, and 3 with the help of the “fact” (which is how Raphael got it as well); for 5 I was close but no cigar, and with 2 I hadn’t a clue (which was kind of embarrassing once I heard the answer; I would have gotten it in written form, but clearly I’ve never heard it spoken). It’s a great quiz, in that the languages are all fair game (no obscure little languages) and the samples are long enough to give you a fighting chance; if this is the kind of thing you like, you will definitely like it. A tip of the Language Hat to villanousbead, who posted it at Wordorigins.org.