Multilingual Parallel Bible Corpus.

This is excellent:

Here you can find a multilingual parallel corpus created from translations of the Bible. This is an effort to create a parallel corpus containing as many languages as possible that could be used for a number of NLP tasks. Using the Book, Chapter and Verse indices the corpus is aligned (almost) at a sentence level. (There are cases where two verses in one language are translated as one in another.)

Following a similar effort by Philip Resnik and Mari Broman Olsen at the University of Maryland (website) I have encoded the text of each language in XML files using the Corpus Encoding Standard.

The following table contains the XML Bibles in 100 languages (all the languages that an electronic version was freely available online) along with information about each language from Ethnologue.
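For the curious, the verse-index alignment described in that passage can be sketched in a few lines of Python. This is only a minimal illustration under my own assumptions, not the corpus author's code: it assumes each Bible has already been read into a dictionary keyed by (book, chapter, verse), and the names `align_verses`, `VerseKey`, and `SAMPLE` are hypothetical. It does not parse the XML files, and it simply drops any verse missing from one of the languages, which is one crude way of coping with the merged-verse cases mentioned above.

```python
from typing import Dict, List, Tuple

# Hypothetical key type: (book code, chapter number, verse number).
VerseKey = Tuple[str, int, int]


def align_verses(
    bibles: Dict[str, Dict[VerseKey, str]]
) -> List[Tuple[VerseKey, Dict[str, str]]]:
    """Return the verses present in every language, grouped by verse key.

    `bibles` maps a language code to that language's verses.
    Verses missing from any language are left out of the result.
    """
    if not bibles:
        return []
    # Keep only the keys that every language has.
    shared = set.intersection(*(set(verses) for verses in bibles.values()))
    return [
        (key, {lang: verses[key] for lang, verses in bibles.items()})
        for key in sorted(shared)
    ]


# Toy data for illustration only; the real corpus stores text in XML files.
SAMPLE = {
    "eng": {
        ("GEN", 1, 1): "In the beginning God created the heaven and the earth.",
        ("GEN", 1, 2): "And the earth was without form, and void.",
    },
    "lat": {
        ("GEN", 1, 1): "In principio creavit Deus caelum et terram.",
    },
}

if __name__ == "__main__":
    for key, texts in align_verses(SAMPLE):
        print(key, texts)
```

Taking the intersection of keys keeps the sketch short; a more careful version would have to decide what to do when one language folds two verses into one.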

Another gem from bulbul’s Facebook feed!

Russian Man Appointed Irish Language Officer.

Seán Mac an tSíthigh reports on a delightful development:

A Russian man has been appointed as an Irish language officer in a Kerry Gaeltacht and will spearhead attempts to revive the language there.

Victor Bayda, a native of Moscow, has taken up the post with Comhchoiste Ghaeltacht Uíbh Ráthaigh, a community organisation in the south Kerry Gaeltacht of Uíbh Ráthach. Mr Bayda is fluent in Irish and has been teaching the language in a Moscow university for more than ten years. He speaks up to ten languages including Dutch, Scots Gaelic, Welsh, Swedish, French, German and Icelandic. He was awarded a PhD for a thesis that dealt with aspects of the Irish language.

Mr Bayda, who made the journey from Moscow to Kerry at the weekend, says he began learning the language while attending university, but that he was aware of its existence as a young teenager. He said: “I have had an interest in languages since I was 13, especially the Celtic languages. I had learned some Welsh and Scots Gaelic by the time I went to university. It was then I discovered that Irish was available and I signed up for the course. I also picked up a lot of Irish from listening to the language on TG4 and Raidió na Gaeltachta. Then I had an opportunity to study in Trinity College where I first heard Irish as a living, breathing language.” […]

Mícheál Ó Leidhin of Comhchoiste Ghaeltacht Uíbh Ráthaigh says he understands why the appointment of a Russian as a language planner in a Kerry Gaeltacht might raise a few eyebrows, but that the committee is delighted that Mr Bayda accepted the offer. Mr Ó Leidhin said: “Victor is one of the finest Irish speakers you’ll ever meet. Completely fluent. He is highly qualified and possesses tremendous expertise in the whole area of language planning. These are skills we badly need in this area if Irish as a community language is to be saved. We are confident that we have found the right man.”

Of course, he might fit in easier if he changed the spelling of his surname to, say, Baighdagh, or at least Baída.

Accepting Madame la ministre.

Henry Samuel reports on a slight budging on the part of the preservers of French linguistic tradition:

For centuries, members of the hallowed Académie Française – created in 1635 to “fix the French language, giving it rules, rendering it pure and comprehensible by all” – had refused to accept that words such as “professeur” (teacher) or “ingénieur” (engineer) be made “professeure” or “ingénieure” for women.

“The Immortals”, as académiciens are known, had repeatedly argued that to add an “e” to such male titles would “end up with proposals that are contrary to the spirit of the language”.

The cause appeared lost when Hélène Carrère d’Encausse became the Académie’s first ever female perpetual secretary in 1999 and announced she would be referred to as “Madame le secrétaire perpétuel”, in the masculine form. She also opposed “la ministre” (a female minister), preferring “Madame le ministre”. The argument was that gender had nothing to do with job title.

But the institution, which has faced recent accusations of linguistic sexism, has changed tack after placing its entire dictionary online for the first time this month.

Since then, Ms Carrère d’Encausse has already given some ground, telling Le Figaro: “There are things that enter usage, such as ‘Madame la ministre’. ‘La ministre’ is not a problem.” However, she said she drew the line at “écrivaine” (a female writer) on the grounds that “it’s very ugly”.

But according to l’Express, the Académie will announce on February 28 its intention to include “feminised” versions of such occupations alongside the longstanding masculine nouns.

They even quote an actual linguist, Bernard Cerquiglini, to the effect that the Académie’s position had become “untenable.” (Thanks, Martin!)

Texting in Novels.

Jemma Slingo has an interesting piece in Prospect, beginning with the observation that “Our lives are filled with texts, emails and instant messages […] It is strange, therefore, that novelists—who deal in dialogue and social drama—are on the whole not paying more attention to this new method of communication.”

This is not the case in all new writing. Sally Rooney embeds online chat in her prose to great effect, as does Ben Lerner in his debut novel Leaving the Atocha Station. Elif Batuman’s The Idiot, set in the mid-90s, spotlights the weirdness of email, and Olivia Laing’s Crudo satirises our newfound obsession with screens.

Even these novels, however, reveal—deliberately or otherwise—how difficult it is to integrate text talk in a piece of fiction.

What is it about electronic utterances, then, that makes them so troublesome for novelists? Why are they a problem to be solved? […]

When a character talks to someone face-to-face or over the phone, novelists are free to imagine their tone of voice, accent, gestures, emphasis and body language. Spoken exchanges can be imbued with richness and texture. But when characters chat via screen, all they do is press “send,” leaving no room for authorial embellishment. The dialogue just lies on the page like a film script. […]

Robert Willig, RIP.

Right after my wife and I moved to this neck of the woods, I made the rounds of the local bookstores, and my hands-down favorite was Troubadour Books, then in North Hatfield, just across the river; I wrote about my first visit back in 2007, featuring its generous and knowledgeable owner, Bob Willig. A few years ago I posted that Sam Burton of Grey Matter Books would be running the store (now in Hadley), since “Willig is blind and has been in bad health.” And this morning I was saddened to read his obituary in the local paper:

Bob was an expert on football, baseball, basketball, jazz, blues, barbeque, classic Jewish humor, Medieval philosophy, and the Beats. He was also a harmonica virtuoso, bibliophile, gourmet, bon vivant, raconteur, political radical, anarchist, world traveler, and even once studied to be a clown. […] He attended New York University, the University of Oregon and graduated from George Washington University with a degree in Comparative Religion and a minor in cinematography.

When Bob opened his bookstore, Troubadour Books (for Scholars and Holy Fools) in 1995 in North Hatfield, Massachusetts he built a collection that attracted scholars and buyers from California to England. He was famous for his book sales and always extended generous discounts to any who asked, with maybe a shot of Maker’s Mark bourbon from below the counter, just for good measure. […]

From his voracious reading and book collecting he built up a huge personal library reflecting his many interests from Dante to Ginsberg and from Dario Argento to Alfred Hitchcock. When his groaning shelves could hold no more it was time to open his own bookshop and he used his collection as the basis for his legendary bookshop, Troubadour Books.

Troubadour was more of an open house or salon than a typical store. Bob and Toni were always ready to sit and talk and share stories. Their friends were always dropping by. How they managed to run such a neat and tidy bookshop that was bursting at the seams at the same time was a wonder. They made it look easy, as if they were born to this life. When Bob began to go blind in 2012 Sam Burton purchased and merged Troubadour with his shop, Grey Matter, in Hadley, MA. His shop is still open and thriving and is a worthy successor to Bob’s legacy.

I personally wouldn’t have called it “neat and tidy,” but it was wonderful, and of all the bookstore owners I’ve known he may have been the very best. Alevasholem.

The British-Irish Dialect Quiz.

Courtesy of the New York Times, this quiz by Josh Katz is a lot of fun and it’s only 25 questions. Before you start, they give you two options: “I was raised in Ireland or the U.K.” and “I wasn’t raised there, but I want to play anyway!” I got the following accurate result: “Definitely not from around here are you? Your answers were closer to the average person outside of Ireland and Britain than anywhere inside it.” Katz says:

Constructing this quiz involved consulting previous research from linguistic experts in and around Britain and Ireland. The Survey of English Dialects, the BBC Voices project and several books on English linguistics — particularly Language in the British Isles and Studies in Linguistic Geography — proved especially useful.

Thanks, Eric and jack!

Two from bulbul.

Or, more specifically, from his Facebook feed:

1) Sanna: A language written for the first time, a video report by Nikolia Apostolou of BBC Travel. Note that bulbul objects to the tagline:

Aaaagh. No, it’s not a “mix” of Arabic and Aramaic; there are some traces of Aramaic in Sanna (or Cypriot Maronite Arabic, as scholarly literature refers to it), but those have been there since the language was brought to Cyprus from the Levant. Cypriot Maronite Arabic is a variety of Arabic that has been in intensive contact with Greek, much in the same way Maltese is a variety of Arabic that has been in intensive contact with Sicilian and Italian.
Still a cool report, though.

2) 10,000 Arabic books have been digitized to ebooks, by Michael Kozlowski:

The Arabic Collections Online project has just formed and they are making available 10,000 Arabic ebooks across 6,000 subjects for free. […] This mass digitization project aims to feature up to 23,000 volumes from the library collections of NYU and partner institutions. These institutions are contributing published books in all fields—literature, business, science, and more—from their Arabic language collections.

The mission statement behind the ACO aims to digitize, preserve, and provide free open access to a wide variety of Arabic language ebooks in subjects such as literature, philosophy, law, religion, and more. Important Arabic language content is not widely available on the web, and ACO aims to ensure global access to a rich Arabic library collection. Many older Arabic books are out-of-print, in fragile condition, and are otherwise rare materials that are in danger of being lost. ACO will ensure that this content will be saved digitally for future generations.

The ebooks can be read online through any major internet browser and are also available as high-resolution PDF files. This means each book can be read on any e-reader and can be sideloaded from your PC directly to your device.

I approve.

Narwhal.

I always thought a narwhal was, etymologically, a “corpse whale”; if you’re not familiar with that etymology, Stan at Sentence first provides links to several essentially identical derivations from Old Norse náhvalr. But he complicates the picture by mentioning other “speculative origin stories,” and quotes the OED’s comprehensive etymology:

Probably < Danish narhval, cognate with Norwegian narkval, Swedish narval (1754), and further cognate with Old Icelandic náhvalr < a first element of uncertain origin (perhaps < nár corpse: see need n., with reference to the colour of the animal’s skin; or perhaps shortened < nál needle n., with reference to the straight tusk) + hvalr whale n.; the epenthetic –r– in the Norwegian, Swedish, and Danish forms has not been satisfactorily explained (see note below). Compare Middle French nahual (1598; French †narhual (1647), †narwal (1676), narval (1723)), Spanish narval (1706), Italian narvalo (1745), Dutch narwal (1769), German Narwal (18th cent.), all ultimately borrowings from Scandinavian.

Alternative etymologies connect the first element with the Germanic base of either nase n. or narrow adj.; both of these suggestions assume that forms with –r– are primary, and that forms without –r– (the earliest attested forms) are alterations by folk etymology, after Old Icelandic nár corpse.

The Russian word is, unsurprisingly, нарвал [narval].

The Spleendrake.

I’m getting to the end of Bunin’s Деревня [The Village], which made him famous in Russia when it was published in 1910; I almost gave up on it because the first part, about the greedy, brutal Tikhon Krasov, was so depressing (it reminded me of Grigorovich’s 1847 Антон Горемыка [Unlucky Anton], another life-sucks-and-everybody-suffers story), but the second part, about his poetry-loving brother Kuzma, was a little less gloomy, so I kept going. It’s not easy reading, being full of specialized and dialectal words and expressions, so I have to keep checking Dahl and other references, and occasionally I’ll consult the 1923 translation by Hapgood. She can be helpful, but there’s a reason I called her “the hapless Isabel F. Hapgood” back in 2017, and I’ve come to a passage (beginning in the Russian text “Чтобы согреться, он выпил водки и посидел перед жарко пылающей печкой” [Hapgood: “With a view to warming himself up, Kuzma drank some vodka and seated himself in front of the hotly flaming oven”]) that contains two howlers in succession. The first amused me but didn’t drive me to post: the hut contains an image of продажа братьями Иосифа [the sale of Joseph by his brothers], which is translated “manufactured by the Josif Brothers.” But then a couple of sentences later Kuzma decides to go see Tikhon, and Bunin says his gelding ran quickly, екая селезенкой, which Hapgood renders “emitting roaring and quacking sounds, like a drake”! Now, the funny-sounding verb ёкать [yokat’] can mean several things, mainly ‘to emit abrupt hiccup-like sounds’ or (of a heart) ‘to skip a beat,’ but never “roaring and quacking,” and селезёнка [selezyonka], though it looks very similar to селезень [selezen’] ‘drake,’ is an entirely different word meaning ‘spleen’ (and is in fact probably a cognate of Greek σπλήν, from which we get spleen). 
Had she bothered to consult Dahl, Hapgood would have learned that у лошади селезенка бьется [the horse’s spleen is beating], of which this is clearly a variant, means “на бегу жидкость пахтается вслух в желудке” [while running, liquid is audibly churning in its stomach]. Translation is hard, and I don’t want to be too hard on Isabel, but you should realize that something has gone wrong when you find yourself writing about a horse emitting roaring and quacking sounds, like a drake.

The Letters of Flaubert and Turgenev.

Back in 1985 the NY Times published a pretty extensive selection from the correspondence of Flaubert and Turgenev, and I recommend it to one and all; it’s full of gossip, amusing remarks, and insights. Some snippets:

Turgenev from Paris to Flaubert on Nov. 24, 1868: […] P.P.S. Find another title. “Sentimental Education” is wrong.

Turgenev from Paris to Flaubert on Nov. 27, 1878: I have just turned 60. This is the start of the tail end of life. A Spanish proverb says that the tail is the hardest part to flay. At the same time it’s the part that gives least pleasure and satisfaction.

Flaubert from Croisset to Turgenev on Jan. 21, 1880: Thank you for making me read Tolstoy’s novel [War and Peace]. It’s first-rate. What a painter and what a psychologist! The first two [volumes] are sublime; but the third goes terribly to pieces. He repeats himself and he philosophizes! In fact the man, the author, the Russian are visible, whereas up until then one had seen only Nature and Humanity. It seems to me that in places he has some elements of Shakespeare. I uttered cries of admiration during my reading of it . . . and it’s long! Tell me about the author. Is it his first book? In any case he has his head well screwed on! Yes! It’s very good! Very good!

Very good indeed. (Thanks, Steven!)