Quantitative Methods in Historical Linguistics.

Barbara McGillivray and Gard B. Jenset, authors of Quantitative Historical Linguistics: A Corpus Framework, summarize some of their ideas for OUPblog:

Linguistics generally has seen an increase in the use of corpora and quantitative methods over the recent years. Yet journal publications in historical linguistics are less likely to use such methods. Part of the explanation is no doubt the advantage that linguistics for extant languages holds regarding greater availability of annotated text corpora and people who can answer questionnaires or take part in experiments. Yet this can only be part of the explanation. […]

It is reasonable to look to cultural explanations for this. After all, the technical barriers keep getting lower and the availability of resources keep increasing. So what is special about historical linguistics? For one thing, historical linguistics (at least if we consider the historical-comparative method) has a very long, very stable, and very successful history. The methodological core of the historical-comparative method has proved remarkably stable over time.

Furthermore, there is a history of failed attempts at using quantitative methods in historical linguistics. In some cases, such techniques have been tested and simply failed to work, as one would expect in any scientific endeavour. In other cases, the lack of extensive quantitative modelling by historical linguists have enticed scholars from other fields, with experience in statistical models, to step in and fill that gap. These endeavours have met with mixed reactions from mainstream historical linguistics.

What seems to be missing is a positive case for using quantitative methods in historical linguistics, on the premises of historical linguistics. That, in our view, is the only way that quantitative techniques can properly cross the chasm into adoption in mainstream historical linguistics. Such a positive case must go well beyond training manuals or statistics classes. Instead, the intellectual footwork for integrating numbers with the core questions that historical linguistics faces must be done.

It’s certainly true about the failed attempts; I’d love to see the positive case they suggest. If well done, quantitative techniques could surely help.

Words Where You Are.

The OED has an appeal I want to help spread:

How we speak can reveal where we are from: not just our accent, but the language we use. Words and phrases particular to a city, region, or country are a distinctive part of English, and we at the OED are asking you to help us identify and record them.

Most of us have experience of using a familiar term in unfamiliar circumstances and being met with a blank stare. Many of us can recall a moment when a word we’ve known and used for years at home turns out to be baffling to people from other parts of our own country, or from another English-speaking region. If a picture is hanging askew, would you say that it is agley, catawampous, antigodlin, or ahoo? At the beach, do you wear flip-flops – or would you refer to them as zoris, jandals, or slipslops? Would you call a loved one your doy, pet, dou-dou, bubele, alanna, or your babber? Many such words are common in speech, but some are rarely written down, so they can easily escape the attention of dictionary editors.

Whether you’re in Manchester, Mumbai, Manila, or Massachusetts, the OED would like to hear from you. Please use the form below to tell us about the words and expressions which are distinctive to where you live or where you are from. We’re looking forward to reading your suggestions. You can also join the conversation on Twitter with the hashtag #wordswhereyouare

My wife and I have used ahoo ever since reading the Aubrey/Maturin novels. I got the link from Vesihiisi’s MetaFilter post, where one commenter correctly points out that “The challenge is knowing what perfectly ordinary words you use in your everyday life are actually ‘regionally distinctive words’” and another praises the Southern US word “tump” (“When something tumps, it doesn’t just dump over. There’s a moment of precariousness, in which you hope desperately that the object in mid-tump might right itself and settle back down, but nope, nope, over it goes. Also, there’s a really weird unspoken context that matters. Boats capsize; canoes tump. Tricycles can tump, but bicycles cannot”). So send ’em your own!

Google Translate Flunks in Court.

Devin Coldewey reports for TechCrunch on an interesting legal ruling in the case of Omar Cruz-Zamora, who was pulled over by cops in Kansas and found to be in possession of drugs:

Cruz-Zamora doesn’t speak English well, so the consent to search the car was obtained via an exchange facilitated by Google Translate — an exchange that the court found was insufficiently accurate to constitute consent given “freely and intelligently.” […]

For example, the officer asked “¿Puedo buscar el auto?” — the literal meaning of which is closer to “can I find the car,” not “can I search the car.” (Note: these translations were what were put forth in the case, not my own — I don’t speak Spanish. As commenters below note, it’s more like “can I search for the car,” which is very different.) There’s no evidence that Cruz-Zamora made the connection between this “literal but nonsensical” translation and the real question of whether he consented to a search, let alone whether he understood that he had a choice at all.

With consent invalidated, the search of the car is rendered unconstitutional, and the charges against Cruz-Zamora are suppressed. […]

Providers of machine translation services would have us all believe that those translations are accurate enough to use in most cases, and that in a few years they will replace human translators in all but the most demanding situations. This case suggests that machine translation can fail even the most basic tests, and as long as that possibility remains, we have to maintain a healthy skepticism.

I’m not qualified to comment on the legal issues, but as Languagehat I thoroughly approve of the decision from a linguistic point of view, and of Coldewey’s insistence on the need for “a healthy skepticism.” Thanks, Kobi!

Slovene Dialects.

Joel of Far Outliers has posted another excerpt from Lingo: Around Europe in Sixty Languages, by Gaston Dorren (see this LH post), this time about Slavic dialects, or rather the lack of significant dialects in all languages but one:

Whether they’re from the Baltic port of Kaliningrad or from Vladivostok on the Sea of Japan, there’s little difference in the way Russians speak. In Poland, the same holds true: North Poles and South Poles can chat away effortlessly to each other, as can West and East Poles. Even people speaking different Slavic languages can often communicate without much trouble. Bulgarians can converse with Macedonians, Czechs with Slovaks, and Russians with Belarusians and Ukrainians. And, for all their political differences, there is no great language barrier between Croats, Bosnians, Serbs and Montenegrins. In fact, as the eminent nineteenth-century Slovak scholar Ján Kollár suggested, the Slavic world could, with no great effort on the part of its citizens, adopt just four standard languages: Russian, Polish, Czechoslovak and, lastly, what you might call Yugoslav or South Slavic.

There is one language, however, that wouldn’t so easily be absorbed into Kollár’s scheme: Slovene, also known as Slovenian. Admittedly, this is the language of a very small nation. Its entire territory fits no fewer than twelve times into the area of the UK (which is itself not large) and the population, at just over two million, is just a quarter of that of London. And yet, when Slovenes speak their local dialects, many of their compatriots can make neither head nor tail of what they are saying. So just imagine how these dialects would bewilder the members of some of the other nations that Kollár lumped together as ‘South Slavic’, such as the Bulgarians.

How come? Why does Russian span more than four thousand miles from west to east with next to nothing in the way of dialect diversity, whereas the Slovene language area, measuring just two hundred miles from end to end, is a veritable smorgasbord of regional varieties?

A good question!

How to Ask for a Drink in Subanun.

I was looking for something else (as I usually am) when I found my old copy of The Pleasures of Anthropology, edited by Morris Freilich; I opened it curiously and realized it’s one of the many books I bought because it looked interesting and never got around to reading. Naturally I turned to the section Human Communication, and was immediately drawn to “How to Ask for a Drink in Subanun,” by Charles O. Frake (American Anthropologist 66.6, Part 2 [Dec. 1964]: 127-132); the first couple of paragraphs are thought-provoking enough that I thought I’d reproduce them here:

WARD GOODENOUGH (1957) has proposed that a description of a culture — an ethnography — should properly specify what it is that a stranger to a society would have to know in order appropriately to perform any role in any scene staged by the society. If an ethnographer of Subanun culture were to take this notion seriously, one of the most crucial sets of instructions to provide would be that specifying how to ask for a drink. Anyone who cannot perform this operation successfully will be automatically excluded from the stage upon which some of the most dramatic scenes of Subanun life are performed.

To ask appropriately for a drink among the Subanun it is not enough to know how to construct a grammatical utterance in Subanun translatable in English as a request for a drink. Rendering such an utterance might elicit praise for one’s fluency in Subanun, but it probably would not get one a drink. To speak appropriately it is not enough to speak grammatically or even sensibly (in fact some speech settings may require the uttering of nonsense as is the case with the semantic-reversal type of speech play common in the Philippines. See Conklin 1959). Our stranger requires more than a grammar and a lexicon; he needs what Hymes (1962) has called an ethnography of speaking: a specification of what kinds of things to say in what message forms to what kinds of people in what kinds of situations. Of course an ethnography of speaking cannot provide rules specifying exactly what message to select in a given situation. If messages were perfectly predictable from a knowledge of the culture, there would be little point in saying anything. But when a person selects a message, he does so from a set of appropriate alternatives. The task of an ethnographer of speaking is to specify what the appropriate alternatives are in a given situation and what the consequences are of selecting one alternative over another.

Ward Goodenough was an important anthropologist; I note with bemusement that that Wikipedia article gives one pronunciation in IPA and a different one in the respelling of his (superb) surname, and someone who knows which is correct should fix it. The Subanon (Wikipedia’s preferred spelling) live in the Zamboanga peninsula area of Mindanao Island, Philippines; they speak, obviously, the Subanon language. And Stan Carey’s recent post The Speech Community seems relevant.

Dogs and the koryos.

Eric A. Powell, online editor at Archaeology, reports on an attempt to combine archaeology and historical linguistics:

Around 4,000 years ago, on the steppes north of the Black Sea, a nomadic people began settling down in small communities. Known today as the Timber Grave Culture, these people left behind more than 1,000 sites. One of them is called Krasnosamarskoe, and Hartwick College archaeologist David Anthony had big expectations for it when he started digging there in the late 1990s. Anthony hoped that by excavating the site he might learn why people in this region first began to establish permanent households. But he and his team have since discovered that Krasnosamarskoe has a much different story to tell. They found that the site held the remains of dozens of butchered dogs and wolves—vastly more than at any comparable site.

Anthony and his wife, archaeologist Dorcas Brown, who are interested in combining linguistic and mythological evidence with archaeological evidence, searched the literature on Indo-European ceremonies, and Brown “found that historical linguists and mythologists have long linked dog sacrifice to an important ancient Indo-European tradition, the roving youthful war band.”

The Bookshelf: Kolyma Stories.

New York Review Books has sent me a review copy of Kolyma Stories by Varlam Shalamov, newly translated by Donald Rayfield. The publisher’s page accurately calls the collection “a masterpiece of twentieth-century literature”; Solzhenitsyn himself famously wrote “Shalamov’s experience in the camps was longer and more bitter than my own…I respectfully confess that to him and not me it was given to touch those depths of bestiality and despair toward which life in the camps dragged us all.” If you have any interest in the Gulag, in how people react to the extremes of experience, or simply in great writing, you should try Shalamov. This volume is part of the first complete edition in English: “This is the first of two volumes (the second to appear in 2019) that together will constitute the first complete English translation of Shalamov’s stories and the only one to be based on the authorized Russian text.” In short, this is a must-read.

Having said that, I’ll discuss the translation, comparing it both to the original and to the previous English version translated by John Glad (Penguin 1980, rev. ed. 1994). The first thing to note is that the Penguin has far fewer stories, and the stories are often cut; whether the cuts are the translator’s choice or reflect the then available Russian text, I don’t know, but it’s an unfortunate fact that emphasizes even more the importance of the new version. Furthermore, Glad makes too many errors and poor choices; in Сухим пайком [“Dry Rations” in Glad, “Field Rations” in Rayfield], for instance, he calls Butyrka “Butyr Prison,” has “fruits” for ягоды ‘berries,’ and renders положительному ‘positive’ as “decent.” When the narrator’s coworker Savelev starts fantasizing about a future neither of them believes in, he begins “Помечтаем,” which Rayfield correctly renders “let’s dream a bit”; Glad has “Just imagine,” which simply doesn’t work in this context. I don’t want to overemphasize this — every translation has errors, and I don’t understand Rayfield’s “standard” for заветный in “заветный мешочек,” which Glad renders “a small cherished bag” — but Glad seems to have more than the necessary minimum.

Losing Your Language.

A very interesting BBC Future piece on language loss by Sophie Hardach:

“The minute you start learning another language, the two systems start to compete with each other,” says Monika Schmid, a linguist at the University of Essex.

Schmid is a leading researcher of language attrition, a growing field of research that looks at what makes us lose our mother tongue. In children, the phenomenon is somewhat easier to explain since their brains are generally more flexible and adaptable. Until the age of about 12, a person’s language skills are relatively vulnerable to change. Studies on international adoptees have found that even nine-year-olds can almost completely forget their first language when they are removed from their country of birth.

But in adults, the first language is unlikely to disappear entirely except in extreme circumstances.

For example, Schmid analysed the German of elderly German-Jewish wartime refugees in the UK and the US. The main factor that influenced their language skills wasn’t how long they had been abroad or how old they were when they left. It was how much trauma they had experienced as victims of Nazi persecution. Those who left Germany in the early days of the regime, before the worst atrocities, tended to speak better German – despite having been abroad the longest. Those who left later, after the 1938 pogrom known as Reichskristallnacht, tended to speak German with difficulty or not at all. […]

Such dramatic loss is an exception. In most migrants, the native language more or less coexists with the new language. How well that first language is maintained has a lot to do with innate talent: people who are generally good at languages tend to be better at preserving their mother tongue, regardless of how long they have been away.

But native fluency is also strongly linked to how we manage the different languages in our brain. “The fundamental difference between a monolingual and bilingual brain is that when you become bilingual, you have to add some kind of control module that allows you to switch,” Schmid says. […]

Mingling with other native speakers actually can make things worse, since there’s little incentive to stick to one language if you know that both will be understood. The result is often a linguistic hybrid.

There’s lots more good stuff there (like Cubans in Miami starting to speak more like Colombians or Mexicans), and I had personal experience with some of it, like losing my early Japanese when we left the country (though fortunately not with traumatic loss). Thanks, Trevor!

Du Fu and the Old Man of Emei.

Anatoly Vorobey posted a poem by Du Fu (aka Tu Fu), 漫成二首, in the original and five translations, one in Russian (excellent, by Alexander Gitovich) and four in English. I love that sort of comparison; here are my two favorites among the English renderings:

On the Spur of the Moment, II

River slopes, already midmonth of spring;
under the blossoms, bright mornings again.
I look up, eager to watch the birds;
turn my head, answering what I took for a call.
Reading books, I skip the hard parts;
faced with wine, I keep my cup filled.
These days I’ve gotten to know the old man of Emei.
He understands this idleness that is my true nature.

(translated by Burton Watson, The Selected Poems of Du Fu, 2002)

Haphazard Compositions, II

On the river floodplain it is already mid-spring,
under the flowers once again a clear morning.
I raise my face, avid to watch the birds,
I turn my head, mistakenly to answer someone.
When I read, I pass over the hard words,
with ale before me, full pots are frequent.
Recently I’ve gotten to know an old fellow from Emei,
he understands that my indolence is my true nature.

(translated by Stephen Owen, The Poetry of Du Fu, 2016)

The last translation, by Edna Worthley Underwood and Chi Hwang Chu, is horrible; for one thing, Emei (aka O Mei) is one of the Four Sacred Mountains of China (and a favorite name for Sichuanese restaurants), not the name of a hermit, for crying out loud.

Alien Linguistics.

Davide Castelvecchi writes for Nature about a subject dear to the heart of this old science fiction fan, what is sometimes called xenolinguistics:

Sheri Wells-Jensen is fascinated by languages no one has ever heard — those that might be spoken by aliens. Last week, the linguist co-hosted a day-long workshop on this field of research, which sits at the boundary of astrobiology and linguistics.

The meeting, at a conference of the US National Space Society in Los Angeles, California, was organized by Messaging Extraterrestrial Intelligence (METI). METI, which is funded by private donors, organizes the transmission of messages to other star systems. The effort is complementary to SETI (Search for Extraterrestrial Intelligence), which aims to detect messages from alien civilizations.

There follows an interview with Wells-Jensen, which of course touches on the recent movie Arrival (see this LH post), but I want to feature this interesting exchange:

What was discussed at the conference?

The piece that was underpinning everything there, I think, was the extent to which human language is innate. If language has a necessary innate piece, then two civilizations might have a good chance of understanding each other: that was the Chomskian approach represented in some of the papers presented at the conference. Others expressed the sense that third factors — body shape, what your planet is like — would have more to do with language and that little has to be innate. If that is so, then we’d have a better chance of understanding aliens that are similar to us than of understanding those that aren’t.

Needless to say, I am firmly on the anti-Chomsky side. Thanks, Trevor!