Forensic Linguists in the News.

The Dial (the new “online magazine of culture, politics and ideas,” not the Transcendentalist/modernist predecessor that published Yeats and Eliot) has a Language issue with a number of interesting items, of which I will feature Julia Webster Ayuso’s Can a Comma Solve a Crime?: “How forensic linguists use grammar, syntax and vocabulary to help crack cold cases.” After introducing us to “France’s best-known unsolved murder case,” that of four-year-old Grégory Villemin, Webster Ayuso goes into the history of her topic:

According to forensic linguists, we all use language in a uniquely identifiable way that can be as incriminating as a fingerprint. The word “forensic” may suggest a scientist in a protective suit inspecting a crime scene for drops of blood. But a forensic linguist has more in common with Sherlock Ho[l]mes in “A Scandal in Bohemia.” “The man who wrote the note is a German. Do you note the peculiar construction of the sentence?” the detective asks in the 1891 short story. “A Frenchman or a Russian could not have written that. It is the German who is so uncourteous to his verbs.”

The term “forensic linguistics” was likely coined in the 1960s by Jan Svartvik, a Swedish linguist who re-examined the controversial case of Timothy John Evans, a Welshman who was wrongfully accused of murdering his wife and daughter and was convicted and hanged in 1950. Svartvik found that it was unlikely that Evans, who was illiterate, had written the most damning parts of his confession, which had been transcribed by police and likely tampered with. The real murderer was the Evans’ downstairs neighbor, who turned out to be a serial killer.

Today, the field is perhaps still best known for its role in solving the “Unabomber” case in the United States. […] While U.S. authorities hunted down the Unabomber, the field of forensic linguistics was developing in other countries. The University of Birmingham hosted the first British Seminar on Forensic Linguistics in 1992, bringing together academics from Australia, Brazil, Holland, Ukraine, Greece and Germany. Barcelona’s Pompeu Fabra University has had a forensic linguistics laboratory since 1993. But it wasn’t until the next decade that the field became more structured, with the creation of university research teams, master’s degrees and government-funded police laboratories and agencies.

“It’s still emerging in places outside where it initially started, but it is growing gradually as people are getting trained,” said Nicci MacLeod, a senior lecturer at the Aston Institute of Forensic Linguistics in Birmingham, England, which was established in 2019.

She goes on to discuss authorship attribution (“identifying the author of a given text and, in some cases, shedding light on long-standing literary mysteries”); I like this example:

The computational linguists Florian Cafiero and Jean-Baptiste Camps have done similar work in France. For over a century, scholars had argued that Molière could not have written some of his best work due to his lack of education, suggesting instead that his plays had been ghostwritten by the poet Pierre Corneille. The academics were able to disprove this theory by looking at language, rhyme, grammar and word forms. This established a “clear-cut separation” between plays penned by Molière and works by Corneille.

“You thought that Molière wrote his plays? Well… yes, he did,” Cafiero said with a chuckle, when we met at his offices at the École Nationale des Chartes in Paris, a university specialized in historical sciences, to discuss his work. “It was the total opposite of a scoop.”

I can’t stand those snobs who think only people with an elite education can produce great art! (We talked about forensic linguists earlier this year.)

Among the other pieces in the issue are Yásnaya Elena A. Gil’s To Read (translated from Spanish by Ellen Jones — I posted about Gil here), Ross Perlin’s AI Won’t Protect Endangered Languages (we talked about Perlin just the other day), Winthrop Rodgers’ Saving Yazidi Song, and Teresa Grøtan’s Death in Grammar (translated from Norwegian by Caroline Waight [pron. “wight”]), from which I extract this delightful paragraph about Iglesia Maradoniana (the Church of Maradona), which she encountered during a visit to Argentina to improve her Spanish:

In this particular religious community, which today has hundreds of thousands of members, Maradona’s autobiography is the Bible. Their calendar begins with the year of Maradona’s birth, and their Lord’s Prayer starts as follows: “Dear God, who art in football stadiums, hallowed be thy left foot.” The Credo? “I believe in Diego, almighty footballer, creator of magic and passion.” The first of the Ten Commandments? “The ball shall never be sullied.” The firstborn son in each family is christened Diego, and all members take Diego as a middle name. Believers swear on the ball, couples get married at football stadiums — and in this ministry, it was of course the hand of God, not Maradona’s, that scored the notorious goal when Argentina defeated England 1–0 at the World Cup in 1986.

Thanks, Nick!

Comments

  1. David Eddyshaw says

    The Perlin article is spot on.

    “AI” is a classic bubble. Bubbles can do a lot of damage, both in the inflation and the bursting stage. Cory Doctorow has an interesting recent article on whether this one will leave anything worth much after it bursts (the dotcom bubble did, so it can happen).

    More and more I incline to the view that the loudest voices pushing APEs are deliberate fraudsters rather than self-deluded dupes. They’ve recently shown us their anti-democratic political objectives all too clearly, and poisoning the wells of accurate information, and undermining employment by epic plagiarism of the very same employees, neatly align with those objectives.

  2. David Eddyshaw says

    “AI” = stochastic parrot

    “LLM” = very large-scale automated plagiarism engine

    “Hallucination” = essential nature of LLMs/APEs showing through the hype and the disguising kludges

    “Training data” = original human-created material to be plagiarised

    As Doctorow points out, the hype is directly damaging in itself. “AI” does not need to be actually able to do your job adequately for you to end up unemployed. All that is necessary for those seeking to cash in on the bubble is to convince your boss that it can. That’s where the hype comes in.

  3. More and more I incline to the view that the loudest voices pushing APEs are deliberate fraudsters rather than self-deluded dupes.

    Oh, absolutely. The last decade or so has been full of object lessons in the readiness of dupes to be duped by fraudsters. Melville knew all about it.

  4. David Eddyshaw says

    From the Perlin article:

    Where AI promises magic, the most pressing need is for basic research, driven by communities. In-depth language documentation is difficult and costly, entailing years of work spent finding, getting to know and recording a range of speakers who can showcase as naturally as possible all the things a language can do. Properly probing a single, subtle element of grammar, like the use of tone or the way clauses are chained together, can be a serious accomplishment, not to mention the unsung arts of lexicography, transcription and archiving. When it comes to developing a language for modern life — beyond the daily oral use of its speakers — such steps cannot be skipped.

    We had a representative of such a magic-AI organisation indignantly defending the bona fides of his organisation on this very blog not too long ago.
