Forensic Linguists in the News.

The Dial (the new “online magazine of culture, politics and ideas,” not the Transcendentalist/modernist predecessor that published Yeats and Eliot) has a Language issue with a number of interesting items, of which I will feature Julia Webster Ayuso’s Can a Comma Solve a Crime?: “How forensic linguists use grammar, syntax and vocabulary to help crack cold cases.” After introducing us to “France’s best-known unsolved murder case,” that of four-year-old Grégory Villemin, Webster Ayuso goes into the history of her topic:

According to forensic linguists, we all use language in a uniquely identifiable way that can be as incriminating as a fingerprint. The word “forensic” may suggest a scientist in a protective suit inspecting a crime scene for drops of blood. But a forensic linguist has more in common with Sherlock Ho[l]mes in “A Scandal in Bohemia.” “The man who wrote the note is a German. Do you note the peculiar construction of the sentence?” the detective asks in the 1891 short story. “A Frenchman or a Russian could not have written that. It is the German who is so uncourteous to his verbs.”

The term “forensic linguistics” was likely coined in the 1960s by Jan Svartvik, a Swedish linguist who re-examined the controversial case of Timothy John Evans, a Welshman who was wrongfully accused of murdering his wife and daughter and was convicted and hanged in 1950. Svartvik found that it was unlikely that Evans, who was illiterate, had written the most damning parts of his confession, which had been transcribed by police and likely tampered with. The real murderer was the Evans’ downstairs neighbor, who turned out to be a serial killer.

Today, the field is perhaps still best known for its role in solving the “Unabomber” case in the United States. […] While U.S. authorities hunted down the Unabomber, the field of forensic linguistics was developing in other countries. The University of Birmingham hosted the first British Seminar on Forensic Linguistics in 1992, bringing together academics from Australia, Brazil, Holland, Ukraine, Greece and Germany. Barcelona’s Pompeu Fabra University has had a forensic linguistics laboratory since 1993. But it wasn’t until the next decade that the field became more structured, with the creation of university research teams, master’s degrees and government-funded police laboratories and agencies.

“It’s still emerging in places outside where it initially started, but it is growing gradually as people are getting trained,” said Nicci MacLeod, a senior lecturer at the Aston Institute of Forensic Linguistics in Birmingham, England, which was established in 2019.

She goes on to discuss authorship attribution (“identifying the author of a given text and, in some cases, shedding light on long-standing literary mysteries”); I like this example:

The computational linguists Florian Cafiero and Jean-Baptiste Camps have done similar work in France. For over a century, scholars had argued that Molière could not have written some of his best work due to his lack of education, suggesting instead that his plays had been ghostwritten by the poet Pierre Corneille. The academics were able to disprove this theory by looking at language, rhyme, grammar and word forms. This established a “clear-cut separation” between plays penned by Molière and works by Corneille.

“You thought that Molière wrote his plays? Well… yes, he did,” Cafiero said with a chuckle, when we met at his offices at the École Nationale des Chartes in Paris, a university specialized in historical sciences, to discuss his work. “It was the total opposite of a scoop.”

I can’t stand those snobs who think only people with an elite education can produce great art! (We talked about forensic linguists earlier this year.)

Among the other pieces in the issue are Yásnaya Elena A. Gil’s To Read (translated from Spanish by Ellen Jones — I posted about Gil here), Ross Perlin’s AI Won’t Protect Endangered Languages (we talked about Perlin just the other day), Winthrop Rodgers’ Saving Yazidi Song, and Teresa Grøtan’s Death in Grammar (translated from Norwegian by Caroline Waight [pron. “wight”]), from which I extract this delightful paragraph about Iglesia Maradoniana (the Church of Maradona), which she encountered during a visit to Argentina to improve her Spanish:

In this particular religious community, which today has hundreds of thousands of members, Maradona’s autobiography is the Bible. Their calendar begins with the year of Maradona’s birth, and their Lord’s Prayer starts as follows: “Dear God, who art in football stadiums, hallowed be thy left foot.” The Credo? “I believe in Diego, almighty footballer, creator of magic and passion.” The first of the Ten Commandments? “The ball shall never be sullied.” The firstborn son in each family is christened Diego, and all members take Diego as a middle name. Believers swear on the ball, couples get married at football stadiums — and in this ministry, it was of course the hand of God, not Maradona’s, that scored the notorious goal when Argentina defeated England 1–0 at the World Cup in 1986.

Thanks, Nick!

Comments

  1. David Eddyshaw says

    The Perlin article is spot on.

    “AI” is a classic bubble. Bubbles can do a lot of damage, both in the inflation and the bursting stage. Cory Doctorow has an interesting recent article on whether this one will leave anything worth much after it bursts (the dotcom bubble did, so it can happen.)

    More and more I incline to the view that the loudest voices pushing APEs are deliberate fraudsters rather than self-deluded dupes. They’ve recently shown us their anti-democratic political objectives all too clearly, and poisoning the wells of accurate information, and undermining employment by epic plagiarism of the very same employees, neatly align with those objectives.

  2. David Eddyshaw says

    “AI” = stochastic parrot

    “LLM” = very large-scale automated plagiarism engine

    “Hallucination” = essential nature of LLMs/APEs showing through the hype and the disguising kludges

    “Training data” = original human-created material to be plagiarised

    As Doctorow points out, the hype is directly damaging in itself. “AI” does not need to be actually able to do your job adequately for you to end up unemployed. All that is necessary for those seeking to cash in on the bubble is to convince your boss that it can. That’s where the hype comes in.

  3. More and more I incline to the view that the loudest voices pushing APEs are deliberate fraudsters rather than self-deluded dupes.

    Oh, absolutely. The last decade or so has been full of object lessons in the readiness of dupes to be duped by fraudsters. Melville knew all about it.

  4. David Eddyshaw says

    From the Perlin article:

    Where AI promises magic, the most pressing need is for basic research, driven by communities. In-depth language documentation is difficult and costly, entailing years of work spent finding, getting to know and recording a range of speakers who can showcase as naturally as possible all the things a language can do. Properly probing a single, subtle element of grammar, like the use of tone or the way clauses are chained together, can be a serious accomplishment, not to mention the unsung arts of lexicography, transcription and archiving. When it comes to developing a language for modern life — beyond the daily oral use of its speakers — such steps cannot be skipped.

    We had a representative of such a magic-AI organisation indignantly defending the bona fides of his organisation on this very blog not too long ago.

  5. Christopher Culver says

    Neither your quoted snippet nor the linked article as a whole defines “forensic linguistics” in the particular sense that I personally, at any rate, have heard it used in Europe in recent years: the job of scrutinizing statements made by asylum seekers in order to determine if they really are (on the basis of their dialect etc.) from the specific region they claim to be from.

  6. David Marjanović says

    Can a Comma Solve a Crime?

    “Commas are important people!”

    whether this one will leave anything worth much after it bursts

    “AI” more generally, yes – image recognition for all sorts of scientific purposes comes to mind as an application that already exists. LLMs – would surprise me.

  7. David Eddyshaw says

    Yes, by “AI”-in-scare-quotes I meant only APEs. Part of the odiousness of the hype is that the name Artificial Intelligence, which has included a lot of interesting and lastingly valuable work over many decades, has been impertinently applied to something which is No Such Bloody Thing as a pure marketing tactic.

    Some of the non-APE stuff is also being applied in ethically indefensible ways, of course. The apocalyptic bollocks about imminent singularities and the like is intended to distract us from the very real damage being inflicted already by things like intensive hyper-intrusive workplace surveillance tools. Imaginary bogeymen to stop us looking at the real villainy.

    And the failures to create anything like genuine intelligence are of no importance to the APE salesmen if the victims of their malpractices are powerless. It doesn’t matter to Amazon or Tesla at all if a diligent warehouse or factory worker or driver gets sacked by a crap algorithm, so long as the workers aren’t allowed to unionise and remain individually powerless.

  8. defending the bona fides of his organisation on this very blog not too long ago.

    Where? I don’t remember that.

  9. Trond Engen says

    David E.: “AI” is a classic bubble. Bubbles can do a lot of damage, both in the inflation and the bursting stage. Cory Doctorow has an interesting recent article on whether this one will leave anything worth much after it bursts (the dotcom bubble did, so it can happen.)

    Cory Doctorow: What Kind of Bubble is AI?, Locus, 18 Dec. 2023

    In other words, an AI-supported radiologist should spend exactly the same amount of time considering your X-ray, and then see if the AI agrees with their judgment, and, if not, they should take a closer look. AI should make radiology more expensive, in order to make it more accurate.

    But that’s not the AI business model. AI pitchmen are explicit on this score: The purpose of AI, the source of its value, is its capacity to increase productivity, which is to say, it should allow workers to do more, which will allow their bosses to fire some of them, or get each one to do more work in the same time, or both. The entire investor case for AI is “companies will buy our products so they can do more with less.” It’s not “business customers will buy our products so their products will cost more to make, but will be of higher quality.”

    AI companies are implicitly betting that their customers will buy AI for highly consequential automation, fire workers, and cause physical, mental and economic harm to their own customers as a result, somehow escaping liability for these harms. Early indicators are that this bet won’t pay off. Cruise, the “self-driving car” startup that was just forced to pull its cars off the streets of San Francisco, pays 1.5 staffers to supervise every car on the road. In other words, their AI replaces a single low-waged driver with 1.5 more expensive remote supervisors – and their cars still kill people.

    If Cruise is a bellwether for the future of the AI regulatory environment, then the pool of AI applications shrinks to a puddle. There just aren’t that many customers for a product that makes their own high-stakes projects better, but more expensive. There are many low-stakes applications – say, selling kids access to a cheap subscription that generates pictures of their RPG characters in action – but they don’t pay much. The universe of low-stakes, high-dollar applications for AI is so small that I can’t think of anything that belongs in it.

    Add up all the money that users with low-stakes/fault-tolerant applications are willing to pay; combine it with all the money that risk-tolerant, high-stakes users are willing to spend; add in all the money that high-stakes users who are willing to make their products more expensive in order to keep them running are willing to spend. If that all sums up to less than it takes to keep the servers running, to acquire, clean and label new data, and to process it into new models, then that’s it for the commercial Big AI sector.

    Just take one step back and look at the hype through this lens. All the big, exciting uses for AI are either low-dollar (helping kids cheat on their homework, generating stock art for bottom-feeding publications) or high-stakes and fault-intolerant (self-driving cars, radiology, hiring, etc.).

    David E.: Some of the non-APE stuff is also being applied in ethically indefensible ways, of course. The apocalyptic bollocks about imminent singularities and the like is intended to distract us from the very real damage being inflicted already by things like intensive hyper-intrusive workplace surveillance tools. Imaginary bogeymen to stop us looking at the real villainy.

    Doctorow misses the point that hiring or staff administration in general is a low-stakes game. There’s no particular downside (for the employer) to being a little wrong.

    Christopher C.: Neither your quoted snippet nor the linked article as a whole defines “forensic linguistics” in the particular sense that I personally, at any rate, have heard it used in Europe in recent years: the job of scrutinizing statements made by asylum seekers in order to determine if they really are (on the basis of their dialect etc.) from the specific region they claim to be from.

    The collocation made me realize that this kind of “linguistic vetting” might be a case for AI similar to HR. There’s no particular downside if AI systems get it wrong. No reason to stop at linguistics either. Do the language, networks, travel patterns, family relations, work histories, personal style, etc. of asylum seekers fit the profile of the ethnic, political or religious affiliation they claim?

  10. David Eddyshaw says

    Where? I don’t remember that

    https://languagehat.com/robotsmali/#comment-4591516

    rozele’s comment further on is very pertinent.

    The Robotsmali site has (or had) a link to a plug for an “AI” gadget purporting to help blind people, which could be used as an excellent aid for explaining to people how not to go about helping blind people in West Africa. (I used to work for an organisation that had put a great deal of effort into trying to find real effective and scalable answers to this question. I think this was simply arrogant ignorance of all relevant prior work on the part of Robotsmali rather than a cynical advertising gimmick. I’m always prepared to think the best of people.)

  11. David Eddyshaw says

    Doctorow misses the point that hiring or staff administration in general is a low-stakes game

    No, actually I stole this point from him without attribution. Sometimes, I regret to say, I behave no better than an LLM.

    (Perhaps a routine can be added to an LLM to make it repent its misdeeds?)

    Doctorow makes the point extensively that “AI” of this type actually does work in contexts where the victims have no effective defence against its abuse, so the costs are borne by others, rather than by the users.

    Where it is not economically viable is in the very contexts where it could have real benefits. His example is of a radiologist using it to help spot tumours. This can indeed be very valuable, but you still need the human expert to review the AI’s opinions, or you will get fatal avoidable errors. The use of AI really can improve overall quality, but it won’t let you make substantial savings by sacking human experts. That’s not a promising business model.

    Using AI to help crap management practices is also ultimately counterproductive, but that is not a problem for the bubble profiteer selling the AI.

  12. 1-0???? The game which is remembered for both the Hand of God goal and the Goal of the Century?!? Absolute heresy!

  13. J.W. Brewer says

    I don’t follow these things closely enough for my ignorance to be that significant, but I have never heard of the sort of “forensic linguistics” Christopher Culver mentions being used in a U.S. context. The U.S. like many European countries does have a problem with fraudulent asylum claims (which inter alia makes it more costly and challenging to identify bona fide asylum claims …), but I have not heard anecdotally that the genres of false claims common in the U.S. system are the sort that could be easily shown false by establishing that the claimant doesn’t speak the language variety that would fit the story.

  14. David Eddyshaw says

    Genuine expertise that could reliably achieve such a thing must be very rare, if it exists at all. How many people are available who can tell if a person is really a native speaker of Fur, say?

    However, I can readily see a potential “AI” market there. It perfectly fits the picture: it wouldn’t matter to the vendor or the purchaser that the claims made on behalf of the system were fraudulent, because the actual victims are powerless. And from the purchasers’ standpoint, the objective is not in fact to run the system justly, but to get votes by being brutal to refugees. And in this case, all this really can be achieved with real savings from sacking actual human experts, as quality in decision-making is irrelevant. The appearance of justice is enough.

    A benefit might be in undermining the job prospects of human “forensic linguists” who claim to have such expertise. But such people are likely to find alternative employment quite readily.

  15. Dmitry Pruss says

    Surprised that they didn’t cover “destructology”, a branch of linguistic analysis used in Russian courts to prove subversive nature of texts…

  16. Academics in Arabic linguistics often get approached to do this sort of ‘forensic linguistics’, as you might guess, so I’ve heard a bit about it here and there. My recollection is that the UK government’s preferred firms for doing this sort of thing might as well be this sort of AI. Actual experts sometimes justify doing it on the grounds that it can help strengthen genuine asylum claims. It never felt right to me, though. I’m simply not convinced linguistics can really establish which side of a line randomly drawn by some 20th century colonial official someone comes from.

  17. PlasticPaddy says

    @de
    I believe the most relevant checks would be the plagiarism checks, e.g., has the asylum seeker taken a prepared text and moved stuff around a bit, inserted his own name and dates, etc? I agree in many cases the human checker would be adequate to detect these problems without AI assistance, and might, if the asylum seeker is lucky, be able to show some humanity as well. See ‘l’histoire de Souleymane’ (sorry, should have said SPOILER ALERT–Hat, I think you might like this film and would at least enjoy the subtitled African language text).

  18. Neither your quoted snippet nor the linked article as a whole defines “forensic linguistics” in the particular sense that I personally, at any rate, have heard it used in Europe in recent years: the job of scrutinizing statements made by asylum seekers in order to determine if they really are (on the basis of their dialect etc.) from the specific region they claim to be from.

    That must be a Eurousage; I’ve never seen it in these parts.

    1-0???? The game which is remembered for both the Hand of God goal and the Goal of the Century?!? Absolute heresy!

    Quite right, and I should have caught that, since both goals are vivid in my mind. As penance, I’ll link to the Wikipedia article on the game (whose actual score was 2-1).

  19. @Lameen: I guess it might depend on how fine-grained a distinction is necessary. If the concern were that someone actually from e.g. Oman who couldn’t otherwise get a visa to live in the EU was falsely claiming to be Syrian and to have suffered violence and persecution amidst the civil war there I could see it working pretty effectively. If it’s supposed to detect more local variations, a good fraudulent claimant could have a facially-plausible story about how the initial stages of the Syrian civil war had created a lot of internal displacement and refugee flows such that he/she had relocated from the place he/she was born and raised to a place elsewhere in Syria where the dialect is notably different before the (falsely claimed) Really Bad Things that support the asylum claim supposedly occurred.

    Heck, while many seem confident that Our Lord and God and Savior must have spoken Aramaic with a stock Galilean accent, it’s not implausible that his earlier human life experiences amidst political disorder (not so much birth in Bethlehem as life for a while in diaspora in Egypt) might have influenced His Holy Idiolect.

  20. David Eddyshaw says

    Eurousage

    I initially read this as “eurosausage”, which seems to be a positive sort of concept. On the other hand, I suppose that Eurobismarck might point out that we no more wish to know how EU Directives are made than we wish to know the details of eurosausage-making.

  21. David Marjanović says

    All the big, exciting uses for AI are either low-dollar (helping kids cheat on their homework, generating stock art for bottom-feeding publications)

    Low-dollar perhaps, but high-ruble, at least now that the ruble is rubble. Check out the first 4:25 of this YouTube video.

  22. David Marjanović says

    Goal of the Century

    Bigger all around, but less consequential, than this in the same century.

  23. The FBI and ATF certainly tried forensic linguistic analysis on things written by the Unabomber, but I don’t think it actually helped in any way. Ted Kaczynski was caught when his family members recognized the style and contents of his manifesto.

  24. Bigger all around, but less consequential, than this in the same century.

    Spectacular, thanks! And I’m glad to have gotten a chance to experience what is apparently extremely famous commentary:

    „Und jetzt kann Sara sich noch einen aussichtslos scheinenden Ball einholen, Pass nach links herüber, es gibt Beifall für ihn, da kommt Krankl, vorbei diesmal an seinem […] Bewacher, ist im Strafraum – Schuss … Tooor, Tooor, Tooor, Tooor, Tooor, Tooor! I wer’ narrisch! Krankl schießt ein – 3:2 für Österreich! Meine Damen und Herren, wir fallen uns um den Hals; der Kollege Riepl, der Diplom-Ingenieur Posch – wir busseln uns ab. 3:2 für Österreich durch ein großartiges Tor unseres Krankl. Er hat olles überspielt, meine Damen und Herren. Und warten S’ noch a bisserl, warten S’ no a bisserl; dann können wir uns vielleicht ein Vierterl genehmigen. Also das, das musst miterlebt haben. Jetz bin i aufgstanden, alle Südamerikaner mit ihren Torrufen. I glaub jetzt hammas gschlagn!“

  25. For over a century, scholars had argued that Molière could not have written some of his best work due to his lack of education, suggesting instead that his plays had been ghostwritten by the poet Pierre Corneille. The academics were able to disprove this theory by looking at language, rhyme, grammar and word forms. This established a “clear-cut separation” between plays penned by Molière and works by Corneille.

    I didn’t know that anyone had suggested that Corneille Baconed Molière, or that anyone took the suggestion seriously enough to try to test it, but how valid is the method? Has anyone tried it on writers who have deliberately disguised or just changed their style, which would be the case here? If anyone does, I nominate Gene Wolfe and Steven Brust. Oh, yeah, and Joyce.

  26. Doctorow had me at self driving cars. People are so bad at driving that they should be replaced at this task ASAP. Throw the horse-and-buggy crowd from the steam engine of modernity.

  27. What’s an APE?

  28. The interesting thing to me is how good people* are at driving cars when they’re not drunk. The hourly** casualty charts are incredible. If we could just replace the drunk drivers with self-driving technology

    Editing to retract a bit. I wonder if the chart I’d seen several years back was something more specific like fatal crashes in Chicago, where daytime traffic makes it difficult to attain fatal speeds. The national data isn’t as compelling, with midnight to 3:00 am accounting for fewer fatalities than other 3-hour intervals. Though if you consider the numbers of drivers in those hours, it’s still pretty suggestive.

    * From the term “people” I exclude humans under 25 years old, for reasons I think most of us people will agree with. Though even there I suspect much of my prejudice relates to a greater willingness to drive under the influence.

    ** It’s also possible that some of the hourly variance involves people driving after their normal waking hours, which is hard to disambiguate.

  29. Lars Mathiesen (he/him/his) says

    Under-25s don’t have their risk aversion tuned correctly, so both DUI and speeding are more likely. Over here they are opening up drivers’ licenses for 16-year-olds so the regions can save on busses, but only from like 0500 to 1900. (Get to high school and back, or your apprentice job). Let’s see how that goes.

  30. Stu Clayton says

    What’s an APE?

    Automatic Plagiarism Engine. This disobliging term of art was brought to you by DE, whose fulminations against APEs float my boat.

  31. Lars Mathiesen (he/him/his) says

    It’s a nice initialism if read as a real word, too. I was in fact able to retrieve the term from long term memory. An earlier thread here, I’m sure.

    (I did work with some machine learning guys a few years back. They actually increased productivity because instead of like 4% of a random sample of “negative VAT” submissions being fraudulent, it was about 20% when they let a well-trained model pick them [not just a random LLM]. Same number of case workers, 5 times the VAT clawed back. But of course there’s the risk that somebody out there learns to game the model and never get audited. I think they kept some fraction of random picks to combat that).

  32. J.W. Brewer says

    Meanwhile the world of elderly-punk-rock-band reunions has been roiled by an AI controversy involving an alleged plan to somehow electronically modify vocals recorded by a new singer so they sound like the singing voice of the original lead singer who died in 1990. I guess it’s arguably not a “deepfake” if it’s properly authorized by whoever seems to own the relevant rights? We don’t yet know if this is actually happening and if so how convincing-sounding the result will be. https://www.loudersound.com/news/dead-boys-jake-hout-ai

  33. I’m simply not convinced linguistics can really establish which side of a line randomly drawn by some 20th century colonial official someone comes from.

    i’m very much with Lameen here. any such judgement, no matter how expert the linguist involved, is going to be based on stereotypes and completely unjustified fantasies about the neatness and impermeable-borderedness of the world. i believe that to think otherwise is to either not notice, or to pretend not to, the complexities of the actual speaking going on in one’s own environment. this is especially blatant in the u.s., where the equivalent judgements about topolects would be on the order of trying to tell whether someone is arriving in los angeles from texas or tennessee – which is absurd even for cradle-tongue anglophones (who might, like friends of mine from one or the other place, sound stereotypically like their bostonian & new yorker parents, their alabaman grandparents, or the washington dc suburb where their t*an parents raised them). and that’s without even dealing with how things get actually weird, which only starts with things like my sister’s layering of atlantic-canadian and yiddish-inflected new yorker, with a tinge of cantabridgian, and studded with incidental shetlandisms (i pity the forensic linguist who tries to figure out where she’s from or where she lives from her idiolect).

  34. @J.W.Brewer: I don’t anticipate any economically motivated asylum claims from Oman any time soon – certainly not to the UK! – but if, say, a Tunisian were to pretend to be Syrian for fraudulent asylum purposes (that one does happen), the odds are that they would make some effort to fake a Syrian accent too. If they can’t even manage that, they’re unlikely to be able to come up with a story convincing enough to get to the stage of calling in a linguist. Now that linguist will probably spot that their fake accent is a bit iffy. But can they be sure this is because the guy’s Tunisian, or is it because he grew up speaking some embarrassing rural dialect that happens to have evaded Behnstedt and Woidich’s vigilance?

    If they speak, say, Western Neo-Aramaic, you’d think that would be enough to make their case ironclad. But even then there are almost certainly a few speakers of it with passports from some Latin American country, or from Lebanon or something. People get around.

    As for Jesus, peace be upon him, I imagine the primary linguistic impact of a stay in Egypt at that period would have been the opportunity to learn a little Greek; certainly there was no shortage of Greek-speaking Jews there at the time. But of course there are plenty of possibilities. I wonder if any Coptic authors have put together an argument that he learned Coptic there?

  35. I think the APEs are here to stay, in that case (and selfishly I hope they are, since as I’ve said here before, they make my day job a lot easier). The trick is to ignore the hot air about singularities and sentience and use them, like any other tool, only for things they’re actually useful for.

  36. In this context, stylistic studies of Shakespeare compared to Bacon et al. are worth recalling.
    The Dial article on the Borges gatekeeper, Maria Kodama, is one I commend.

  37. David Eddyshaw says

    I’d come across her in the context of her treatment of Norman Thomas di Giovanni, a much superior translator of Borges into English to Andrew Hurley.

  38. Christopher Culver says

    I really don’t understand Rozele’s scepticism about the efficacy of forensic linguistics in asylum claims. I have heard through an expert friend of frequent cases where an Iranian person was pretending to be an Afghan Dari speaker, and the jig was up after just a few minutes.

  39. I imagine that sometimes it works and sometimes it doesn’t.

  40. William Labov, (relatively) famously, helped acquit a suspect based on his accent. The man was accused of phoning in a bomb threat; Labov, called in as an expert witness, demonstrated that the phone call came from a NYC-accented speaker, while the suspect’s accent was eastern New England.

  41. There’s an official APE out there: Automatic Prompt Engineering. Since the output of LLMs varies based on the phrasing of the query, APEs use machine learning to optimize queries.

    Likewise, two drunks leaning on each other are less likely to fall down.

  42. David Marjanović says

    the opportunity to learn a little Greek

    Heavily implied by Mark 7:26–29.

  43. David Eddyshaw says

    I think Ἑλληνίς there just means that she was goyish. The text immediately goes on to say that she was “Syrophoenician by race”, whatever that means.

    The Peshitta actually says khanftha “pagan, gentile”, for Ἑλληνίς here, so that’s how some Aramaic speakers understood it, anyway.

  44. David Eddyshaw says

    D H Lawrence’s extremely silly story The Man Who Died makes quite a point of Jesus’ not knowing Greek, but I don’t think the work is regarded as canonical by the mainstream Christian churches.

    DHL should have stuck to his eddishes.

  45. @CC: language is not usually at the heart of an asylum claim: claiming to be a dari-speaker as opposed to a farsi-speaker makes no sense unless the hoops you’re having to jump through are already based on arbitrary state decisions about who counts as in need of asylum – whether dari vs pashto speakers from afghanistan, or afghan people vs iranians – that are usually motivated by state interests that have nothing to do with people’s material safety or need. if linguists are helping enforce those arbitrary decisions, it doesn’t matter whether they’re accurate in their assessment of people’s lects – they’re hindering people’s right to asylum.

    but my point was about the (marginally) subtler aspects: there are, after all, farsi speakers from afghanistan, and using farsi vs dari as a proxy for a person’s origin is an additional layer of garbage on top of the basic problem of treating asylum processes as anything but a venue for understanding why people feel themselves to be unsafe where they’re arriving from, wherever that is, and whoever they are, without first disqualifying anyone who doesn’t fit the moment’s Staatsräson.

  46. J.W. Brewer says

    Why would one assume that the Syrophoenician Woman *didn’t* speak some variety of Aramaic, whether or not she also knew Greek?

    Whether e.g. Pontius Pilate knew Aramaic, and if not what common language he and Jesus conversed in, would seem more interesting questions. (There’s also e.g. the centurion whose servant got healed, but the Gospels at least superficially disagree as to whether he spoke to Jesus directly or via intermediaries.)

  47. And Syrophoenician = men puniqi d-suriya, “from Phoenicia of Syria” (or is it “from Phoenicians of Syria”?) Interesting that Syriac has the same oi to u as Latin there.

  48. David Eddyshaw says

    One of the few things known about Pilate from outside the New Testament seems to be that he was governor of Judaea for a decade, so he presumably at least had time to learn Aramaic.

    Whether a well-connected Roman eques with (presumably) a large entourage of gofers and interpreters would have seen any actual necessity to learn it is another matter. Though the Romans do seem to have been more enterprising about learning barbarian languages than the Greeks. Perhaps it helped that posh Romans were pretty much bilingual anyway.

    And he is represented as being pretty hands-on (so to speak) in his dealings with the natives, rather than leading from the back. Tiberius (no fool when not cavorting on Capri, even if a touch paranoid latterly) presumably thought he was competent enough to leave in post for ten years in a notorious trouble-spot.

  49. David Eddyshaw says

    John’s Gospel, of course, implies that Pilate could actually write Aramaic; though despite ὃ γέγραφα, γέγραφα, I can’t really imagine a Roman governor writing the inscription with his own hand in such a case. There’s hands-on, and then too hands-on. I think this is meant to be more of a case like Hadrian building a wall across Britain.

    Aramaic was of course no mere local patois. It had been the official language of the Western part of the Persian empire.

  50. Syrophoenician = men puniqi d-suriya…

    What an interesting point! I wonder if the waw in ܦܘܢܝܩܐ represents the /y/ of Greek at the time, the outcome of the merger of υ and οι. A spelling ⟨pywnyqy⟩ is also apparently attested in other Syriac texts.

  51. An early use of forensic linguistics for the screening of refugees is the Shibboleth episode.

  52. In the 1885 Schaff translation of Tatian’s Diatessaron (§20), the woman is identified as “Canaanitish” and “a Gentile of Emesa of Syria”. I haven’t taken the time to track down the source of that, perhaps the Arabic version.

  53. “Canaanitish”

    The woman is γυνὴ Χαναναία in Matthew 15:22.

  54. The best case of forensic linguistics is Plot It Yourself, where Nero Wolfe finds a reverse-plagiarist by analyzing paragraph breaks. I don’t know whether Rex Stout found it even remotely plausible or was just having a laugh. Needless to say, the book is highly enjoyable.

    I have read an ancient book about stylometry by some English author who, in addition to usual application to unknown or disputed authorship, discussed a legal application too. In late Pleistocene, the confession in a criminal case in England and Wales was admissible if it was written in the defendant’s own words (crime shows taught me that now they moved to Pliocene and use a tape recorder), but not necessarily in the defendant’s own hand. The trick was to show that there was no way a defendant could possibly speak in a variety of language recorded by the police.

  55. fwiw
    a) There are arguments, sometimes heated, about how long Pilate, based in Caesarea Maritima, served. Daniel R. Schwartz of Hebrew U. argued he began earlier than most think. Others put the crucifixion later, in 36.

    b) Philo of Alexandria, Every Good Man is Free xii. 75 (LCL): “Palestinian Syria, too, has not failed to produce high moral excellence”–namely, Essenes. I have speculated that Philo’s source here might have been, not Jewish, but Posidonius or Strabo. (J. of Jewish Studies 45, 1994, 295-8.)

  56. David Marjanović says

    The Pliocene was before the Holocene. International Stratigraphic Chart in numerous languages.

  57. If I knew about the Hadean, I’d forgotten. I like the name Late Heavy Bombardment.

  58. David Marjanović says

    …the Pliocene was also before the Pleistocene; that was the point I was trying to make. 🙁

  59. They all occurred long after the Plasticine Era – of Terry Pratchett fame.

    (I think that “Plasticine” is a UK trademark. Basically the same as PlayDough, but supposedly ‘more professional’.)

  60. Plastilina, originally a trademarked brand name, is the common term in Spanish for oil-based modelling clays

  61. David Marjanović says

    Plastilin (final stress, neuter) over here.

  62. David Eddyshaw says

    Plasticine porters have looking-glass ties.

  63. >large entourage of gofers

    First read as large entourage of golfers, which seemed anachronistic yet vividly a propos.

  64. @languagehat: The Late Heavy Bombardment probably happened, but the details of how long it lasted and how much it prolonged the Hadean are still very unclear.

    @Popup: I think Plasticine is a trademark in America too. I remember it was the specific choice of material used by Will Vinton’s Claymation ™ stop motion animation studio (best known for the California Raisins, who started as commercial mascots but became something more). Interestingly Vinton’s longtime right-hand producer was named David Altschul. However, even though he was from Chicago, he came from the only Altschul family in the 1960s Chicagoland white pages that wasn’t related to mine.

  65. DM, to be completely truthful, I made a mistake, I didn’t want to imply that the UK is moving backwards in time. But now that it looks like it…

  66. (I think that “Plasticine” is a UK trademark. Basically the same as PlayDough, but supposedly ‘more professional’.)

    According to Wikipedia, Play-Doh is flour and water with some other stuff, so it hardens when it dries, whereas Plasticine is gypsum and oils, so it doesn’t dry out.

  67. David Eddyshaw says

    I can vouch for this from personal experience.

  68. J.W. Brewer says

    In a hilarious recent physician-heal-thyself development, a purported academic expert on “misinformation” submitted a declaration (made under penalty of perjury) in federal court in support of a supposedly anti-misinformation statute being challenged on free-speech grounds, but then it transpired that a few of the recent scholarly articles the witness cited to bolster his expert opinion did not actually exist but were themselves AI hallucinations and perhaps even “misinformation.” He eventually submitted a no-doubt-awkward-to-write follow-up declaration admitting the mistake while contending that it did not undermine the substance of his expert opinion.* Potential “forensic linguists” take note!

    https://reason.com/volokh/2024/11/27/acknowledgment-of-ai-hallucinations-in-ai-misinformation-experts-declaration-in-ai-misinformation-case/

    *Which may be entirely fair if you accept that it is common for most of the contents of a given scholarly article’s bibliography to be there essentially for decorative purposes as a genre convention and not because the author of the article has actually read the substance of the cited sources and been actually influenced thereby.

  69. I see, weirdly, that the blog software converted my “(tm)” into a special character with a superscript. It would not annoy me so much had it not also inserted a gratuitous space before it.

    Play-doh,* the proper brand-named product, does not dry out (at least, not for a long time) if it is kept in a sealed container. It was possible to create a cheap homemade imitation with flour, water, Elmer’s glue, and food coloring, but that dried out a lot faster than the commercial product. My beloved AP U. S. History teacher, Jim Nicholson, occasionally supplied us with the homemade version, which his wife (who taught second grade) manufactured in substantial quantities. The use of it that I particularly remember was Tim Cloran creating a five-inch Uncle Sam to be shown walking the ups and downs of American history. Every couple weeks, we would produce a new set of posters to decorate the classroom (that was educational practice in the early 1990s) leading to such memorable descriptions as “nothing and a whole lot of pineapples” (for American imperialism in Hawaii). Mr. Nicholson left the Play-doh Uncle Sam up indefinitely, however, in honor of its impressive artistry, until one Monday we came back to find Sam had desiccated, fallen off the butcher paper, and shattered on the classroom floor.

    * Not gonna recognize the trademark here, due to the above.

  70. David Eddyshaw says

    The only two doctors I’ve known who put in a lot of time as “expert witnesses” were not people I would have gone to for actual medical care in their own specialties.

    (This is not necessarily damning: being a competent expert witness involves acquiring a good many skills which are not relevant to clinical practice, and it’s very time-consuming if you do it seriously. In the UK, many actually take this up after having retired from clinical practice. However, I strongly suspect that juries are generally under the unfortunate misapprehension that an “expert witness” is the same thing as “an expert in their field.” This is most definitely not the case.)

  71. David Marjanović says

    de.wikipedia equates Plastilin with plasticine. Surprises me – it feels much lighter than plaster.

    Also, yeah, don’t cite papers you haven’t read. There are bound to be surprises in there.

  72. scholarly article’s bibliography … not because the author of the article has actually read the substance of the cited sources

    Indeed. Only yesterday in another place, there was the spectacle of a senior Professor writing on a celebrated topic named for another Professor (ex-)Emeritus. Properly cited in the biblio, but no actual quotes from any of the PE’s (many) publications. Turns out the sP had gotten it all backwards because relying on wikip.

    That other place often fulminates (and quite right too) about how journalists/gossip columnists/dilettantes feel they can just weigh in on Linguistics topics because we all speak language, innit? I’m tempted to make similar fulminations on behalf of the Amalgamated Union of Philosophers, Sages, Luminaries and other Professional Thinking Persons. But I know it’ll do me no good.

  73. David Eddyshaw says

    Too bloody right.

    (In particular, I have long since grown very tired of neuroscientists who imagine that they have Solved Philosophy because it’s all just brain physiology, innit.)

  74. I’m tempted to make similar fulminations on behalf of the Amalgamated Union of Philosophers, Sages, Luminaries and other Professional Thinking Persons. But I know it’ll do me no good.

    Well, it might be good for your mental health. And I’m qualified to suggest that because I have a mind.

  75. Stu Clayton says

    I have half a mind to dismiss the entire mind business. Nobody should mind.

    The noun “mind” triggers a farrago of history-of-philosophy associations in the learnèd. In your average Joe it does no such thing. It’s one of those words whose primary function is to mark affiliation with some vaguely identifiable social group, just as do “motherfucker”, “snowflake” and “fricative”. Then secondary functions come into play as the conversation continues, or doesn’t.

  76. Lars Mathiesen (he/him/his) says

    So what’s the difference (if any) between ‘mind’ and ‘consciousness’? I work on the theory that all individuals of the species Homo Sapiens are conscious and deserving of the privileges set out in the Declaration of Human Rights, politics and philosophers be damned.

  77. Stu Clayton says

    So what’s the difference (if any) between ‘mind’ and ‘consciousness’?

    Ha ha, you can’t catch me with that meta-semantic tar-baby !

    The words ‘consciousness’ and ‘privilege’ are absent from the Declaration of Human Rights. The word ‘mind’ occurs there once, in the fixed expression ‘keeping this Declaration constantly in mind’, a trite implorement neither philosophical, political nor poetical.

    Not even the United Nations needs The Concept of Mind.

  78. Lars Mathiesen (he/him/his) says

    The UN takes it as self-evident what a human is, but some politicians take it upon themselves to narrow the definition. Not a very human thing to do, if you ask me. (Or too human, maybe, but I don’t like it). And that’s when being recognized as a human being becomes a privilege. The bit about consciousness is my own attempt at supplying the missing definition.

  79. Stu Clayton says

    I’m surprised that you seem so defensive about what counts as “human”. You must move in unusually contentious circles. My impression is that people nowadays prefer to bicker over whether animals count as “sentient”.

    This playing around with vague words does not engage my attention. The issues that get to me are at the everyday level of acquired, habitual behavior – people who don’t want to be treated in the ways they treat some other people and animals. I’m with Lecter on this one. Dr. Lecter verhält sich im ersten Jahr seiner Haft sehr zuvorkommend, sodass seine Sicherheitsmaßnahmen gelockert werden.

  80. I’m tempted to make similar fulminations on behalf of the Amalgamated Union of Philosophers, Sages, Luminaries and other Professional Thinking Persons.

    I wonder what the AUPSLPTP makes of Derek Parfit. (Parfit “would peddle in the nude on his exercise bike, reading philosophy; he would brush his teeth for hours, reading philosophy; every day he would eat the same breakfast of muesli and yoghurt and the same dinner of carrots, cheese, lettuce, and celery, to maximise his time for philosophy; he would make coffee using water straight out of the tap, to maximise his time for philosophy; he would take a mixture of vodka and pills every night to help him sleep, since he couldn’t stop thinking about philosophy.”)

  81. As the Andrews Sisters put it:

    #
    Mr. Whatchacallim, whatcha doin’ tonight?
    Hope you’re in the nude, because I’m feelin’ just right!

    How’s about a corner with a table for two?
    Where the music is mellow and the nous is too !
    There’s no chance romancin’ with a blue attitude!
    You’ve got to do some thinkin’ and get in the nude !
    #

  82. I’m trying to imagine peddling in the nude on an exercise bike. A memorable if ineffective sales technique.

  83. I think it depends on the peddler and the location of the bike.

    This is leading me to imagine, or to try not to imagine, forensic linguists in the nude.

  84. Someone once regaled me with stories of eccentric mathematicians; one of them would lock himself in his office and work in the nude.

  85. > he would make coffee using water straight out of the tap

    What am I missing here? Do more than one in ten thousand do anything else?

  86. In other words, he didn’t heat the water — he presumably just added instant.

  87. Here in Cologne I can get hot water out of the tap.

  88. Maybe working in the nude helps with finding the naked truth.

  89. Stu Clayton says

    Blumenberg: Die nackte Wahrheit [haven’t read this yet, but I remember his claiming, in one of his books, that overzealous ecdysis has unexpected costs. I can’t say that that has been my experience]:

    #
    Hans Blumenberg verfolgt in diesem späten Nachlasstext die Figur der nackten Wahrheit durch die Philosophiegeschichte, allerdings mit einer verstärkten Aufmerksamkeit für die Kosten jenes Enthüllungsgestus. Nietzsche, der Verteidiger der Rhetorik, und Freud, der die Entwicklung seiner Theorie ohne Rücksicht auf das Wohl einzelner Patienten verfolgt habe, sind für Blumenberg dabei die zentralen Antipoden.
    #

    Zentrale Antipoden is a failed mix of metaphors. Suhrkamp should be ashamed of itself.

  90. David Marjanović says

    Here in Cologne I can get hot water out of the tap.

    I discovered embarrassingly recently I can make green tea with the hot water from the tap here. The building is old-fashioned enough that the hot water has 90 °C, not the 60° that have become usual in more recent decades (because 90° is pretty hard on the pipes). Just open the tap till the sink steams, then close it, hold the teapot under it and open again.

    Zentrale Antipoden

    “Das schlägt dem Fass die Krone ins Gesicht.”

  91. Here in Cologne I can get hot water out of the tap.

    I assume he did too. Most coffee aficionados do not consider coffee made in such a way to be in any way enjoyable.

  92. PlasticPaddy says

    @hat
    Precisely the point. Enjoyment would detract from the study of philosophy.
    @stu
    Enthüllungsgestus–I don’t think I am ready for this, let this cup pass…

  93. Stu Clayton says

    The building is old-fashioned enough that the hot water has 90 °C, not the 60° that have become usual in more recent decades (because 90° is pretty hard on the pipes)

    I’ve always thought that 90 °C water could cause bad burns – in children and old folks who aren’t always paying attention. I think our Fernwärme comes in at 40°, but maybe 60°, and is then brought up to speed by in-house heaters for the central heating system.

  94. Stu Clayton says

    @PP: Enthüllungsgestus–I don’t think I am ready for this, let this cup pass…

    “ ♪ You can leave your hat on ♪” [TIL this was written by Randy Newman !]

  95. David Marjanović says

    I’ve always thought that 90 °C water could cause bad burns – in children and old folks who aren’t always paying attention.

    Yes, that, too.

  96. Lars Mathiesen (he/him/his) says

    @Stu, it’s not the circles I move in, it’s what I hear on the morning news. The world seems to be full of people who only want to count their nearest and dearest, a.k.a. constituents, as humans. None mentioned, none forgotten.
