Sandra Blakeslee’s NY Times piece “Computing Crime and Punishment” is very interesting (it describes how researchers used the trial records of the Old Bailey to analyze in detail how the British criminal justice system came to distinguish between violent and nonviolent crimes), but it doesn’t have much to do with language until the end:
To simplify their task, the researchers turned to the 1911 edition of Roget’s Thesaurus, which sorts 26,000 distinct English words into 1,040 numbered categories called synonym sets. For example, words involving love and affection are in the high 800s, money and wealth in the low 800s. “Kick,” as in striking a blow, is No. 276, while killing is No. 361.
“The beauty of this,” Dr. DeDeo said, “is that for every word we have a number that equates with a meaning” that can be modeled mathematically.
One key finding is the gradual criminalization of violence.
In the early 1700s, violence was considered routine. […]
Over time, the transcripts have more superlatives and intensifiers — words like “very,” “so much,” “most” — in reference to acts of violence. Exaggeration is normal in a courtroom, but violence brings out more hyperbole; if someone steals your wallet, you are upset, but if someone beats you up, you are likely to use stronger language.
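To make the thesaurus trick in the quoted passage concrete, here is a minimal sketch in Python of what "a number that equates with a meaning" could look like in practice. It is only an illustration, not the researchers' actual code: the lookup table `ROGET_1911` and the function `category_counts` are hypothetical names, the only category numbers taken from the article are 276 ("kick") and 361 (killing), and treating "kicked" and "kill" as belonging to those same categories is an assumption.

```python
import re
from collections import Counter

# Hypothetical miniature lookup table. Only the numbers 276 and 361 come
# from the article; mapping inflected or related forms to the same
# category is an assumption made for this sketch.
ROGET_1911 = {
    "kick": 276,
    "kicked": 276,
    "kill": 361,
    "killed": 361,
    "killing": 361,
}

def category_counts(text, lookup=ROGET_1911):
    """Count how often each thesaurus category number occurs in a text."""
    # Lowercase the text and pull out runs of letters as crude word tokens.
    tokens = re.findall(r"[a-z]+", text.lower())
    # Replace each known word with its category number and tally the numbers.
    return Counter(lookup[t] for t in tokens if t in lookup)

# Invented sample sentence, not a real Old Bailey transcript.
sample = "He kicked the door, then kicked me; I feared he would kill me."
counts = category_counts(sample)
print(counts)
```

Once every trial is reduced to a tally of category numbers like this, trials can be compared numerically, which is presumably what makes the meanings amenable to mathematical modeling.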
Apparently, the Old Bailey corpus is “the largest existing body of transcribed trial evidence for historical crime” and “the most detailed recording of real speech in printed form anywhere in the world.” Thanks, Kobi!