I’m, Like, “Please.”

Time for another episode of Ask the Hatters! I was reading Jill Lepore’s New Yorker piece “Does A.I. Need a Constitution?” (March 23, 2026; archived) when I found myself flummoxed by the quote at the end of this passage:

A.I. companies’ democratic experiments quickly came to an end. This has made many people more rather than less anxious about A.I., especially in the past few months, owing not least to the newsworthy departures from leading A.I. companies of a number of high-profile safety and alignment researchers. “‘Shoot, the world is not paying enough attention to this’ is a way we all used to feel,” [Divya] Siddarth told me. “Now my mom calls me and says, ‘I saw on the Indian news that some guy resigned from Anthropic,’ and I’m, like, ‘Please.’”

I like to think I’m pretty well versed in the ways of spoken and written English after many decades of speaking and reading it, and I can usually interpret from context what an expression means even if it’s used in an unexpected way, but I have absolutely no idea what the purport of “I’m, like, ‘Please’” might be. Is it “Please, why are you telling me this?” Is it “Please, that’s bullshit”? Is it “Please, that’s not even news”? What’s it all about, Alfie?

Armenians Learning Greek in Ancient Egypt.

Danny Bate featured here just a couple of weeks ago, but he’s got another post I can’t resist sharing: The Armenian Who Learned Greek in Ancient Egypt. This is another “who knew?” moment for me:

Written in Armenian letters for an unknown individual navigating the Greek-speaking society of Roman Egypt, this document is an absolute goldmine of historical and linguistic information. It’s both a testament to a multicultural Mediterranean world, and a valuable early witness to the Armenian language and its speakers. This is in spite of the fact that it doesn’t contain a single word of Armenian. […]

This document, cautiously dated to around the 5th–7th century AD, is a very early example of the Armenian alphabet, and the only one written with papyrus for its material. Yet it doesn’t come from anywhere near lands ever known as ‘Armenia’, nor does it write down Armenian speech. Its provenance is unclear. The French scholar Auguste Carrière bought the papyrus from a dealer at the end of the 19th century. Scholars worked off a photograph of just one side until the original was rediscovered in 1993 by historian Dickran Kouymjian at the French Bibliothèque Nationale (designation: BnF Arm 332). Before Carrière, the trail goes cold, but the arid, papyrus-preserving climate of Egypt is the likeliest resting place. As for its language, the document is nothing but words of Greek.

Line after line, the document faithfully renders nouns, adjectives, verbs, phrases and whole sentences of Greek in Armenian letters. […] Now, I see two seams of information to be excavated from the papyrus: one about historical language (quelle surprise), but another about historical society. Let’s dig into the first.

[Read more…]

Do as Analytic Causative.

I was led down a rabbit hole today by ktschwarz, who linked to this 2011 Log post, whose long comment thread I read with fascination. I was particularly struck by a comment by Suzanne Kemmer which I must have read at the time, since I commented later on, but which I’d completely forgotten in the ensuing decade and a half; since it’s so interesting, I’m reposting it here in the hope of both enlightening the multitudes and remembering it myself:

On till death do us part:

I’ve researched the various analytic causative constructions in the history of English. The “make” causative as in it made me laugh only started to emerge in Middle English. An older analytic causative, occurring in Old English and persisting through the Middle English period, was [don (the ancestor of Modern English do) + (direct object) + INF]. So “it did us laugh” was the normal way of saying “it made us laugh”.

Till death do us part means ‘until death causes us to part’. The main verb do is in the subjunctive, to indicate irrealis. That’s why it does not have a 3rd person sg. marker in this preserved formula.

In Old English the don analytic causative construction also did not take the to that precedes infinitivals in most of the modern infinitival complement constructions. to only became grammaticalized later, and never made it to constructions with make, let, and the later have causative.

[Read more…]

Bitch: A History.

Karen Stollznow’s Aeon essay is knowledgeable and well written, but if it were only about the changing semantics of bitch, I probably wouldn’t have linked it, figuring it wouldn’t add much to the collective knowledge of the Hattery. But Stollznow is a linguist, and she has passages of less obvious material that warmed my heart:

In its most literal sense, a bitch is a female dog, and this is also the word’s earliest meaning. Because bitch feels so contemporary, so casually present in everyday speech, it’s easy to assume it’s a relatively recent addition to the language. The etymology, however, tells a different story. ‘Bitch’ meaning ‘female dog’ dates to around 1000 CE, giving the word a pedigree that stretches back more than 1,000 years. It is older than ‘fuck’ and ‘cunt’, and older than many of the insults we now think of as timeless.

In those early centuries, the word didn’t quite look, or sound, the same. Bitch is an Old English word, inherited from Germanic, and during the Anglo-Saxon period it would have been unfamiliar to modern readers. Old English was the spoken and written language of the time, though literacy was limited, and bitch appeared as bicce, pronounced roughly as ‘bitch-eh’.

The earliest recorded use of bitch is from a medieval text known as the Medicina de Quadrupedibus – Medicines from Four-Footed Creatures: a compendium of traditional remedies made from animal parts. Originally written in Latin and translated into Old English in the 11th century, the manuscript contains two early examples of bitch used in its literal sense. […]

[Read more…]

Extracting Books from LLMs.

The arXiv paper Extracting books from production language models by Ahmed Ahmed, A. Feder Cooper, Sanmi Koyejo, and Percy Liang is alarming but not in the least surprising. The abstract:

Many unresolved legal questions over LLMs and copyright center on memorization: whether specific training data have been encoded in the model’s weights during training, and whether those memorized data can be extracted in the model’s outputs. While many believe that LLMs do not memorize much of their training data, recent work shows that substantial amounts of copyrighted text can be extracted from open-weight models. However, it remains an open question if similar extraction is feasible for production LLMs, given the safety measures these systems implement. We investigate this question using a two-phase procedure […]. With different per-LLM experimental configurations, we were able to extract varying amounts of text. For the Phase 1 probe, it was unnecessary to jailbreak Gemini 2.5 Pro and Grok 3 to extract text (e.g., nv-recall of 76.8% and 70.3%, respectively, for Harry Potter and the Sorcerer’s Stone), while it was necessary for Claude 3.7 Sonnet and GPT-4.1. In some cases, jailbroken Claude 3.7 Sonnet outputs entire books near-verbatim (e.g., nv-recall=95.8%). GPT-4.1 requires significantly more BoN attempts (e.g., 20X), and eventually refuses to continue (e.g., nv-recall=4.0%). Taken together, our work highlights that, even with model- and system-level safeguards, extraction of (in-copyright) training data remains a risk for production LLMs.

Écrasez l’infâme ! And if you’re tired of thinking about the evils of LLMs, I bring you news of An Old Welsh Reader, edited by Simon Rodway:

This reader contains edited texts, with English translations, of all the independent texts extant in manuscripts of the ninth, tenth, and eleventh centuries, with a selection of twelfth-century texts. They are accompanied by extensive notes and glossaries, along with an introduction which considers the prehistory of Welsh and its relationship with other Celtic languages. The volume also contains a comprehensive list of the sources of Old Welsh and an outline grammar: the first specifically dedicated to Old Welsh to appear in English. Appendices contain editions of one of the very few ancient Celtic texts from Britain, the Bath pendant, and the only sizeable text in another early medieval Brittonic language, the Old Cornish portion of the Leiden leechbook.

Now that’s my idea of a good time.

Kapewu?

Joel at Far Outliers posted excerpts from an article by Patryk Zakrzewski titled Kapewu? A Guide to Old Polish Slang, and I’ll post some excerpts from his excerpts:

In Kraków, he was called an ‘ancymon’, while in Lembryczek (pre-war slang for the city of Lviv), a street urchin was a ‘baciar’ (from the Hungarian ‘betyár’ – a hoodlum or goon). A baciar spoke bałak, a Lvovian slang. Elsewhere in Galicia, such rascals and scoundrels were called, in the plural, ‘sztrabancle’ (from the German ‘strabanzen’ – to loiter), and in Poznań, they went under the names of ‘szczuny’, ‘zyndry’ or ‘ejbry’. There were, of course, many other similar terms, because Poland was also full of andrusy and wisusy.

In Warsaw, and especially in its riverside neighbourhoods of Powiśle and Czerniaków, a street urchin was simply an ‘antek’ – which is also a common diminutive of the name Antoni. […]

A birbant, a bon vivant, or a bibosz – somebody leading a riotous life, never one to avoid fun – was known to bradziażyć. In Old Polish, you could similarly say that such a person bisurmani się or lampartuje (all terms for partying). He would flanerować (roam) from pub to pub, often tempted to gamble. This usually made it easy for him to wyprztykać się z floty (run out of money)… but there’s no glik (luck) without risk!

As a result of bradziażenie, it’s easy to become a bradziaga. This word comes from Russian and designates a vagrant or globetrotter. Such a free-floating person was known in Lviv as a ‘makabunda’ (a distorted form of ‘vagabond’). In Silesia, a ragamuffin was a ‘haderlok’ or a ‘szlapikorc’, while in Poznań, he would be called a ‘łatynda’, ‘opypłus’ or ‘szuszwol’.

‘Menel’, a word for a ‘bum’, still used in all parts of Poland, has an interesting etymology. In one of his pre-war columns, Stefan Wiechecki described this dialogue, reportedly overheard in a courtroom:

[Read more…]

A.I. Is Writing Fiction.

I know this is being discussed everywhere, and I try to avoid bandwagons and the news of the day, but damn if this isn’t too worrying to let slide. Alexandra Alter writes for the NY Times (archived):

For months, speculation has been building online that a buzzy horror novel, “Shy Girl,” was written with the help of A.I. The novel, about a desperate young woman who is held hostage by a man she met online and forced to live as his pet, was self-published in February 2025. The book quickly found an audience among horror fans, and Hachette published it in the United Kingdom last fall and planned to release it in the United States this spring, billing it as “an unapologetic, visceral revenge horror novel.”

Earlier this year, Max Spero, the founder and chief executive of Pangram, an A.I. detection program, heard of the claims about “Shy Girl” and decided to run a test of the full text. Its results indicated that the book was 78 percent A.I. generated. “I’m very confident that this is largely A.I. generated, or very heavily A.I. assisted,” said Spero, who posted his research on X in January. […]

In response to questions from The New York Times about the A.I. allegations against “Shy Girl,” Hachette told The Times that its imprint Orbit has canceled plans to release the novel in the United States and that Hachette will discontinue its U.K. edition.

You can get more details at the link; they don’t really matter, and the facts about this particular novel don’t really matter — it’s clear that so-called “AI” (large language models) will soon be producing work that nobody will be able to prove did not come from a human mind. I realize some will say “Therefore AI is intelligent!” and others will say “Who cares? Let a thousand flowers bloom!,” but as an old-fashioned humanist I feel the foundations crumbling. Will we have to go back to telling tales by the campfire (while making sure the tale-teller isn’t plugged in)?

Japanese Glossary of Chopsticks Faux Pas.

From Nippon.com, a spectacular Japanese Glossary of Chopsticks Faux Pas:

From bad manners to taboo, there are certain ways of using chopsticks that are considered as going against dining etiquette. These various acts, known as kiraibashi, are listed below.

(Listed in Japanese syllabary order)

🥢 あげ箸 Agebashi
To raise the chopsticks above the height of one’s mouth.

🥢 洗い箸 Araibashi
To clean the chopsticks in soup or beverages.

🥢 合わせ箸 Awasebashi (also known as 拾い箸 hiroibashi or 箸渡し hashiwatashi)
!!! (Serious) To pass food from one pair of chopsticks to another. This is taboo due to the custom after a cremation service of picking up remains and passing them between chopsticks.

That’s just the start; there are dozens of them, and it’s fun from both linguistic and cultural points of view. I got the link via MetaFilter, where most of the comments are knowledgeable and/or appreciative but inevitably some are the tedious “ooh, how hoity-toity, fuck that” responses that for some reason people feel impelled to share. Yes, cultures have “right” ways and “wrong” ways to do many things, and they are often not “rational” — get used to it! Also, there is a comment that made me sad and gloomy:

Can anyone with more culture than me comment on the etymology of chopsticks? We usually say hashi in our house cause realizing ‘chop’ is an old cowboy slang for ‘cooked food’, chopsticks seems about as racist as calling a fancy spoon a ‘grub-handle’.

Nobody knows where the “chop” came from (see the brief discussion at Wiktionary), but it doesn’t really matter: people who are determined to avoid any possible violation of progressive standards don’t care about facts; random guesses will do as an excuse. The English word is chopsticks, end of story; if you want to say hashi, be my guest, but you might as well sing The Vapors.

Natural Selection and Language Genes.

Dmitry Pruss sent me a link to “Natural selection and language genes in humans” by Rob DeSalle, Guilherme Lepski, Analia Arévalo, et al. (Scientific Reports 16:9382, 17 February 2026; open access), adding “I am not ready to believe any of it, but technically it says that the genetic basis of speech consisted of a broad network of genes with the foundations laid back in the ape times and most of the subsequent changes made during the emergence of the common ancestor of our species, Neanderthals and Denisovans.” I too am not ready to believe any of it, but I don’t have the technical background to make any useful judgments, so I present it for your appraisal. The abstract:

In this study we construct lists of candidate genes for articulate language. Analysis of coding regions of over 100 candidate genes for the effects of natural selection (directional episodic selection and relaxed/intensified selection) in the various lineages of primates (thirty-four nonhuman primate species, plus Homo sapiens, Neanderthals and Denisovans) revealed a burst of altered selection effects on neural genes at the node leading to the Homo sapiens-Neanderthal-Denisova triad, followed by bursts of selection effects on neural genes related to language in both the Denisovan and Neanderthal lineages. Those latter increases in involvement of neural genes in Neanderthals and Denisovans can be contrasted with the missing or slight response to selection on those same genes in the H. sapiens lineage. The genes involved in these bursts can mostly be classified as involved in synapse structure and maintenance. We develop a hypothesis for how synaptic efficiency could be related to language acquisition in these lineages.

Thanks, Dmitry!

Convivencia.

Robyn Creswell’s NYRB review (February 22, 2024; archived) of On Earth or in Poems: The Many Lives of al-Andalus by Eric Calderwood should be worth reading for anyone interested in the period of Islamic rule over the Iberian Peninsula, but what drives me to post is this (bold added):

The most popular tool in this interpretive kit, which a host of thinkers have used to understand al-Andalus, is the concept of convivencia, or coexistence. Many English-language readers encountered this idea in the scholar María Rosa Menocal’s The Ornament of the World (2002), a lyrical portrait of what she calls medieval Spain’s “culture of tolerance.” […]

The idea of convivencia, though often associated with Andalusia, is not Andalusian: its roots lie in the much more recent past. The word was first used in the peculiar—and conveniently vague—sense of religious and ethnic coexistence by the Spanish historian and literary critic Américo Castro in his book España en su historia (1948). Borrowing the term from philology, where it denoted the struggle for supremacy among vernacular variants of a word, Castro gave it an existentialist turn, using it to characterize the daily interaction between Christian, Muslim, and Jewish “castes,” which he took to be the basis of Spanish identity.

I find it hard to believe that a word which appears to mean simply ‘living together’ (and is so defined in the RAE’s Diccionario de la lengua española) originally had the specialized sense of ‘the struggle for supremacy among vernacular variants of a word’ and that this had to be borrowed and repurposed by Castro, but since there is no OED for Spanish, I have no way of finding out. Anybody know the history of this word?