Goats, Bookworms, and Quires.

Ann Gibbons at Science reports on how “Researchers use ancient DNA and proteins to read the biology of books”:

Behind locked doors in one of the oldest libraries in Europe, two dozen scholars mill around a conference table where rare medieval manuscripts perch on lecterns, illuminated by natural light streaming in from floor-to-ceiling windows. Most scholars simply look at these precious books while librarians turn the pages for them. But evolutionary biologist Blair Hedges, wearing gray rubber gloves, approaches one book with a mini–cotton swab. He gently dabs the circumference of a hole in the original white leather binding of a rare 12th century copy of the Gospel of Luke. Then, he inserts a tiny gum brush—the kind teenagers use to clean their braces—into another hole to swab its edges. His goal? “To collect bookworm excrement for ancient DNA analysis,” says Hedges, who works at Temple University in Philadelphia, Pennsylvania.

As Hedges magnifies the holes with a lens on his iPhone, book conservator Andrew Honey of the University of Oxford notices that the holes extend all the way back to the oak boards beneath the binding. Honey suggests that furniture beetles laid eggs in the oak before the bookmaker bound the wood in leather. The larvae lurked there for years before developing into adults that exited through the leather. That means it’s likely that “the holes were made by beetles 900 years ago … the oldest example of wormholes I’ve ever seen,” says Hedges, who uses DNA and the size of the holes to assess the type of beetle and so help identify where books were made; the DNA will also help him trace the evolution of the bookworms themselves. […]

At the symposium, Matthew Teasdale, a postdoc in Dan Bradley’s lab at Trinity College in Dublin, reported on the biology of another valuable text: the York Gospels, thought to have been written around 990 C.E. DNA from this book’s eraser shavings showed that, aside from some sheep, its pages were mostly calfskin—mainly from female calves, which was unexpected because cows were usually allowed to grow up to bear offspring. Historic records report that a cattle disease struck the area from 986–988 C.E., so perhaps many sick and stillborn calves were used for parchment, says zooarchaeologist Annelise Binois-Roman of the Université Paris 1 Panthéon-Sorbonne.

The York Gospels also offer a rare record of the people of the book: Almost 20% of the DNA Teasdale extracted from its eraser shavings came from humans or microbes shed by humans, he announced at the symposium. This is the only surviving Gospel book to contain the oaths taken by U.K. clergymen between the 14th and 16th centuries, and it’s still used in ceremonies today. Pages containing oaths were read, kissed, and handled the most, and these pages were particularly rich in microbial DNA from humans, Teasdale reported.

It’s a great read, with sentences like “The book was comprised of skins from an estimated 8.5 calves, 10.5 sheep, and half a goat.” And its mention of quires led Trevor Joyce, who sent me the link (thanks, Trevor!), to say “it prompted me to try guessing the etymology of quire (I drew a blank).” It’s a nice etymology; to quote Wiktionary:

From Middle English quayer, from Anglo-Norman quaier and Old French quaer, from Latin quaternus (“fourfold”), from quater (“four times”). Doublet of cahier.

We discussed quires themselves back in 2004. And the mention of “Annelise Binois-Roman of the Université Paris 1 Panthéon-Sorbonne” makes me grumpy: aren’t we allowed to just say “of the Sorbonne” any more?

Mongolian Etymology.

I took it into my head to wonder about the history of the Mongolian word хот/hot ‘city,’ as seen in (e.g.) Hohhot “Blue City,” and I got increasingly grumpy as I searched unsuccessfully for resources on Mongolian etymology. I found a Wiktionary page, but it has no etymology section. This isn’t as glaring a gap as the lack of an Arabic etymological dictionary, but it’s annoying. Anyone know of anything? And even if there’s no general work, does anyone know the etymology of хот/hot/ᠬᠣᠲᠠ?

Rorts, Corflutes, Stoushes, et al.

Caitlin Cassidy at the Guardian gives us a primer on Australian election terms:

Rorts

Voters hate rorts, and politicians love to accuse each other of rorting. Rorts come in many forms. Election rorts are when the parties distribute taxpayer dollars unfairly to boost their chances of winning votes – like handing out grants for community sports clubs based on colour coded spreadsheets rather than merit. […]

Corflutes

Corflutes are plastered across every major street in Australia for a few weeks when an election is taking place and then disappear into the ether. The word is a registered trademark of Corex Australia, denoting, in a political context, corrugated plastic sheeting used for temporary signage to promote a candidate, found anywhere from shopping centres to trees, highways or front gardens. Essentially, it’s a waterproof poster, but Australians call it a corflute. […]

Stoush

Stoush is a word the media loves to use whenever there is conflict. It spans a wide spectrum: if you’re in an animated debate, that’s a stoush. If you’re brawling, that’s a stoush. If you’ve taken someone to court – stoush. Same goes for policy disagreements, factional differences, campaign disputes. Parties may be stoushing internally, or with other parties, industry groups or lobbyists. The prime minister was even involved in a stoush with Canada over Vegemite.

Needless to say, there has also been an outbreak of corflute stoushes.

Click through for more, including fake tradies, the donkey vote, and spruiking (which we discussed here in 2021). The OED says rort is “Probably a back-formation < rorty adj. [Boisterous, rowdy; saucy; jolly, cheery. Also: dissipated, profligate]”; for stoush it says:

Perhaps compare Scots stash (1851), stush (1892), stoush (1914), all in the sense ‘uproar, disturbance, row, brawl’ (shortened < stushie n.; compare forms at that entry); however, a derivation from this word (although often suggested) would present phonological and semantic difficulties […]

Thanks, Trevor!

Language Models and Latent Space.

Anil Ananthaswamy’s Quanta Magazine piece “To Make Language Models Work Better, Researchers Sidestep Language” is intriguing, even if I don’t understand a lot of it:

Language isn’t always necessary. While it certainly helps in getting across certain ideas, some neuroscientists have argued that many forms of human thought and reasoning don’t require the medium of words and grammar. Sometimes, the argument goes, having to turn ideas into language actually slows down the thought process.

Now there’s intriguing evidence that certain artificial intelligence systems could also benefit from “thinking” independently of language.

When large language models (LLMs) process information, they do so in mathematical spaces, far from the world of words. That’s because LLMs are built using deep neural networks, which essentially transform one sequence of numbers into another — they’re effectively complicated math functions. Researchers call the numerical universe in which these calculations take place a latent space.

But these models must often leave the latent space for the much more constrained one of individual words. This can be expensive, since it requires extra computational resources to convert the neural network’s latent representations of various concepts into words. This reliance on filtering concepts through the sieve of language can also result in a loss of information, just as digitizing a photograph inevitably means losing some of the definition in the original. “A lot of researchers are curious,” said Mike Knoop, co-creator of one of the leading benchmarks for testing abstract reasoning in AI models. “Can you do reasoning purely in latent space?”

Two recent papers suggest that the answer may be yes.

Click through for details; I’m afraid I can’t make heads nor tails of latent space, but I’m sure some of my readers can. Thanks, jack!
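
For anyone equally at sea, here is a toy Python sketch of the one point in the excerpt that can be shown in a few lines: squeezing a list of numbers through a single word throws information away, while passing the numbers along untouched does not. Everything in it is an assumption made up for illustration (the three-word vocabulary, the vectors, and the names decode_to_word and distance); it is not how any real model or either of the papers works.

```python
import math

# Toy vocabulary: each word gets a made-up 3-number "embedding".
# These values are invented for illustration, not taken from any real model.
VOCAB = {
    "cat": [1.0, 0.0, 0.0],
    "dog": [0.0, 1.0, 0.0],
    "fish": [0.0, 0.0, 1.0],
}


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def decode_to_word(latent):
    """Leave latent space: pick the single word whose embedding scores highest."""
    return max(VOCAB, key=lambda word: dot(latent, VOCAB[word]))


def distance(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# A hypothetical internal state: mostly "cat", with a fair amount of "dog".
latent = [0.6, 0.4, 0.0]

# Route 1: squeeze the state through a word, then turn the word back into numbers.
word = decode_to_word(latent)
via_word = VOCAB[word]

# Route 2: hand the numbers on unchanged (the "reasoning in latent space" idea).
via_latent = list(latent)

print(word)                           # the word chosen for the state
print(distance(latent, via_word))     # how far the word route drifted
print(distance(latent, via_latent))   # how far the latent route drifted
```

In this toy setup the word route forgets the "dog" part of the state entirely, which is the loss the photograph analogy is pointing at; the latent route keeps it.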

Too Many Cooks in the Soup.

Mike Colias has an enjoyable WSJ column (archived) about a Ford executive who “kept a meticulous log of mixed metaphors and malaprops uttered in meetings over a decade”:

Mike O’Brien emailed a few hundred colleagues last month to announce his retirement after 32 years at Ford Motor. The sales executive’s note included the obligatory career reflections and thank yous—but came with a twist. Attached to the email was a spreadsheet detailing a few thousand violations committed by his co-workers over the years.

During a 2019 sales meeting to discuss a new vehicle launch, a colleague blurted out: “Let’s not reinvent the ocean.” At another meeting, in 2016, someone started a sentence with: “I don’t want to sound like a broken drum here, but…”

For more than a decade, O’Brien kept a meticulous log of mixed metaphors and malaprops uttered in Ford meetings, from companywide gatherings to side conversations. It documents 2,229 linguistic breaches, including the exact quote, context, name of the perpetrator and color commentary. After one colleague declared: “It’s a huge task, but we’re trying to get our arms and legs around it,” O’Brien quipped: “Adding ‘legs’ into the mix makes it sound kinda kinky.” […]

Subtitles as a Garden of Forking Paths.

Anatoly Vorobey quotes (in Russian) a Facebook post by Gombo Tsydynzhapov about watching Cowboy Bebop in English with French subtitles and discovering that the two versions radically diverged, so that when the hero says “Hell, I’ll take my chances” the subtitle has “Rien a faire. C’est la vie,” and “I hate theme parks” gets rendered as “Maintenant, tout est fini”, which leads Gombo to the thought that in the Japanese original everything might be totally different. Anatoly adds:

Просто сад расходящихся тропок какой-то. Понравилась оформление идеи (не новой, конечно), что появление дополнительной версии с совсем другим текстом заставляет обе подозревать в недостоверности.

It’s a sort of garden of forking paths. I liked the presentation of the idea (which isn’t new, of course) that the appearance of an additional version with a completely different text makes you suspect both of being unreliable.

There is a lively comment thread; I was particularly struck by jr0, who says “Мне всерьез лингвисты задвигали, что перевести реплику Jesus Christ! как Черт! — норм” [Linguists have seriously suggested to me that translating the line “Jesus Christ!” as “the Devil!” is normal] and when called on it insists stubbornly that it is some kind of… sin? blasphemy?… to render a godly name by a satanic one, even if they are used equivalently in colloquial speech. An interesting form of prescriptivism!

Remorque.

I just watched Jean Grémillon’s 1941 movie Remorques, part of the Criterion Channel’s French Poetic Realism series (which is introducing me to filmmakers I’d barely heard of, like Grémillon and Julien Duvivier); it’s an excellent movie (with a screenplay by Jacques Prévert), but I’m posting because I didn’t know the word remorque ‘towline,’ and it’s got an interesting etymology — it’s from Latin remulcum:

From Ancient Greek ῥυμουλκέω (rhumoulkéō), from ῥῦμα (rhûma, “tow-line”) + ἕλκω (hélkō, “to drag”); for first element see ἐρύω (erúō, “to pull”).

And its Romance descendants are all distorted in minor ways: Italian rimorchio (from a diminutive *remurculum) and the verbs Portuguese rebocar and Spanish remolcar. If I had read more sea stories and watched more movies set on ships, I’d know more nautical terminology!

Peilz, *bledyos.

Looking at a map of Switzerland, I noticed a town called La Tour-de-Peilz on Lake Geneva and wondered how it was pronounced. Wikipedia told me it was [la tuʁ də pɛ] (ah, the transparency of French spelling!), but then I wanted to know where the name was from, so I proceeded to French Wikipedia, where I found this:

Dans son livre « Noms de lieux des pays franco-provençaux », Georges Richard Wipf écrit que « le gallois blaidd (loup) étant à l’origine des termes bela, belau, bele et bel, ce qui postule blebel, on peut penser que *bleiz a aussi pu évoluer […] en *beilz, d’où *peilz. » L’auteur prend toutefois soin de préciser qu’« il ne s’agit que d’une hypothèse, mais elle expliquerait le nom de Peilz (La Tour-de-Peilz, VD). »

Cette étymologie est toutefois controversée et plusieurs autres explications ont été avancées. Celle retenue de préférence aujourd’hui est une origine remontant à un gentilice latin Pellius, hypothèse confortée par le lieu-dit En Peilz, à l’est de la ville, où ont été retrouvés de nombreux vestiges romains.

[In his book “Noms de lieux des pays franco-provençaux,” Georges Richard Wipf writes that “since Welsh blaidd (wolf) is the source of the terms bela, belau, bele and bel, which presupposes blebel, one may suppose that *bleiz could also have evolved […] into *beilz, whence *peilz.” The author nevertheless takes care to specify that “this is only a hypothesis, but it would explain the name Peilz (La Tour-de-Peilz, VD).”

This etymology is disputed, however, and several other explanations have been put forward. The one preferred today is an origin going back to a Latin family name, Pellius, a hypothesis supported by the locality En Peilz, east of the town, where numerous Roman remains have been found.]

I mean, I’m all for trying to peer into the past of words, but the pile-up of “ce qui postule … on peut penser … a aussi pu évoluer” hardly needs to be clarified by “il ne s’agit que d’une hypothèse.” I am irresistibly reminded of the insufferable Brichot.

But that Welsh word blaidd ‘wolf’ is interesting; it goes back to Proto-Celtic *bledyos (etymology unknown: “Probably borrowed from a non-Indo-European substrate language”), whose Old Irish descendant bled (eDIL) means ‘sea-monster; whale.’ There’s a fine piece of semantic development for you! I deduce that the Irish, not having wolves, applied the inherited name to their native sea-monsters. (The modern term for ‘wolf’ is mac tíre ‘son of the land’; make of that what you will.)

Godons.

I’ve started watching Jacques Rivette’s Jeanne la pucelle (it’s almost six hours long, so I’m taking it in chunks, which fortunately it breaks easily into); it’s gorgeous, and Sandrine Bonnaire can do no wrong as far as I’m concerned, but the subtitles are occasionally iffy, and I’m here to complain about one that particularly irritated me. Someone is talking about the warring forces in France and mentions a group that the subtitle calls “the Godons.” I thought I knew a fair amount about the period, but that didn’t ring a bell; after some googling I realized that it was this. OK, apparently “godons” was a variant form of the more familiar “goddams” (though I note that the Wiktionary article says “speculatively connected to English God damn, although the profanity is not attested in Middle English” [see Xerîb’s comment below for further demolition]), but what a lousy way to render it! “The English” would be preferable, but even “the goddams” would give more of a hint to the viewer not versed in Medieval French obloquy. And this from a subtitler who, when Jeanne tells her uncle she wants to go to see the Dauphin and he says “Qui — le roi de Bourges?” renders his response “the Well Served?” Which is not only unintelligible to the non-specialist viewer but misses the entire point of mockingly calling him “King of Bourges”!

Callow.

Dave Wilton has a Big List entry tracing the history of the word callow:

Callow is a word that dates back to the beginnings of the English language, but it has shifted in meaning significantly over the past eleven-hundred years. Today it means inexperienced or naive, and it often appears in the phrase callow youth. But way back when it was associated with aging, for in Old English the word calu meant bald. […] The meaning of callow remained stable through the Middle English period, but in the late sixteenth century the word began to be applied to young birds, who were unfledged, that is without feathers. […] And by the end of the seventeenth century, callow was being used to refer to young and naïve people without allusion to fledgling birds. […] This inexperienced sense would quickly overtake the bald sense, driving the latter out of the language.

I’ll add this to my stock of ammunition to be used against those who object to semantic change (my go-to example has been bead, originally ‘prayer’): “Oh, so you think callow should only mean ‘bald,’ then?”

I was wondering if it was related to Russian голый ‘naked’; Wiktionary says yes, but the OED (entry revised 2016) is more cautious:

Cognate with Middle Dutch calu, cāle (Dutch kaal), Middle Low German kale, Old High German kalo (Middle High German kal, German kahl); further etymology uncertain.

Notes

Further etymology
Perhaps < the same Indo-European base as Old Church Slavonic golŭ naked, bare, or perhaps an early borrowing into Germanic of classical Latin calvus bald (see calvity n.).