Proto-Dravidian Ancestry?

Jaison Jeevan Sequeira, Swathy Krishna, George van Driem, Mohammed Shafiul Mustak, and Ranajit Das have an article (in preprint, open access) called “Novel 4,400-year-old ancestral component in a tribe speaking a Dravidian language”:

Abstract

Research has shown that the present-day population on the Indian subcontinent derives its ancestry from at least three components identified with pre-Indo-Iranian agriculturalists once inhabiting the Iranian plateau, pastoralists originating from the Pontic-Caspian steppe and ancient hunter-gatherer related to the Andamanese Islanders. The present-day Indian gene pool represents a gradient of mixtures from these three sources. However, with more sequences of ancient and modern genomes and fine structure analyses, we can expect a more complex picture of ancestry to emerge. In this study, we focus on Dravidian linguistic groups to propose a fourth putative source which may have branched out from the basal Middle Eastern component that gave rise to the Iranian plateau farmer related ancestry. The Elamo-Dravidian theory and the linguistic phylogeny of the Dravidian family tree provide chronological fits for the genetic findings presented here. Our findings show a correlation between the linguistic and genetic lineages in language communities speaking Dravidian languages when they are modelled together. We suggest that this source, which we shall call ‘Proto-Dravidian’ ancestry, emerged around the dawn of the Indus Valley civilisation. This ancestry is distinct from all other sources described so far, and its plausible origin not later than 4,400 years ago on the region between the Iranian plateau and the Indus valley supports a Dravidian heartland before the arrival of Indo-European languages on the Indian subcontinent. Admixture analysis shows that this Proto-Dravidian ancestry is still carried by most modern inhabitants of the Indian subcontinent other than the tribal populations. This momentous finding underscores the importance of population-specific fine structure studies. We also recommend informed sampling strategies for biobanks and to avoid oversimplification of ancestral reconstruction. Achieving this requires interdisciplinary collaboration.

I’ll be interested to see what knowledgeable Hatters think about this. Thanks, Dinesh!

Bogan Tolstoy.

I would be remiss if I did not bring to your attention Ander Louis’s Bogan version of War and Peace:

Book 1 of 16 (Complete and unabridged) The greatest epic of all time, now translated into Bogan Australian. Early 1800s Russia wouldn’t be all that bad a place to live, if it weren’t for all the social protocol and that bloody Napoleon bastard, trying to invade the place. In this new translation, Ander Louis has faithfully reconstructed Tolstoy’s epic masterpiece, line for line, in a style the modern reader can understand. Finally, after 150 years, War & Peace is available in Bogan Australian. Book 1 contains the first 28 chapters of War & Peace (approx. 50,000 words)

However, at MetaFilter, where I got the link, the natives are restless:

It’s wrong from the very first word. No bogan is ever going to say “Bloody hell” when “Fuck” would do. […] Vasíli would instantly become Vaz. Prince is his title, not his name, and no bogan would ever use it. This isn’t snark, I’m genuinely cringing. Bogan isn’t a language, it’s a culture, and if this translator took this project at all seriously he’d be doing far more with the source material than plaster cliche export-grade Australianisms all over it. What he’s made here is a literary analogue of blackface. It’s just fundamentally disrespectful to Tolstoy and bogans.
posted by flabdablet

I’m with flabdablet. If your whole project is about putting a famous work in another voice, you’ve gotta get that voice right. “Fucken oath, Vaz, Genoa and Luca are Nappie’s fucken holiday homes now… I’m tellin’ ya, if you still reckon he’s orright – if you don’t reckon he wants a fight – if you still rate that mad bastard, when he’s a total cunt… Well, you can fuck off – we’re done. But yeah, nah – come in, I’m just messing with ya. Siddown, mate. Anyway… how are ya?”
posted by rory

It all makes me want to see War and Peace made over into every variety of English around. (Not Brothers K, though — I don’t want a Bogan Father Zosima, however accurate the dialect.)

Angkentye-yerrtye ileme mpwarele.

This is a great project:

Arrernte people have always had names for places, hills, rivers and other features of the landscape within Arrernte Country. The names tell the ayeye altyerre (creation stories) and link apmere (country) to Arrernte language, people, and culture.

Some Mparntwe (Alice Springs) streets were named after Arrernte plants and animals, however at the time they street signs were created the Arrernte language written system was not agreed by Arrernte people, so street names were written in a way that didn’t fully capture the language sounds. Since that time, the Central and Eastern Arrernte to English Dictionary has been compiled using the agreed standardised Arrernte spelling system, and this is the system we are using for this project.

This project Angkentye-yerrtye ileme mpwarele loosely translates to ‘Bringing back the right names’. It offers the correct pronunciations and spellings of our street signs using the Central and Eastern Arrernte agreed standard spelling. It is important to the future of the Arrernte language that we use consistent spelling. The QR Codes on the signs link to more information about the meaning of the Arrernte names and how to say them properly.

We have discussed this street sign project with different stakeholders, and everyone has expressed support. Stakeholders can see the opportunity created for local residence and visitors to learn about the local Aboriginal language. The street signs are visually different and are not intended to replace existing street signs, they offer an opportunity for people to engage with Arrernte culture in a respectful way.

If you scroll down, you see a list of place names with pronunciations, maps, and explanations in both Arrernte and English; e.g., Ankerre Park:
[Read more…]

The Cock and the Shelf.

I’m rereading Sorokin’s Норма (The Norm; see this post) more slowly and carefully than the first time, when I skipped over a lot of difficulties in my eagerness to see where it was going, and I’ve been brought up short by a sentence that I simply don’t understand — not because my Russian is insufficient but because I don’t know enough about firearms. In Part 3, the story about Anton revisiting his childhood home, he’s remembering the long-ago days when he went hunting with his father, and we get this sentence:

Восемнадцатилетний Антон сидел в углу, зажав меж колен старинное шомпольное ружье и тщетно стараясь оттянуть от полки запавший курок.

The eighteen-year-old Anton was sitting in the corner, holding an old-fashioned muzzleloader between his knees and trying in vain to pull the ?? from the ?.

The ?? represents “запавший курок”: запавший literally means ‘fallen’ but I think can also mean ‘stuck’ (клавиши западают means ‘the piano keys are sticking’); курок is ‘cock, cocking piece (hammer of a firearm trigger mechanism),’ but colloquially (and “incorrectly,” as Russian Wiktionary puts it) it can be used to mean ‘trigger.’ The final question mark is for “полки”; полка means ‘shelf’ but in the context of a firearm means ‘(flash) pan.’ The problem is that not having had anything to do with firearms I can’t visualize what’s going on here and have no way to judge what a correct translation would be. (I don’t even know if this is a rifle or a shotgun, though I presume in the preceding scene they were using shotguns to hunt grouse — see the 2020 discussion beginning here.) Any and all enlightenment is welcome!

Goats, Bookworms, and Quires.

Ann Gibbons at Science reports on how “Researchers use ancient DNA and proteins to read the biology of books”:

Behind locked doors in one of the oldest libraries in Europe, two dozen scholars mill around a conference table where rare medieval manuscripts perch on lecterns, illuminated by natural light streaming in from floor-to-ceiling windows. Most scholars simply look at these precious books while librarians turn the pages for them. But evolutionary biologist Blair Hedges, wearing gray rubber gloves, approaches one book with a mini–cotton swab. He gently dabs the circumference of a hole in the original white leather binding of a rare 12th century copy of the Gospel of Luke. Then, he inserts a tiny gum brush—the kind teenagers use to clean their braces—into another hole to swab its edges. His goal? “To collect bookworm excrement for ancient DNA analysis,” says Hedges, who works at Temple University in Philadelphia, Pennsylvania.

As Hedges magnifies the holes with a lens on his iPhone, book conservator Andrew Honey of the University of Oxford notices that the holes extend all the way back to the oak boards beneath the binding. Honey suggests that furniture beetles laid eggs in the oak before the bookmaker bound the wood in leather. The larvae lurked there for years before developing into adults that exited through the leather. That means it’s likely that “the holes were made by beetles 900 years ago … the oldest example of wormholes I’ve ever seen,” says Hedges, who uses DNA and the size of the holes to assess the type of beetle and so help identify where books were made; the DNA will also help him trace the evolution of the bookworms themselves. […]

At the symposium, Matthew Teasdale, a postdoc in Dan Bradley’s lab at Trinity College in Dublin, reported on the biology of another valuable text: the York Gospels, thought to have been written around 990 C.E. DNA from this book’s eraser shavings showed that, aside from some sheep, its pages were mostly calfskin—mainly from female calves, which was unexpected because cows were usually allowed to grow up to bear offspring. Historic records report that a cattle disease struck the area from 986–988 C.E., so perhaps many sick and stillborn calves were used for parchment, says zooarchaeologist Annelise Binois-Roman of the Université Paris 1 Panthéon-Sorbonne.

The York Gospels also offer a rare record of the people of the book: Almost 20% of the DNA Teasdale extracted from its eraser shavings came from humans or microbes shed by humans, he announced at the symposium. This is the only surviving Gospel book to contain the oaths taken by U.K. clergymen between the 14th and 16th centuries, and it’s still used in ceremonies today. Pages containing oaths were read, kissed, and handled the most, and these pages were particularly rich in microbial DNA from humans, Teasdale reported.

It’s a great read, with sentences like “The book was comprised of skins from an estimated 8.5 calves, 10.5 sheep, and half a goat.” And its mention of quires led Trevor Joyce, who sent me the link (thanks, Trevor!), to say “it prompted me to try guessing the etymology of quire (I drew a blank).” It’s a nice etymology; to quote Wiktionary:

From Middle English quayer, from Anglo-Norman quaier and Old French quaer, from Latin quaternus (“fourfold”), from quater (“four times”). Doublet of cahier.

We discussed quires themselves back in 2004. And the mention of “Annelise Binois-Roman of the Université Paris 1 Panthéon-Sorbonne” makes me grumpy: aren’t we allowed to just say “of the Sorbonne” any more?

Mongolian Etymology.

I took it into my head to wonder about the history of the Mongolian word хот/hot ‘city,’ as seen in (e.g.) Hohhot “Blue City,” and I got increasingly grumpy as I searched unsuccessfully for resources on Mongolian etymology. I found a Wiktionary page, but it has no etymology section. This isn’t as glaring a gap as the lack of an Arabic etymological dictionary, but it’s annoying. Anyone know of anything? And even if there’s no general work, does anyone know the etymology of хот/hot/ᠬᠣᠲᠠ?

Rorts, Corflutes, Stoushes, et al.

Caitlin Cassidy at the Guardian gives us a primer on Australian election terms:

Rorts

Voters hate rorts, and politicians love to accuse each other of rorting. Rorts come in many forms. Election rorts are when the parties distribute taxpayer dollars unfairly to boost their chances of winning votes – like handing out grants for community sports clubs based on colour coded spreadsheets rather than merit. […]

Corflutes

Corflutes are plastered across every major street in Australia for a few weeks when an election is taking place and then disappear into the ether. The word is a registered trademark of Corex Australia, denoting, in a political context, corrugated plastic sheeting used for temporary signage to promote a candidate, found anywhere from shopping centres to trees, highways or front gardens. Essentially, it’s a waterproof poster, but Australians call it a corflute. […]

Stoush

Stoush is a word the media loves to use whenever there is conflict. It spans a wide spectrum: if you’re in an animated debate, that’s a stoush. If you’re brawling, that’s a stoush. If you’ve taken someone to court – stoush. Same goes for policy disagreements, factional differences, campaign disputes. Parties may be stoushing internally, or with other parties, industry groups or lobbyists. The prime minister was even involved in a stoush with Canada over Vegemite.

Needless to say, there has also been an outbreak of corflute stoushes.

Click through for more, including fake tradies, the donkey vote, and spruiking (which we discussed here in 2021). The OED says rort is “Probably a back-formation < rorty adj. [Boisterous, rowdy; saucy; jolly, cheery. Also: dissipated, profligate]”; for stoush it says:

Perhaps compare Scots stash (1851), stush (1892), stoush (1914), all in the sense ‘uproar, disturbance, row, brawl’ (shortened < stushie n.; compare forms at that entry); however, a derivation from this word (although often suggested) would present phonological and semantic difficulties […]

Thanks, Trevor!

Language Models and Latent Space.

Anil Ananthaswamy’s Quanta Magazine piece “To Make Language Models Work Better, Researchers Sidestep Language” is intriguing, even if I don’t understand a lot of it:

Language isn’t always necessary. While it certainly helps in getting across certain ideas, some neuroscientists have argued that many forms of human thought and reasoning don’t require the medium of words and grammar. Sometimes, the argument goes, having to turn ideas into language actually slows down the thought process.

Now there’s intriguing evidence that certain artificial intelligence systems could also benefit from “thinking” independently of language.

When large language models (LLMs) process information, they do so in mathematical spaces, far from the world of words. That’s because LLMs are built using deep neural networks, which essentially transform one sequence of numbers into another — they’re effectively complicated math functions. Researchers call the numerical universe in which these calculations take place a latent space.

But these models must often leave the latent space for the much more constrained one of individual words. This can be expensive, since it requires extra computational resources to convert the neural network’s latent representations of various concepts into words. This reliance on filtering concepts through the sieve of language can also result in a loss of information, just as digitizing a photograph inevitably means losing some of the definition in the original. “A lot of researchers are curious,” said Mike Knoop, co-creator of one of the leading benchmarks for testing abstract reasoning in AI models. “Can you do reasoning purely in latent space?”

Two recent papers suggest that the answer may be yes.

Click through for details; I’m afraid I can’t make heads nor tails of latent space, but I’m sure some of my readers can. Thanks, jack!
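
For anyone else in the same boat, a toy sketch may make the quoted passage a little more concrete. It is not how any real model works: the three-word vocabulary, the two-dimensional vectors, and the function names are all invented for illustration. It shows only that two quite different internal states can come out as the same word, which is the kind of information loss the article describes.

```python
import math

# Hypothetical toy vocabulary: each word has a made-up 2-D vector.
# Real models use tens of thousands of tokens and far more dimensions.
VOCAB = {
    "cat": (1.0, 0.0),
    "dog": (0.0, 1.0),
    "fish": (-1.0, 0.0),
}


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def word_probabilities(latent):
    """Score a latent vector against every word in the toy vocabulary."""
    words = list(VOCAB)
    probs = softmax([dot(latent, VOCAB[w]) for w in words])
    return dict(zip(words, probs))


def decode(latent):
    """Leave the latent space: keep only the single most likely word."""
    probs = word_probabilities(latent)
    return max(probs, key=probs.get)


# Two different internal states (assumed values, chosen for the demo).
confident = (3.0, 0.1)   # strongly "cat"
hesitant = (0.6, 0.5)    # torn between "cat" and "dog"

print(decode(confident), decode(hesitant))
print(word_probabilities(confident))
print(word_probabilities(hesitant))
```

Both vectors decode to the same word, even though the second one was nearly as close to "dog"; once only the word is kept, that hesitation is gone.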

Too Many Cooks in the Soup.

Mike Colias has an enjoyable WSJ column (archived) about a Ford executive who “kept a meticulous log of mixed metaphors and malaprops uttered in meetings over a decade”:

Mike O’Brien emailed a few hundred colleagues last month to announce his retirement after 32 years at Ford Motor. The sales executive’s note included the obligatory career reflections and thank yous—but came with a twist. Attached to the email was a spreadsheet detailing a few thousand violations committed by his co-workers over the years.

During a 2019 sales meeting to discuss a new vehicle launch, a colleague blurted out: “Let’s not reinvent the ocean.” At another meeting, in 2016, someone started a sentence with: “I don’t want to sound like a broken drum here, but…”

For more than a decade, O’Brien kept a meticulous log of mixed metaphors and malaprops uttered in Ford meetings, from companywide gatherings to side conversations. It documents 2,229 linguistic breaches, including the exact quote, context, name of the perpetrator and color commentary. After one colleague declared: “It’s a huge task, but we’re trying to get our arms and legs around it,” O’Brien quipped: “Adding ‘legs’ into the mix makes it sound kinda kinky.” […]


Subtitles as a Garden of Forking Paths.

Anatoly Vorobey quotes (in Russian) a Facebook post by Gombo Tsydynzhapov about watching Cowboy Bebop in English with French subtitles and discovering that the two versions radically diverged, so that when the hero says “Hell, I’ll take my chances” the subtitle has “Rien a faire. C’est la vie,” and “I hate theme parks” gets rendered as “Maintenant, tout est fini”, which leads Gombo to the thought that in the Japanese original everything might be totally different. Anatoly adds:

Просто сад расходящихся тропок какой-то. Понравилась оформление идеи (не новой, конечно), что появление дополнительной версии с совсем другим текстом заставляет обе подозревать в недостоверности.

It’s a sort of garden of forking paths. I liked the presentation of the idea (which isn’t new, of course) that the appearance of an additional version with a completely different text makes you suspect both of being unreliable.

There is a lively comment thread; I was particularly struck by jr0, who says “Мне всерьез лингвисты задвигали, что перевести реплику Jesus Christ! как Черт! — норм” [Linguists have seriously suggested to me that translating the line “Jesus Christ!” as “the Devil!” is normal] and when called on it insists stubbornly that it is some kind of… sin? blasphemy?… to render a godly name by a satanic one, even if they are used equivalently in colloquial speech. An interesting form of prescriptivism!