A Sanskrit Discovery.

December 16, 2022 by languagehat 34 Comments

BBC News reports on a new finding about an old text:

A Sanskrit grammatical problem which has perplexed scholars since the 5th Century BC has been solved by a University of Cambridge PhD student. Rishi Rajpopat, 27, decoded a rule taught by Panini, a master of the ancient Sanskrit language who lived around 2,500 years ago. […]

Mr Rajpopat said he had “a eureka moment in Cambridge” after spending nine months “getting nowhere”. […]

Panini’s grammar, known as the Astadhyayi, relied on a system that functioned like an algorithm to turn the base and suffix of a word into grammatically correct words and sentences. However, two or more of Panini’s rules often apply simultaneously, resulting in conflicts.

Panini taught a “metarule”, which is traditionally interpreted by scholars as meaning “in the event of a conflict between two rules of equal strength, the rule that comes later in the grammar’s serial order wins”. However, this often led to grammatically incorrect results.

Mr Rajpopat rejected the traditional interpretation of the metarule. Instead, he argued that Panini meant that between rules applicable to the left and right sides of a word respectively, Panini wanted us to choose the rule applicable to the right side. […]

His supervisor at Cambridge, professor of Sanskrit Vincenzo Vergiani, said: “He has found an extraordinarily elegant solution to a problem which has perplexed scholars for centuries. This discovery will revolutionise the study of Sanskrit at a time when interest in the language is on the rise.”

The last quote is ridiculously hyped, of course, but this is genuinely exciting news for anyone interested in Sanskrit. (Assuming, of course, that it’s true…) Thanks, Trevor!
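For readers who think better in code, the two readings of the metarule quoted above can be put side by side in a toy sketch. Everything here is an assumption made for illustration: the `Rule` class, the `serial` and `position` fields and the two example rules are hypothetical, and none of it is taken from the Astadhyayi or from Rajpopat's thesis. It shows only the shape of the tie-break as the BBC describes it, not how Panini's rules actually work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    serial: int    # hypothetical: where the rule sits in the grammar's serial order
    position: int  # hypothetical: index of the part of the word it acts on (0 = leftmost)


def resolve_traditional(a: Rule, b: Rule) -> Rule:
    """Traditional reading: the rule that comes later in the grammar wins."""
    if a.serial == b.serial:
        raise ValueError("two distinct rules cannot share a serial number")
    return a if a.serial > b.serial else b


def resolve_rajpopat(a: Rule, b: Rule) -> Rule:
    """Reading attributed to Rajpopat: the rule acting on the right-hand
    (later) part of the word wins."""
    if a.position == b.position:
        # The quoted summary only covers rules on different sides of the word.
        raise ValueError("this toy model does not cover rules acting on the same part")
    return a if a.position > b.position else b


# Two made-up rules in conflict: one sits earlier in the grammar but acts on
# the suffix, the other sits later in the grammar but acts on the base.
acts_on_suffix = Rule(name="R10 (acts on the suffix)", serial=10, position=1)
acts_on_base = Rule(name="R20 (acts on the base)", serial=20, position=0)

print(resolve_traditional(acts_on_suffix, acts_on_base).name)  # R20 (acts on the base)
print(resolve_rajpopat(acts_on_suffix, acts_on_base).name)     # R10 (acts on the suffix)
```

The only difference between the two functions is the key used to break the tie: position in the grammar versus position in the word. That is the whole of the reinterpretation as reported.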

Comments

David Eddyshaw says

December 16, 2022 at 4:02 pm

I’m already impressed by anybody who actually reads Panini at all.

(The last sutra is apparently a a, which in context means “the segment which for convenience we have been representing throughout as a is in fact realised [ə].” Yes, really.)

Victor Mair’s note on this at LLog attributes to a Laura Baisas the comment “the discovery makes it possible to translate any word written in Sanskrit.” This doesn’t actually seem to appear in the Popular Science article. Perhaps somebody had a quiet word …
languagehat says

December 16, 2022 at 4:37 pm

The last sutra is apparently a a, which in context means “the segment which for convenience we have been representing throughout as a is in fact realised [ə].”

Yes, that’s the most famous bit, which I’ve remembered for almost half a century now, along with a few lines of the Rig Veda and Sakuntala.
David Marjanović says

December 16, 2022 at 4:46 pm

I thought it meant “the phoneme /a/ is (always) pronounced [ɐ]”?
David Eddyshaw says

December 16, 2022 at 5:04 pm

You say Panini, I say Potato …
Noetica says

December 16, 2022 at 7:32 pm

Hm yes, ˈPanini versus paˈnini.

You say panini, I say bastoncelli di pane;
You say perˈgola, I say ˈpergola.
[chorus]

(By the way David E, SO sorry about Tuesday. If only the Committee had agreed with our cunningly back-channeled recommendation to take Manhattan first. Ah well. There’s always next kalpa.)
David Eddyshaw says

December 16, 2022 at 8:07 pm

Manhattan?

Oh, yes! That was the plan. Of course. Of course. I misunderstood you for an attokalpa.
mollymooly says

December 16, 2022 at 10:16 pm

Interest in Panini is indeed peaking right about now.
David Eddyshaw says

December 16, 2022 at 10:25 pm

Wow! Is there anything he can’t do?
David Eddyshaw says

December 18, 2022 at 11:20 am

It occurs to me that, while this seems to vindicate Panini’s undoubted genius, the story also implies that over the past two millennia, Panini has been held in the highest esteem as a teacher at the same time as his text was interpreted as making systematically wrong predictions about Sanskrit morphology.

(It may be that all such wrong predictions were, in fact, marginal and/or trivial; if so, that perhaps tends to undermine the idea that this is an epoch-making discovery.)

According to Macdonell, the rules of interpretation which P was thought to have followed in his grammar are not to be found in his actual work, but are first mentioned by Katyayana.
January First-of-May says

December 18, 2022 at 1:33 pm

his text was interpreted as making systematically wrong predictions about Sanskrit morphology

For that matter, this story also implies that there are actual Sanskrit morphological patterns, somehow recorded, that are not to be found in Panini’s Rules (or at least in their shorthand sutra version; I’m not sure if there’s some kind of longer version somewhere).
In other words, if the Rules (in either interpretation) can make mistakes, this means that there are (presumably many) forms that are known to be correct from an external source, and this external source (whatever it may be) is sufficiently definitive as to outrank the Rules themselves.

It’s a common theory that Sanskrit-as-we-know-it is effectively a conlang, and the usual version of this theory claims that its grammar is exactly Panini’s Rules. If true, this would mean that Panini’s Rules cannot make mistakes, by definition. Yet here we see that they do, in fact, make mistakes (and even the “corrected” version apparently still makes some, just not as many and/or not as bad ones).
It would be like saying that there are mistakes in Zamenhof’s Fundamento that do not in fact correctly represent Esperanto grammar. (…Which there kind of are, as it happens, so we’re not necessarily talking about an impossibility here. But it still sounds weird as ch*rp.)
David Marjanović says

December 18, 2022 at 1:46 pm

I’m not sure if there’s some kind of longer version somewhere

That’s what the commentaries try to be, and they’re all by other people. The original is unbelievably compressed in order to make it possible to learn the whole eight chapters by heart – it comes from an illiterate tradition.
David Marjanović says

December 18, 2022 at 2:42 pm

…or, as it’s put in Wikipedia, “it can be recited end-to-end in two hours”.
January First-of-May says

December 19, 2022 at 3:41 am

TIL (having finally found an online version of Rajpopat’s thesis) that Panini’s Rules are so long because so many of them are providing patterns for specific irregular roots.
Like, you know how Sanskrit has special feminine versions of the numerals for “three” and “four”? There’s a rule for that. And there’s a bunch of other rules for other not-quite-regular forms of those numerals.

For some reason I thought that there was some kind of version of Panini’s Rules that was short enough to fit on one page (and that this is where the infamous a a belonged). Maybe that’s just the phonology bit.

I do like the idea (proposed in the comments to one of the news posts about this, forgot which) that “right side of the word” is an incredibly modern way to think about it; as David Marjanovic correctly mentioned, Panini was working in an illiterate tradition, and for that matter AFAIK his successors would have been working in a tradition where you couldn’t necessarily guarantee that the direction of writing was left-to-right. (Aren’t some Indic scripts still right-to-left?)
So for Panini this would have been the later part of the word – which turns out to match the usual meaning of the relevant words a lot better.
January First-of-May says

December 19, 2022 at 4:59 am

that “right side of the word” is an incredibly modern way to think about it

Apparently (thanks to Language Log for informing me of this) one such comment had reached Rajpopat himself, who answered that indeed it is really the later part of the word, but that would make for cumbersome terminology, and he’s using RHS in his text because that’s what it means in modern terms and it would be a term that would be intelligible for his audience, but he really should have clarified better that this is not intended to be understood entirely literally.
(He also mentioned the trouble that the literal interpretation had with RTL scripts, which isn’t actually anything that came up in any of the comments I’ve seen except my own; kudos to him for realizing that!)

[EDIT: if he really lived in 4th century BC Gandhara, then if Panini knew any scripts he might well have only known RTL scripts – namely Kharosthi, and/or its predecessor Aramaic.]

The best phrasing of the new result that I’ve seen so far was by Rodger C on Language Log: “If I have it right, Panini said that it was a matter of “earlier” and “later” rules, and everyone’s been assuming he meant “in the grammar,” but he meant “in the word.””
John Cowan says

December 19, 2022 at 5:15 pm

There are no remaining RTL Indic scripts, unless you consider Arabic script “Indic” because it is used in India (lato sensu). Kharoṣṭhī script died out without descendants in the 3C, though it may have still been in use in Eastern Turkestan until the 7C. It was probably a direct descendant of Imperial Aramaic, whereas Brahmi script, the ancestor of all the other Indic scripts, may have owed something to Ethiopic script (which was LTR).

Update: The dissertation, for those who want to read it.
David Eddyshaw says

December 19, 2022 at 6:42 pm

Thanks, JC.

I’ve already discovered from it that I was quite wrong in supposing that “metarules” are not found in the Aṣṭādhyāyī itself. Far from the case, it seems. What Kātyāyana did was to try to interpret them (while adding some more of his own.)

This question of rule ordering is a hardy perennial of all sufficiently sophisticated grammatical systems. I’ve just had a nightmare vision that some Chomskyan might attempt to recast the Aṣṭādhyāyī in terms of Optimality Theory. That would, I think, usher in the End Times.

I did not know that Paul Kiparsky had addressed the question of rule ordering in the Aṣṭādhyāyī quite extensively. The universe may have had a narrow escape from the final informational cataclysm.
David Marjanović says

December 19, 2022 at 7:33 pm

I’ve just had a nightmare vision that some Chomskyan might attempt to recast the Aṣṭādhyāyī in terms of Optimality Theory.

I don’t think I can find any in Google or Google Scholar, though plenty of OT works try to trace the Theory back to the Aṣṭādhyāyī.
Lars Mathiesen (he/him/his) says

December 19, 2022 at 8:46 pm

This whole rewriting rule thing gives me flashbacks to the Revised Report on Algol 68. Hero, the rule you need is in another chapter!
AntC says

December 19, 2022 at 9:23 pm

@JC There are no remaining RTL Indic scripts, …

Are/were there any Boustrophedontic Indic scripts? Indeed how common is Boustrophedon generally?

I associate that word with the early computer lineprinters and teletypes: it took so long to return the printing head to LHS that it was quicker to reverse alternate lines.

RHS in his [Rajpopat’s] text …, but he really should have clarified better that this is not intended to be understood entirely literally.

The Aṣṭādhyāyī, composed in an era when oral composition and transmission was the norm, … [wp]

Oral transmission has no right/left, only earlier/later. Presumably Pāṇini’s immediate students understood him. Was there a time at which oral transmission ceased/students focused only on the strict text/the meta-meta-rule[**] got lost?

[**] That is, that in ‘in the event of a conflict …’, ‘later’ means in the word.
AntC says

December 19, 2022 at 9:32 pm

flashbacks to the Revised Report on Algol 68.

Wait. You mean Pāṇini invented Van Wijngaarden grammar? Presumably after he’d perfected BNF.
Lars Mathiesen (he/him/his) says

December 20, 2022 at 4:32 am

I wouldn’t put it past him, it seems he was a very clever man. The question is if the committee was clever enough to encode Sanskrit morphology in van Wijngaarden format. And which would have the most rules, Algol 68 or Sanskrit?
Lars Mathiesen (he/him/his) says

December 20, 2022 at 4:42 am

Also, line printers don’t return to LHS. That’s why they are line printers. (In ’84 or so the CS Institute got new printers that were maybe 5 times quicker than the Sperry Rand chain-and-hammers ones at the computing centre. But that was because they had an 1086 pin printhead. Kyoceras, maybe).
David Marjanović says

December 20, 2022 at 10:28 am

Are/were there any Boustrophedontic Indic scripts? Indeed how common is Boustrophedon generally?

I’ve only heard of this in early versions of the alphabet.

However, in China today, there are phenomena like writing on the side of a bus going from front to back, so you can read it as the bus passes you: left to right on the left side of the bus, right to left on the right side.
John Cowan says

December 27, 2022 at 11:00 am

Indeed how common is Boustrophedon generally?

Writing text to be read boustrophedon is quite rare; note that it’s typical to reflect the characters when you do this. I know of no script where boustrophedon is invariable except Easter Island monumental, where the letters are rotated 180 degrees for each line; presumably in practice it was the stone that was rotated. The direction of line progression is BTT, which is also unique.

Avoiuli script, used to write Raga, a language of Pentecost Island, Vanuatu, is a cursive designed to be written boustrophedon (most of the glyphs are symmetrical) but often written LTR. Monumental ogham is normally written BTT on the edge of a stone, but if the text is long enough it can cross the stone LTR and then be written downwards on another edge without reflection (manuscript ogham is ordinary LTR throughout).

Writing boustrophedon to be read in a fixed order is another matter, since it only affects the writer and not the reader. It’s always going to be specialized, since you need to anticipate the line breaks rather than taking them in the natural positions. Carving on stone today is often done RTL, which is physically easier, even when the script is LTR.

writing on the side of a bus

Another example is writing on a road surface, which from the pedestrian or passenger viewpoint has BTT line progression but which the driver experiences in the correct order. Both of these are examples of moving (or apparently moving) text past a fixed gaze point.
Stu Clayton says

December 27, 2022 at 11:20 am

Carving on stone today is often done RTL, which is physically easier, even when the script is LTR.

For right-handers maybe.
John Cowan says

December 27, 2022 at 12:26 pm

Most people are right-handers, and I know of no reason why left-handers should be preferentially likely to become stone-carvers (unlike baseball players, say).
Athel Cornish-Bowden says

December 27, 2022 at 12:57 pm

Boustrophedon is not uncommon in small children just learning to write. We have a painting done by our daughter when she was at that stage. She wanted to sign it ISADORA, but she ran out of space at the end of the line, so she wrote

ISADO
   AR

***

The first time I saw a lineprinter lineprinting I could barely believe the evidence of my eyes that it could accomplish such a complex task so rapidly. I think it printed two lines at a time and took about 0.5 s for each pair of lines.
Athel Cornish-Bowden says

December 27, 2022 at 1:13 pm

However, in China today, there are phenomena like writing on the side of a bus going from front to back, so you can read it as the bus passes you: left to right on the left side of the bus, right to left on the right side.

That sounds quite intelligent, but I’m reminded of something weirdly crazy that the plans in the carriages of the Marseilles Metro do. The scheme on the left-hand side of the carriage is the mirror image of the scheme on the right-hand side. (They don’t take it one stage crazier and use mirror writing for the text.) So, for example, one might see

------*--------------------*----------------*-----------*-----
Ste Marguerite -- Rond Point du Prado -- Perrier -- Castellane

on one side, and

-----*-----------*----------------*--------------------*------
Castellane -- Perrier -- Rond Point du Prado -- Ste Marguerite

on the other. (It’s more complicated than that because there are two lines that intersect at Castellane, and there are more than four stations in each.)

(Sorry: impossible to get the spacing right)
Y says

December 27, 2022 at 2:36 pm

Rongorongo was mostly written on wood, not stone. All the same, some of the pieces are big and not easy to rotate. Purely speculatively, they may have been meant for reciting by two people sitting opposite each other.
Lars Mathiesen (he/him/his) says

December 27, 2022 at 3:57 pm

I think the BTT order for road markings is a US thing. Words aren’t used nearly as much in Europe in the first place — the most common one is BUS (not BUS LANE, since it’s not supposed to be language dependent, so only one line). But in the rare cases where there are multiple lines, it’s usually TTB.
Athel Cornish-Bowden says

December 27, 2022 at 4:14 pm

I found when I was in Brazil a few years ago that “bus” is not as international a word as I had thought. I wanted to check if a queue was the queue for the bus to the entrance of the national park at the Cataratas de Iguaçu. My spoken Portuguese is virtually non-existent, though I can read it, but I found that many Brazilians can understand Spanish, so I asked ¿Es la cola para el bus hasta la entrada?, and was answered with blank stares. I thought maybe “cola” was a chilenismo (it isn’t), but that wasn’t the problem, which was that no one had any idea what “bus” meant. Eventually someone realized that I meant “ônibus” and I was able to be assured that I was in the right queue.
Athel Cornish-Bowden says

December 27, 2022 at 4:45 pm

What does BTT stand for?

Does it refer to road markings like

NARROWS
ROAD

or

AHEAD
BRIDGE
LOW

that one used to see in England?

Ah, light dawns: does it stand for bottom-to-top?

The preceding comment can be deleted. It was the computer’s post, not mine.
David Marjanović says

December 27, 2022 at 5:06 pm

Words aren’t used nearly as much in Europe in the first place —

Both on the roads and on the traffic signs. Driving in the US was a slightly stressful experience: all that reading of surprisingly small text…
John Cowan says

January 19, 2023 at 9:53 pm

Bottom-to-top, yes.