The New Language of Mathematics.

As a quondam math major (though quondam was a long damn time ago), I can’t help but take an interest in Daniel S. Silver’s American Scientist account of what’s going on, linguistically, in that field. It starts by discussing a quote from Josiah Willard Gibbs, a 19th-century professor of mathematical physics at Yale: during a meeting about replacing mathematics requirements for a degree with foreign language courses, Gibbs declared: “Gentlemen, mathematics is a language”:

If mathematics is a language, then just as any ordinary language, such as French or Russian, does not rely on another one to be understood, so mathematics should be independent of ordinary languages. The idea does not seem so far-fetched when we consider musical notation, which is readable by trained musicians everywhere. If mathematics is a language, then we should be able to understand its ideas without the use of words.

He turns to Claude Shannon’s choice (prompted by John von Neumann) of the word entropy to describe his 1948 “beautiful and useful algebraic expression for a measure of average uncertainty in an information source”:

Like Shannon, many mathematicians agonize over the words they invent, hoping those they choose will survive for decades. Shannon’s choice proved to be brilliant for three reasons, two in addition to those proposed by von Neumann.

First, mathematicians enjoy the borrowed authority of words from science, especially from physics. Von Neumann was correct when he said that the use of entropy conferred an advantage on the speaker. Audience members either know the meaning of the term or else they feel that they should.

Second, Shannon’s appropriation of the term entropy provoked an insightful debate. Did his term in fact have a meaningful connection with statistical mechanics? The debate has been productive. Today Clausius’s entropy is regarded by many as a special case of Shannon’s idea.

Third, the wide popularity of Shannon’s information theory has been helped by his use of a word that is recognized, if not truly understood, by everyone. Its associations with disorder in everyday life evoke a sympathetic response, much like another popular mathematical word, chaos.

There’s a good account of how chaos got popularized; then he turns to poorly chosen words and phrases like Descartes’ imaginary number and John Colson’s Witch of Agnesi (Colson, whose Italian was minimal, misunderstood Maria Agnesi’s “la curva…dicesi la Versiera”). He concludes with a discussion of symbols:

The German mathematician and philosopher Gottlob Frege thought that he had found a way to communicate with just symbols. In 1879 he published Begriffsschrift, an 88-page booklet in which the quantifiers of formal logic first appeared. Frege described his work as “a formula language, modeled on that of arithmetic, of pure thought.” Diagrams stretched not only across its pages but up and down them as well. […]

Mathematicians’ reluctance to accept images in place of words has softened but not vanished. It is a common and uncomfortable experience for someone presenting a pictorial argument to hear a skeptic ask, “Is that really a proof?” Predrag Cvitanović of Georgia Institute of Technology knows the experience. He recalls the taunt of a colleague staring at the pictograms that he had left on a blackboard: “What are these birdtracks?” Cvitanović liked the ornithological term so much that he decided to adopt it for the name of his new notation.

So what are these birdtracks? Briefly, they are combinations of dots, lines, boxes, arrows, and other symbols, letters of a diagrammatic language for a type of algebra. […]

Birdtracks and planar algebras are picture-languages that draw their initial inspiration from physics. Yet another is quon language, first introduced in December 2016 on arXiv, a repository of online research-paper preprints that is used by scientists throughout the world. Quon language, created by Harvard University mathematicians Zhengwei Liu, Alex Wozniakowski, and Arthur M. Jaffe, is derived from three-dimensional pictorial representations of particlelike excitations and transformations that act on them. An earlier, simpler version has much in common with the languages of Cvitanović and Kauffman.

According to its inventors, the quon language can do more than aid in the study of quantum information. It is also a language for algebra and topology, with the ability to prove theorems in both subjects. In an interview with the Harvard Gazette, Jaffe remarked, “So this pictorial language for mathematics can give you insights and a way of thinking that you don’t see in the usual, algebraic way of approaching mathematics.” He added that “It turns out one picture is worth 1,000 symbols.”

I’m guessing quon is derived from quantum, but it’s certainly derived oddly, and I wish something had been said about it.


  1. I noticed in the article “Lie’s”, rather than “Lies”, as shorthand for Lie groups (named for Sophus Lie).

    The one mathematical contribution to English grammar, modulo, has never made it into the speech of commoners.

    The original quon paper says “Here we formulate a 3D topological picture-language that we call the quon language—suggesting quantum particles.” (And see our discussion of -tron and its kin.)

  2. I suppose quon is cue-on as opposed to kwan.

  3. Hypothesis: Quon and similar pictorial languages will be intuitively understood only by those young mathematicians who grew up communicating through emojis.

  4. Prof. Jaffe is 80.

  5. On the first day of Differential Equations, Prof. Gian-Carlo Rota said, “Differential equations are the language of science.”

    Separately, I wonder a bit whether Silver really understands the history of “entropy.” If, when Shannon had chosen to use that word, the previous state of the art had really been that of Clausius and Kelvin, it really would have been hugely controversial whether the novel usage was appropriate. However, there is the absolutely key intermediate step of Boltzmann in between Clausius and Shannon. It was only due to Boltzmann’s work that anybody realized there was a connection between entropy and disorder. Kelvin and Clausius only made the connection between the abstraction they defined and the unavoidable loss of usable energy (“exergy”) as the universe moves toward thermal death. Boltzmann’s invention of statistical mechanics gave entropy a concrete definition (S = k log W, as it says on his tomb) in terms of the degree of microscopic disorder corresponding to the known macroscopic characteristics of a system. And when Shannon defined informational entropy, it was obvious that it and Boltzmann’s entropy were closely related. Mathematically, the definitions may be cast in nearly identical forms; the only difference is whether the probability distribution involved is continuous or discrete. Had Maxwell, Boltzmann, or Gibbs lived to see Shannon’s work, they would have immediately recognized it as a logical extension of nineteenth-century statistical mechanics.

    I recorded some further meandering thoughts about the history of entropy back in 2011, in the comments on my blog post on Asimov’s “The Last Question”:

  6. What is this “grammar” which no mathematical words contribute to except “modulo”? Perhaps a word’s part of speech (word class) determines whether or not its usage contributes to grammar? I’ve come across “modulo” used metaphorically to mean something like “apart from; disregarding”, but then you need a certain amount of mathematical knowledge to make the metaphorical connection. How about “plus”, both as noun and as conjunction?

    I take “quon” as being pronounced as the first syllable of “quantum”, not as “cue-on” (“quon” has no evidence of a yod). I too think that “quon” is oddly formed. “quan” would have duly indicated the LOT vowel because it’s customary for “a” to be pronounced so after /w/ (except before a velar).

  7. Indeed, plus and minus came into modern English from mathematical usage (OED: “The prepositional use [of plus], from which all the other English uses developed, did not exist in Latin of any period. It probably originated in the commercial language of the Middle Ages.”)

    I vote for cue-on, based on the pronunciation of qubit as cue-bit, and rhyming with muon.

  8. The arxiv paper says “A quon is a composite that acts as a particle.”

    So I agree with “rhyming with muon.” The -on is as in lepton, boson, as well as electron, neutron.

    But the interwebs are curiously unhelpful.

    Apparently quons are species of qudits, and there’s a reference to anyons. The opportunities for Cupertinos abound!

  9. Athel Cornish-Bowden says

    Josiah Willard Gibbs was modest and self-effacing, and his writings are very hard to read: if it wasn’t for Maxwell (I think it was), who converted Gibbs’s ideas into ones that others could understand, he would probably be forgotten today. He wasn’t very well known even in his own university. The President of Yale thought that the university needed a distinguished thermodynamicist, and asked around in Europe for a suitable candidate who could be enticed to come to Yale. He was surprised to be told that Yale already had the greatest thermodynamicist in the world.

  10. Gibbs is quite famous nowadays: besides inventing statistical thermodynamics (with Boltzmann), he developed the symbolism now used for vector calculus and is one of the founders of Bayesian probability. The anecdote about Gibbs and Yale is generally told to illustrate Yale’s neglect of the sciences. The punch line, as I heard it, is

    European Experts: Well, what about Gibbs?
    Yale: Gibbs?

    In any event, he’s buried in Yale’s Grove Street Cemetery.

  11. Compare also cluon, the quantum of inspiration.

  12. Forms like “qubit” rub me the wrong way. It seems inconsiderate to expect people to learn not only your fun neologism but also its eccentric spelling. To me “q-bit” is an obviously superior name.

  13. I don’t like “qubit,” but after twenty years I’ve gotten used to it. However, “q-bit” is much worse, because it violates other conventions for mathematics and physics terminology. A “q-bit” would have to be something related to a mathematical object q, like a q-analog or a t-test.

  14. I suppose quon is cue-on as opposed to kwan.

    Good lord, that sounded silly to me when you first said it but after reading other comments it seems like a distinct possibility. Surely someone knows how it’s said?!

    I’ve come across “modulo” used metaphorically to mean something like “apart from; disregarding”, but then you need a certain amount of mathematical knowledge to make the metaphorical connection.

    “Modulo” at LH. (Has it really been over a decade?)

    Forms like “qubit” rub me the wrong way. It seems inconsiderate to expect people to learn not only your fun neologism but also its eccentric spelling. To me “q-bit” is an obviously superior name.

    I am in complete agreement, and if quon is really “cue-on,” that’s even worse.

  15. David Marjanović says

    Agreeing on q-bit. It all started with the backformation barbeque.

    my blog post on Asimov’s “The Last Question”

    I personally don’t mind that the font size of the comments is tiny. But. Man. Light gray text on dark gray and middle gray backgrounds? I had to turn the screen brightness so far up that the bottom of the screen became rather blinding. Whoever designed that WordPress theme seems to have had no idea that you can turn the backlighting of the screen completely off without the screen becoming anywhere near black; “0% brightness” is grossly misnamed.

  16. They actually changed the color scheme slightly on the blog template I use a while back. (But why? Who knows? Maybe the template was mostly used by goths who complained the grays weren’t dark and gritty enough.) Since I haven’t been blogging for a while, I haven’t gotten around to fixing the problem.

  17. N. David Mermin writes Qbit, which nicely contrasts with Cbit ‘classical bit’, notably in his amazing paper on teaching quantum mechanics for and to computer scientists in a mere four hours. After that, you won’t be able to build a quantum computer, but you will definitely know enough QM to program one if you have one. He later expanded this paper into a book, Quantum Computer Science: An Introduction, which I have not read.

  18. Mermin taught my junior-level course in solid-state physics (many years ago) and unless his teaching style has radically changed, I’d strongly recommend reading the book rather than watching his lectures. I used to describe his teaching as ‘stuttering in complete sentences’. Mysteriously entertaining, but not really elucidative.

  19. Athel Cornish-Bowden says

    Kelvin and Clausius only made the connection between the abstraction they defined and the unavoidable loss of usable energy (“exergy”) as the universe moves toward thermal death.

    Leon Cooper had an opinion about the word “entropy”, which I’ll try to quote from memory, as I don’t have the book at hand. After telling us that Clausius wanted a word that would sound a bit like energy and mean transformation, and that by taking it from Greek it would mean the same in all languages, Cooper commented that he had succeeded in this way in finding a word that would mean the same for everybody: nothing.

  20. Athel Cornish-Bowden says

    I had metallurgy lectures from William Hume-Rothery, who was stone deaf. His lecturing voice was so weird that it was difficult to remember anything other than his voice. He always started each lecture (except, presumably, the first) with the words “In the last lecture I …”, spoken as far from a monotone as it’s possible to get, but after that it was just a haze. Fortunately I didn’t need to know much metallurgy.

  21. Samuel Eilenberg had interesting things to say about neonymy in mathematics. He was particularly proud of his creation ‘exact’: ‘exact sequence’, ‘exactness’, ‘left exact’, ‘right exact’ et al. He seized upon that word because it reflected back to exact differentials, with de Rham theory thereby becoming a linguistic nexus between 19th- and 20th-century mathematics. For a long time most borrowings into mathematics were classical (plethysm, homology, module). A certain faction now goes in for whimsy (quandle, nim-sum). Neonymy is difficult: all words carry associations, and you do not want the wrong ones. Names should be verbifiable (e.g. ‘sheafify’) without giving too much offence to the linguistically sensitive.

  22. Josiah Willard Gibbs was modest and self-effacing, and his writings are very hard to read

    I came across a story that after Gibbs’ work began to gain a reputation in Europe, Lord Rayleigh wrote to him to ask if he would be willing to write an amplified version of his work that would be easier for general physicists and chemists to understand. Gibbs wrote back “I myself had come to the conclusion that the fault was that it was too long.”

  23. The third point from Silver’s story that you quote makes me wonder. It implies that Shannon chose ‘entropy’ because it was a science word that had gained some public currency. But is that true? My feeling is that entropy began to take off in public consciousness only after Shannon added to its meaning. Before that it was a fairly abstruse physics term.

    I don’t really know, though.

  24. Good question; I don’t either.

  25. As for ‘entropy’, I dread seeing it in a non-technical discussion. It’s a subtle concept (or rather, several subtle and interwoven concepts that all seem to turn out to mean, sort of, the same thing). What’s certainly true is that many scientists have made careers out of arguing about it. There’s an old joke that the way to rise to the top of the Science Citation Index is to write a short paper on the Second Law that has a subtle error in it.

    Information theory, Bayesian inference, statistical thermodynamics have all been the subject of vehement back-and-forth arguments. At the moment, I’d bet on the Bayesians. But that’s as far as I’ll go.

  26. Shannon chose “entropy” because what he was defining was clearly the same object that Boltzmann had defined. The only difference was that Boltzmann dealt with continuous probability distributions (for the positions and momenta of particles in a gas, principally), while Shannon mostly dealt with discrete probability distributions, representing logic states. What debate there was about whether the name was appropriate fell into two categories. Some was highly technical, dealing with precisely what it meant for entropy to be a quantification of “missing information” about a system.

    The other component of the debate, which was louder and got more attention, was between people (like Shannon) who understood the statistical definition of entropy, and people who did not. In the middle of the twentieth century, statistical mechanics was still considered an esoteric topic even by many physicists, and almost nobody outside physics knew much of anything about it. So it was forgivable that many people who were interested in Shannon’s work on communication did not understand the existing background on the subject of entropy. Even today, there are a lot more people who work with entropy than who understand it fully; thermodynamics has applications all over science and engineering, while statistical mechanics is still much more of a niche topic. It was less forgivable that people ignorant of the subject spent so much time debating and bloviating about topics that had been figured out by Boltzmann, Gibbs, and others decades before.

    And the more I write about this, the more annoyed I am that Silver saw fit to write about the history of this neologism without apparently understanding it himself.

  27. ktschwarz says

    @Brett *stands on chair applauding*

    I can only add the footnote that nobody really knows why Clausius called entropy ‘S’, but it was an excellent choice since there’s not much competition there, compared to more overloaded letters such as ‘E’ and ‘Q’.

  28. David Marjanović says

    Most letters are so overloaded that I’ve often wondered if it’d help if we used Chinese characters instead. But those tend to be way more complex than necessary for that task, and correspondingly hard to remember…

  29. Well, clearly we need an entirely new character set for the task, preferably one with no resemblance to any already existing characters.

  30. January First-of-May says

    Most letters are so overloaded that I’ve often wondered if it’d help if we used Chinese characters instead. But those tend to be way more complex than necessary for that task, and correspondingly hard to remember…

    There’s still most of the Hebrew alphabet, as well as much of the Cyrillic alphabet. Next up is Armenian, Georgian, and the assorted runes. Plus Phoenician and South Arabian.
    We might eventually need to resort to syllabic scripts – Cherokee, the kanas, Linear B…

    Still an awful lot of slack until we need Chinese – even if we limit it to scripts likely to be familiar to non-linguists (i.e. Cyrillic, the kanas, perhaps some runes, and maybe Armenian and/or Georgian).

    Back when the Chihiro numbers came up on Wikipedia – they were briefly on DYK before the article was pulled as the hoax it was – I liked that they could (supposedly) be denoted as チ(n), using the katakana symbol for “chi”, because the alternate notation C(n) was way too overloaded (even within large number theory googology).

  31. David Marjanović says

    It’s definitely way past time for more Cyrillic.

  32. Let Ћ = 2Ќ…

  33. When Gibbs appeared as an expert witness, the opposing lawyer asked him how he knew something or other; Gibbs replied that on this particular subject he was the greatest living expert. Exit lawyer.

    Now Gibbs was well-known to be the most modest of men. When his friends asked him how he had managed to call himself the greatest living expert in public, he replied: “Gentlemen, I had to; I was on oath.”

  34. Great story!

  35. dainichi says

    I wonder how different mathematical notation (and by extension mainstream programming languages) would have looked if it had been invented by speakers of left-branching languages, or at least languages where O comes before V.

    Prefix functional notation (function coming before argument, like in “f(x)”) only really makes sense because we’re used to it. When you have several functions strung together, with usual notation you usually have to read them backwards to understand what is going on.

  36. marie-lucie says

    languages where O comes before V.

    Latin is one of those, although not 100%.

  37. If the inventor of operator notation had used an SOV language, perhaps they’d have chosen postfix notation (e.g. a b + instead of a + b). Then expressions would be simpler. There’d be no need for parentheses for parsing: (a+b)(c-d) would become a b + c d - *. Parsing expressions would be simpler, too (no need for BODMAS/PEMDAS, let alone the byzantine complexity of operator precedences in C).

  38. Anybody remember ‘Reverse Polish’ calculators?

    And there’s at least one reverse Polish programming language: Forth.

  39. I myself am one of the implementers of Joy, a language in the Forth tradition.

  40. > Anybody remember ‘Reverse Polish’ calculators?

    Yeah, I did my first programming on an HP15C. Stack based programming languages are interesting, but when reading programs in them, it’s often more confusing to figure out what is applied to what, unless you know the arity of functions/operators by heart. Traditional notation with parentheses around function arguments makes it clearer, although it’s more verbose.

    > Latin is one of those [languages with O before V]

    Hm… that’s true. So in the case of notation like “f(x)”, “f of x”, maybe it has more to do with the position of nominal modifiers. Had the notation been invented by speakers of languages where something like “x’s f” is the only option, it might have looked more like (x)f.

  41. There is, of course, the arrow notation for functions, which puts the argument first. However, it’s more cumbersome the way it is actually written than to use Euler’s notation. So it’s generally only used to depict webs (as opposed to chains) of nested functions, such as with Eilenberg’s exact sequences. The “zig-zag lemma” (a short exact sequence of chain complexes induces a long exact sequence on homology) gets its name from the way you follow the arrows around to form the long exact sequence. Another lemma in algebraic topology that is usually given in arrow notation is the five lemma (since it says that if four homomorphism arrows in a certain exact arrow diagram are actually isomorphisms, then a fifth one is an isomorphism as well).

  42. John Cowan says

    Most modern programming languages use a.b for the component of a named “b”, but in Cobol and Algol 68 it was “b of a”.

  43. January First-of-May says

    Well, clearly we need an entirely new character set for the task, preferably one with no resemblance to any already existing characters.

    Upon seeing this thread again, I was immediately reminded of Ancient Egyptian Algebra.

    (There’s a different pic of that scene going around that does use actual Ancient Egyptian symbols, but the symbols on this pic look like nothing I recognize.)

  44. Well, many programming languages now allow emoji in either operators or identifiers.
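
For anyone who wants to try the postfix rewriting from comment 37 above, where (a+b)(c-d) becomes a b + c d - *, here is a minimal sketch of a stack evaluator in Python. The function name `eval_postfix`, the operator table `OPS`, and the sample values for a, b, c, d are all made up for illustration; the sketch assumes space-separated tokens and binary operators only, and it is not how any particular calculator or Forth system does it.

```python
import operator

# Binary operators supported by this sketch (an illustrative choice).
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_postfix(expression, env=None):
    """Evaluate a space-separated postfix expression with a stack.

    `env` optionally maps variable names (e.g. "a") to numbers.
    """
    env = env or {}
    stack = []
    for token in expression.split():
        if token in OPS:
            # The operator applies to the two most recent operands.
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[token](left, right))
        elif token in env:
            stack.append(env[token])
        else:
            stack.append(float(token))
    if len(stack) != 1:
        raise ValueError("malformed postfix expression")
    return stack[0]


# Hypothetical sample values, chosen only for the demonstration.
values = {"a": 2, "b": 3, "c": 7, "d": 4}
print(eval_postfix("a b + c d - *", values))  # computes (2 + 3) * (7 - 4)
```

No parentheses and no precedence table are needed: each operator consumes whatever two operands are on top of the stack. As comment 40 notes, the cost is that the reader has to know each operator's arity to see what applies to what.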
