Mumbling as Data Compression.

Julie Sedivy has an interesting post at Nautilus:

Far from being a symptom of linguistic indifference or moral decay, dropping or reducing sounds displays an underlying logic similar to the data-compression schemes that are used to create MP3s and JPEGs. These algorithms trim down the space needed to digitally store sounds and images by throwing out information that is redundant or doesn’t add much to our perceptual experience—for example, tossing out data at sound frequencies we can’t hear, or not bothering to encode slight gradations of color that are hard to see. The idea is to keep only the information that has the greatest impact.

Mumbling—or phonetic reduction, as language scientists prefer to call it—appears to follow a similar strategy. Not all words are equally likely to be reduced. In speech, you’re more likely to reduce common words like fine than uncommon words like tine. You’re also more likely to reduce words if they’re predictable in the context, so that the word fine would be pronounced less distinctly in a sentence like “You’re going to be just fine” than “The last word in this sentence is fine.” This suggests that speakers, at a purely unconscious level, strategically preserve information when it’s needed, but often leave it out when it doesn’t offer much communicative payoff. Speaking is an effortful, cognitively expensive activity, and by streamlining where they can, speakers may ultimately produce better-designed, more fluent sentences. […]

The notion of strategic laziness, in which effort and informational value are judiciously balanced against each other, scales up beyond individual speakers to entire languages, helping to explain why they have certain properties. For example, it offers some insight into why languages tolerate massive amounts of ambiguity in their vocabularies: Speakers can recycle easy-to-pronounce words and phrases to take on multiple meanings, in situations where listeners can easily recover the speaker’s intent. It has also been invoked to explain the fact that across languages, the most common words tend to be short, carrying minimal amounts of phonetic information, and to account for why languages adopt certain word orders.

There are links to papers backing up various points mentioned, and a nice zinger at the end.


  1. There’s an interesting ‘dual’ version of data compression that’s become fashionable lately in signal processing– ‘compressive sensing’. In compressive sensing, a sensor detects only a compressed version of a signal rather than the full uncompressed version. This simplifies the work a sensor has to do by a large factor, since you can search a much lower dimensional space for the information that you are looking for. In the linguistic context, this raises the possibility that both sides of a conversation, both the speaker and the listener, are compressing– leading to the possibility that the full uncompressed signal never actually appears in the back-and-forth information transfer.

    For the record, here’s a link to an incomprehensible Wikipedia article on compressive sensing:

  2. and a nice zinger at the end

    Reminds me of the first time I visited the southern states, during March break several decades ago. Whenever I ordered ‘tea’ in a restaurant, the waitress invariably asked whether I wanted ‘hot tea.’

    Not to be confused with Red Zinger tea, introduced about the same time.

  3. Matt,
    “There’s an interesting ‘dual’ version of data compression that’s become fashionable lately in signal processing– ‘compressive sensing’. In compressive sensing, a sensor detects only a compressed version of a signal rather than the full uncompressed version. ”

    The same thing happens in language, when a speaker of one language cannot distinguish the sounds of another because his language does not recognize that distinction.

  4. Exhibit A, the French language.

  5. “Reminds me of the first time I visited the southern states, during March break several decades ago. Whenever I ordered ‘tea’ in a restaurant, the waitress invariably asked whether I wanted ‘hot tea.’”

    Huh? I am a native speaker of SoAmE and I am sorry, but I don’t understand.

  6. Some languages make do with just three or four distinct words for color; for example, the Lele language, spoken by tens of thousands of people in Chad, uses a single word to encompass yellow, green, and blue. Languages with minimalist color vocabularies tend to be spoken in pre-industrial societies, where there are very few manufactured objects to which color has been artificially applied.

    Is this correct? I thought that the difference was that minimal color word societies tend to be hunter-gatherer, while complex agricultural societies develop wider color vocabularies. Putting the difference at industrialization means that most color-specific words are relatively recent coinages.

  7. Paul Ogden: Okay, sorry I responded before reading the article. Yeah, ice tea is understood as the drink one would get. The only question nowadays is “sweet or unsweet.”

  8. J. W. Brewer says

    Yeah, what fisheyed said. Or, at least, if you look at the list of eleven “basic” color terms found in English at, it’s been stable since maybe the 14th century (“orange” was I believe the last to arrive). Industrialization gave us e.g. “magenta,” but that’s not what the usual debate is about.

  9. George Grady says

    The OED only has “pink” as the color going back to 1669, named after the flower, whereas citations with “orange” as the color go back to 1557. I didn’t expect that.

  10. David Eddyshaw says

    “Hunter-gatherer” is overstating things. Lots of West African languages have three basic colour terms, and they’re mostly agricultural societies, not hunter-gatherers. Some of them have centuries-old complex near-feudal setups with longstanding military aristocracies, to say nothing of histories of big land empires and widespread literacy in Arabic.

  11. @MattF: The Wikipedia article is very detailed and technical, but you only need to understand this part: “In compressed sensing, one adds the constraint of sparsity, allowing only solutions which have a small number of nonzero coefficients.”

    In short, a practical application of Occam’s Razor — given an ambiguous signal, find the simplest explanation, for some useful definition of simple. The rest of the article is about giving the idea a mathematical foundation and applying it to signal processing.

    Historical linguistics is an application of compressed sensing in a wider sense (no mathematical scaffolding), and the difference between any two versions of PIE (for instance) is a consequence of a) how lossy an input was used b) disagreement on how to define simple.
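As a toy illustration of the sparsity constraint quoted above, here is a minimal Python sketch. The 3×8 measurement matrix, the test signal, and the function names are all assumptions made up for the example, and it only handles signals with a single nonzero entry, which is far simpler than what real compressed-sensing solvers do.

```python
from fractions import Fraction


def measure(matrix, signal):
    """Compress: return y = A x, which has fewer entries than the signal."""
    return [sum(a * x for a, x in zip(row, signal)) for row in matrix]


def recover_one_sparse(matrix, y):
    """Return the signal with exactly one nonzero entry that best explains y."""
    n = len(matrix[0])
    best = None
    for j in range(n):
        column = [row[j] for row in matrix]
        norm_sq = sum(Fraction(a) * a for a in column)
        # Best coefficient if entry j were the only nonzero one.
        coeff = sum(Fraction(a) * b for a, b in zip(column, y)) / norm_sq
        residual = sum((b - coeff * a) ** 2 for a, b in zip(column, y))
        if best is None or residual < best[0]:
            best = (residual, j, coeff)
    _, j, coeff = best
    signal = [Fraction(0)] * n
    signal[j] = coeff
    return signal


# Hypothetical measurement matrix: column j is (1, j, j*j), so no two
# columns point in the same direction.
A = [[1] * 8, [j for j in range(8)], [j * j for j in range(8)]]
original = [0, 0, 0, 0, 0, 0, 5, 0]   # eight numbers, only one nonzero
compressed = measure(A, original)     # only three numbers are "sensed"
recovered = recover_one_sparse(A, compressed)
```

Three measurements cannot pin down eight unknowns in general; many different signals would produce the same three numbers. Insisting on the simplest candidate (one nonzero entry) is what makes the answer unique here.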

  12. It bothers me that any Wikipedia article having to do with mathematics is essentially useless to anyone but a mathematician. An article in any other specialized field will usually start off with a general sort of explanation a layman can grasp, followed by the jargon-filled details for those who can use them, but mathematicians (or at least those who take up the mantle of Wikipedia editor) seem to be contemptuous of anything a layman might be able to understand and ruthlessly edit for maximum jargonicity.

  13. LH: I agree, but it’s very hard to be both correct and clear in writing mathematical text– and in a large project like Wikipedia you will settle, generally, for correct. That said, it’s also fair to say that mathematicians are averse to explaining what they’re actually talking about.

  14. Many years ago now, I tried rewriting a few articles about mathematical topics to make them both accessible to laymen and mathematically correct. I was unable to do so. The simple difficulty of the undertaking was part of it, but the biggest problem was the other Wikipedia editors. People kept inserting inaccurate examples and explanations, based on their own misunderstandings of the topics. Years later, once these kinds of articles became impenetrable to people without certain graduate-level backgrounds, these kinds of intrusions seemed to have been driven off. (Or the added inaccuracies became much easier for the people maintaining the articles to find and prune away.)

  15. The thing is that it’s impossible to leap directly from ignorance to graduate-level knowledge without passing through several levels of “close enough for government work” approximations. The introductory paragraph of any article should give a general sense of what it’s about, with whatever inevitable simplifications and distortions involved and “see section X below for clarification” where necessary. The overly precision-conscious editors involved are pulling up the ladder behind them. “Sorry, if you want to get even the faintest idea of what we’re talking about you’ll have to get a PhD at MIT.”

  16. Stefan Holm says

    In the aftermath of WWII it was found desirable that ’the two worlds’, science and humanities, must come together. Those physicists and mathematicians who had constructed so many deadly weapons and finally the nuclear bomb must be introduced to the world of values, beauty, meaning of life etc. And on the other hand the humanists and lawyers must get a glimpse of the scientific world.

    So in 1967 I in my home town belonged to the first cohort studying science at upper secondary school that was obliged to take classes in philosophy, literature, psychology, religion and history of art and music. My very good friends among the ‘humanists’ got the corresponding basic orientation in science and mathematics.

    Thank God, I say. Without this bridging idea I maybe wouldn’t have been able to appreciate (and participate in) this eminent blog. It however turned out to be harder the other way around. My dear long-time friends just had a harder time grasping the world of mathematics. May I, to illustrate this, bother you with three quiz questions I’ve frequently used at parties?

    1) In Sweden there is a rift lake called Vättern. It’s about 130 kilometers long. Imagine that Peter at the north end and Mary at the south end decided to stretch a rope between themselves. (I know it’s physically impossible but you could send a laser beam all the way). Anyway, since the earth is round (you believe in that?) – the rope would be under the water surface on its way (the rope is straight, the earth is round). How far under the water surface would the rope be at the middle of lake Vättern?

    2) The circumference of the earth is approximately 40,000 kilometers. That is the length of a rope you would need to stretch all the way around the globe (forget about mountains and valleys – it actually doesn’t change the quiz). Now say that you want to stretch the rope one meter above the surface of the earth. How much additional rope would you have to buy?

    3) If you flip a coin the chance is 50-50 that you would get a head or a tail. Now, if you flip a thousand coins – what is the probability that 400 or less (40%) would turn up as tails?

    You out there who know about geometry and probabilities – hold your horses and let us see what our human gut feeling tells us. I for one don’t agree with the inscription above the entrance of Plato’s Akademia: MÉDEIS AGEOMÉTRETOS EISITO – non-mathematicians are not allowed. But I also know that if our brains were basically mathematically hard wired every betting company on this earth would have to close down.
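For anyone who would rather check question 3 than trust a gut feeling, the probability can be computed exactly. The sketch below assumes a fair coin and independent flips; the function name is made up for the example.

```python
from fractions import Fraction
from math import comb


def prob_at_most(k, n):
    """Exact probability of at most k tails in n fair, independent flips."""
    favourable = sum(comb(n, i) for i in range(k + 1))
    return Fraction(favourable, 2 ** n)


p = prob_at_most(400, 1000)
print(float(p))  # a very small number, far below most people's guesses
```

Using exact fractions avoids any rounding worries, since the numbers involved (such as 2 to the power 1000) are far too large for ordinary floating-point arithmetic to handle term by term.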

  17. Rodger C says

    2) 3.14159 meters.

  18. Lots of West African languages have three basic colour terms, and they’re mostly agricultural societies

    I was hazily imagining that farmers would need words for green and yellow to talk about crops growing and getting ripe. But Geoffrey Sampson points out that all they need is words for immature and ripe; they don’t need to separate out the property of color from age, edibility, etc. Case in point, the Philippine language Hanunóo has two words that are not abstract ‘red’ and ‘green’, but rather bundles of properties ‘red/dry/withered’ and ‘green/wet/fresh’, so that a wet, brown section of freshly cut bamboo is called by the ‘wet/fresh’ term even though its hue is closer to red than green. (See leoboiko’s comment.)

    The point when a language develops an abstract color lexicon is not when people start to farm but when they develop paints and dyes, Sampson says. Nobody needs a word for “blue” until they can dye stuff blue. Of course, industrialization is a huge boost not only to the number of color terms, but to how much people use them and how early children learn them. Children with Crayolas are learning color vocabulary faster than children in the 19th century.

    The WALS map shows several languages in West Africa with 3 basic color terms (Dan, Wobe, Bété, Nafaanra); as best as I can glean from Wikipedia, those peoples have arts of wood and metal sculpture, but not dyeing and painting. But other West African peoples are known for kente cloth and its intricate woven color patterns. What are their color terms, and does their history correspond with the history of dyeing? That would make an excellent test for the dye theory, similar to what Sampson did with Old Chinese — he says its color terms are much more like English than Homer’s.

  19. David Eddyshaw says

    Hausa has three basic colour terms, and the Hausa are famous dyers.

    There’s something of a misconception in this business of “basic colour terms”: if a language has three basic colour terms, all that means is that any colour can be allocated to one of the three words without native speakers feeling that there has been a mistake. It doesn’t necessarily mean that there aren’t also other perfectly standard colour terms, which may or may not be further analysable. (This also means that you can quite easily make a mistake – as I think WALS has done in some cases – about how many truly basic colour terms there are. You can’t get the answer by just totting up the total number of unanalysable colour words: that is very likely to give you too high an answer.)

    In West Africa, where there are a lot of three-basic-colour languages, there are also a lot of standard expressions for particular colours which translate directly across languages: “like ash” for “grey”, “like grass” for “green”, and so forth.

  20. David Eddyshaw says

    The Kusaasi share the same traditional male dress (banaa in Kusaal, “fugu shirt” in local English) with their neighbours the Gurunsi, Mossi, Mamprussi and Dagomba, but they are traditionally dyed differently. The Kusaasi version is blue, in contrast to Mossi (IIRC) green, but there are no native colour-words specifically for “blue” or “green” in Kusaal or Mooré.

  21. Rodger C (from 5 years ago!), it’s twice that. There’s plenty of blame to go around.
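The "twice that" follows from the circumference formula: raising the rope by a height h changes its length from 2πr to 2π(r + h), a difference of 2πh whatever r is. A minimal Python check, assuming a perfectly spherical earth with the 40,000 km circumference given in the quiz (the function name is made up for the example):

```python
import math


def extra_rope_m(height_m, circumference_m=40_000_000.0):
    """Extra rope needed to lift a rope around a sphere by height_m metres."""
    radius_m = circumference_m / (2 * math.pi)
    return 2 * math.pi * (radius_m + height_m) - circumference_m


print(extra_rope_m(1.0))        # about 6.28 metres for the earth
print(extra_rope_m(1.0, 1.0))   # about the same for a ball 1 m around
```

The second call is the point of the quiz: the answer does not depend on the size of the sphere.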

  22. January First-of-May says

    The answer to number 1 is “the lake isn’t deep enough”. My initial idea was a huge overestimate (3 km, IIRC), and careful calculations revised the number far downward, but the lake still isn’t quite deep enough (its maximum depth is 128 meters, says Wikipedia).
    [I wonder if Baikal is deep enough for a line along its long axis…]

    That said, you could send a neutrino beam if you had a narrow-beam neutrino source at one end and a neutrino detector at the other; this had actually been done (though not at Vättern specifically), and the results agree with the geometry.

  23. Trond Engen says

    Lake Baikal would have to be almost 8 km deep, and the depths along its axis would have to follow the shape of the geoid (or, to pass my calculation, the perfect sphere with a circumference of 40 000 km).
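The figures in the last two comments can be reproduced with a short calculation. This sketch assumes a perfectly spherical earth with a 40,000 km circumference, treats each lake's length as measured along the surface, and takes Baikal to be about 636 km long (a figure not given in the thread); the function name is made up for the example.

```python
import math

EARTH_CIRCUMFERENCE_KM = 40_000.0  # round figure used in the thread


def depth_at_midpoint_km(surface_length_km,
                         circumference_km=EARTH_CIRCUMFERENCE_KM):
    """Depth of a straight line below the surface, halfway between its ends."""
    radius_km = circumference_km / (2 * math.pi)
    # Half of the angle the lake subtends at the earth's centre.
    half_angle = surface_length_km / (2 * radius_km)
    return radius_km * (1 - math.cos(half_angle))


vattern_m = depth_at_midpoint_km(130) * 1000  # roughly 330 m
baikal_km = depth_at_midpoint_km(636)         # just under 8 km
```

Both results exceed the depth of the lake in question, which is why the straight rope (or laser beam) would hit the bottom.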

  24. David Eddyshaw said: Hausa are famous dyers … any colour can be allocated to one of the three words without native speakers feeling that there has been a mistake.

    (Sorry, meant to answer this long back!) Thanks very much, and you should tell Sampson, since it completely knocks down his pigment theory.

    I looked up old Hausa-English dictionaries and eventually stumbled onto this, in Vocabulary of the Haussa Language by James Frederick Schön (1843):

    blue sinni (doubtful) [not in modern dictionaries]. Baki is more frequently used. —It does not appear that they observe a difference between blue and black; at all events they have not learned to express the distinction.
    yellow shĭa, or sha [ in current orthography]; properly, “red”. The Haussas appear not to make other distinctions in colour than white, black, and red. Baki rua, they say, for “Blue water.” —Fari rua, “Clear water.” —Sha tufa, “Yellow cloth.”

    1843! Decades before Gladstone and Geiger got started, and they couldn’t have asked for a better independent corroboration!

    you can quite easily make a mistake – as I think WALS has done in some cases – about how many truly basic colour terms there are. You can’t get the answer by just totting up the total number of unanalysable colour words: that is very likely to give you too high an answer.

    Berlin & Kay (1969) did make that mistake with Hausa, classifying it as having 6 color terms by pulling words from a 1925 English-Hausa dictionary translated as ‘green’, ‘yellow’, and ‘blue’ (the ‘yellow’ and ‘blue’ words are transparently related to dyes). They didn’t ask the right question: they didn’t ask Hausa speakers whether those were included in the 3 basic terms. It goes to show the limitations of working from dictionaries. In fact, there are hints in the old dictionaries themselves that the Hausa terms have wider ranges than the English ones: even though they gloss fari and ja simply as ‘white’ and ‘red’, they also translate ‘brass’ as farin ḳarifi or jan ḳarifi, literally ‘white iron’ or ‘red iron’.

    WALS, however, uses data from Kay’s World Color Survey, which tried to avoid that mistake: the instructions to fieldworkers specify that a color term isn’t basic if it’s included in any other color term.

  25. David Marjanović says

    blue sinni


  26. Yes, further evidence of Slavo-Chadic.

  27. Dear Esteemed Hat, (back upthread circa 2015, which I’ve only just now seen)

    — It bothers me that any Wikipedia article having to do with mathematics is essentially useless to anyone but a mathematician —

    Now in my later 70s, I find it increasingly hard to write in anything but LaTeX. I’m afraid that may sound preposterous, but it makes some sense to think of math as a register, perhaps like song, that makes its own serious demands on transcription. Hawking says, every equation halves your audience, but so be it: a medium that doesn’t allow equations is like an alphabet without vowels, like doing long division in roman numerals; it’s cumbersome, imprecise, and culturally illiterate, in the most natural, broadest, sense.

    There’s a notion of `languages for special purposes’ – jargons, thieves’ cant, cf eg

    It’s all culture.

  28. I’m not sure what your point is. I did not say mathematicians should not use equations; that would be absurd. I was saying it should be possible to write an introduction to a mathematical article that would be enlightening for non-mathematicians while not being seriously inaccurate. That is the case with every other field I can think of; it cannot be literally impossible for mathematics. It has to be that mathematicians (at least the ones who “own” Wikipedia articles) are uninterested in communicating with anybody but mathematicians; they couldn’t care less if anyone else can understand or use the articles. People know what equations are — that’s a red herring.

  29. I hesitated to write about this at all, because the point is a hard sell, but decided to bite the bullet. I am impressed by the attempts of the writers at Quanta magazine, for example, to write conscientiously about serious math in non-technical language, but to be frank the result is often like representing Greek mythology (Mercury, Venus, Mars and friends) as theoretical astronomy. Topologists talk about spaces, for example knots, and know what they mean. Science writers seem lately to be afraid to say `space’ and nowadays talk about `shapes’ instead; a metaphor without content, the beginning of an infinite regress away from the issue. It’s not that mathematicians are uninterested in communicating to anyone but mathematicians – it’s more like asking to learn for example Russian, from a course in which everything is in English.

    I’m afraid this sounds trollish, and I don’t mean it to be; the problem is perhaps that mathematics is not a human language, but God’s, and it’s presumptuous to ask Her to speak to us in something like Pirahã.

  30. That doesn’t sound trollish, it sounds like mystical mumbo-jumbo of the sort one would think mathematicians would be the last to indulge in. When elementary-school teachers explain numbers to kids, they do not start with κ-saturated extensions, they show two apples and three apples and put them together to get five apples (vel sim.). Advanced math does not make that untrue, it just adds layers of complexity. Einstein didn’t say “If you can’t explain it simply, you don’t understand it well enough,” but it’s still a valuable idea. It looks for all the world as if mathematicians, having gotten to the top of the mountain, are pulling up the ladder to keep out possible competitors. Not saying that’s true, of course — I’m sure mathematicians would like everyone to understand mathematics — just saying the results are much the same. (I suspect what’s actually going on is that the kind of people attracted to mathematics tend to be the kind of focused, detail-oriented people who are not good at communicating their understanding to others.)

  31. When I was a punk kid my father asked me to explain general relativity to him. I could do that I said but you wouldn’t understand it. Much later I realized that a better answer would have been, you have to sign up for the course, the short version would take about six weeks and you can’t not do the homework.

    The book of nature is an open book says Galileo, but it seems to be written in mystical mumbo-jumbo…

  32. It’s not just mathematical notation. If you look up the definitional first section of a Wikipedia article on any number of mathematical concepts, you’ll be led down the path of abstruse definitions leading to other abstruse definitions, all in plain text with nary a formula. For example,

    K-theory is, roughly speaking, the study of a ring generated by vector bundles over a topological space or scheme. In algebraic topology, it is a cohomology theory known as topological K-theory. In algebra and algebraic geometry, it is referred to as algebraic K-theory. It is also a fundamental tool in the field of operator algebras. It can be seen as the study of certain kinds of invariants of large matrices.

    It’s hard to explain with apples.

  33. When I was a punk kid my father asked me to explain general relativity to him.

    It depends what you mean by ‘explain.’ You can explain the essence of GR in a qualitative way by talking about trajectories in curved space and so on. If you want precise understanding then of course you need to know the math. But as someone who spent much of his life writing about physics and other sciences for non-specialized audiences, I can tell you there’s a lot you can do with appropriate concepts and imagery.

    But mathematics is different. The higher levels take you into realms of abstraction that are far removed from counting apples. That’s why I was never much of a mathematician. I could understand mathematical concepts if I could relate them to practical or at least imaginable issues. But there was a certain level of abstraction that I could not go beyond, and consequently a whole range of mathematics that I could never grasp.

  34. We can blame it all (or at least a huge part) on the axiomatic approach (which, one remembers, has all the advantages of theft over honest toil). Most mathematical notions are there to help formalize some intuition, but after a field is formed it seems strange to say “well, you know, some mathematicians played with this abstract concept and then that one and they saw some similarities so they’ve decided to give those similarities a name and see if they can find something interesting without knowing the details”. Sometimes, it helps to begin with examples, but again, it would be strange to start a Wikipedia page with detailed examples. That said, I can make neither heads nor tails of K-theory and this is fine, why do I need to? When I read “the study of a ring generated by vector bundles over a topological space or scheme” and see that I have a shaky memory of what vector bundles are and no idea why they generate a ring, I know that maybe it’s a good idea to start with a few steps down.

  35. jack morava says

    For example, general relativity is about curvature R of space-time; curvature is a familiar tactile phenomenon that was made precise by Gauss using calculus. This step however requires understanding distance in (Euclidean) space, following ideas of Archimedes and (maybe) Riemann. Space and Time together are not quite the same as space-time, that’s special relativity; given that, GR says that R is (roughly) proportional to the amount of matter T (a.k.a energy = mc^2).

    Making sense of what T is was worked out in the case of light by Maxwell, and the proportionality constant is due to Newton. To work out the consequence of R ~ T involves rigid-body mechanics going back to eg Lagrange.

    That’s maybe a kind of Heart Sutra transcription; to make any real use of it you have to go back to the original language and do the work.

    @ Y, I know what you mean but cf for ex in biology…

  36. jack morava says

  36. K-theory is about making monoids (in which you can add or multiply things) into groups (where you can add or divide things): the steps from `natural numbers’ 1,2, etc to `integers’ such as -3, or from 2 to 1/2 in the case of `rational’ numbers, are examples. In linear algebra one can `add’ three-dimensional Euclidean space to one-dimensional time to get four-dimensional space-time (there are wrinkles involved in how you measure distance but…), whereas in the corresponding K-theory you can `subtract’ spaces, to define for example the space of directions perpendicular to the tangent plane of a surface in 3D. This kind of `group completion’ is subtle and tricky and can be used to turn a lot of geometry into something like double-entry bookkeeping… I think I should shut up now.
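The simplest case mentioned above, getting from the natural numbers to the integers, can be sketched in a few lines of Python. This is only a toy model of group completion under stated assumptions: the class name is made up, and nothing here touches vector bundles or spaces. Each "integer" is stored as a pair of natural numbers, a credit column and a debit column, and no subtraction is ever performed on the stored values.

```python
class FormalDifference:
    """A pair (plus, minus) of natural numbers standing for 'plus - minus'."""

    def __init__(self, plus, minus=0):
        self.plus = plus
        self.minus = minus

    def __eq__(self, other):
        # (a, b) and (c, d) denote the same element when a + d == c + b.
        return self.plus + other.minus == other.plus + self.minus

    def __add__(self, other):
        # Add column by column, as in double-entry bookkeeping.
        return FormalDifference(self.plus + other.plus,
                                self.minus + other.minus)

    def __neg__(self):
        # The inverse just swaps the two columns.
        return FormalDifference(self.minus, self.plus)

    def __repr__(self):
        return f"FormalDifference({self.plus}, {self.minus})"


two = FormalDifference(2)
five = FormalDifference(5)
minus_three = two + (-five)   # plays the role of -3
```

Here `minus_three` is stored as the pair (2, 5), which the equality test treats as the same element as (0, 3); that identification of different pairs is where the bookkeeping flavour comes from.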

  37. @jack morava: Needless to say, that’s not how I would try to explain GR to a non-scientist.

  38. jack morava says

    @ David L,

    I guess I went through that rigamarole so as to say, Socrates wants us to define our terms. Mathematicians take such things seriously.

  39. Another one:

    A von Neumann algebra or W*-algebra is a *-algebra of bounded operators on a Hilbert space that is closed in the weak operator topology and contains the identity operator. It is a special type of C*-algebra.

    I knew someone who studied C*-algebras in grad school. I think he had fun.

  40. PlasticPaddy says

    @hat, Y
    The mathematicians want a ‘look and feel’ definition up front, with a current research section and maybe at the end the original problem for which the theory was developed. This is not the best presentation for the historian of maths or for the general reader. Would you prefer for Von Neumann algebra something like “a von Neumann algebra is a special kind of vector space of bounded Hilbert space operators (think of n×n matrices, which is the finite dimensional case). An n-dimensional commutative V.N. algebra is just the diagonal matrices with n elements; if n is one this is the complex numbers. Von Neumann algebras are used as a tool to investigate questions in operator theory with applications in quantum mechanics, as well as in integral equations arising in other areas of theoretical physics or engineering. In pure mathematics, V.N. algebras are important for the classification of C* algebras, of which they are an important subclass.” A mathematician might find this superficial and uninformative.

  41. One unfortunate issue I have encountered many times with pure mathematicians is that once an area of study becomes sufficiently abstracted, a substantial number of people studying and working on the topic may only have a rather tenuous understanding of why the subject matter was considered interesting in the first place. For example, there has been a lot of interesting work on the underlying mathematics of gauge bundles. However, a lot of the development has unfortunately been completely orthogonal to what might be actually relevant to understanding interacting gauge field theories. Part of this is a natural reaction to the really interesting mathematical characterizations just being too hard, so people have tried exploring the implications of the gauge bundle structure in other ways. But this frequently loses sight of the fact that we know that the operational definitions we have are not the rigorously correct ones.

  42. Speaking for myself, I like to get an explanation, fuzzy and crude but as self-contained as possible, which I could stay focused on while learning its details. And I’d like to get an idea of why it’s an interesting and productive field of research. The latter is hard to do but it seems like hardly anyone tries to do so.
    Anyway, my point, which I think y’all would agree with me about, is that you don’t need formulas or LaTeX to make math opaque.

  43. Yes, further evidence of Slavo-Chadic.

    Fenno-Slavo-Chadic, I’d say.

  44. they show two apples and three apples and put them together to get five apples (vel sim.)

    Actually what they do, at least when and where my descendants were learning mathematics, is to have you put the 2 x 1 x 1 cm red block end to end with the 3 x 1 x 1 blue block and see that it is exactly the same size as the adjacent 5 x 1 x 1 yellow block (all three have black stripes every cm). With enough of these math manipulatives, as they are called, you can do anything a four-banger calculator can do (but with fewer digits).

    But I have concluded that all this obfuscation happens in math articles fundamentally because (as I have held for many years), mathematics is an art like painting or sculpture or literature or dance. Here is WP’s sub-article on assemblée, a technical term of ballet:

    Sometimes also pas assemblé. A jump that takes off from one foot and lands on two feet. When initiated with two feet on the ground (e.g. from 5th position) the working leg performs a battement glissé/dégagé, brushing out. The dancer launches into a jump, with the second foot then meeting the first foot before landing. A petit assemblé is when a dancer is standing on one foot with the other extended. The dancer then does a small jump to meet the first foot.

    I contend that this is only easier to understand than the math intros because we all understand foot and jump already.

  45. The dancer then does a small jump to meet the first foot.

    That made me giggle. Why, hello, first foot, it’s been a while, how are things at your end?

  46. January First-of-May says

    But I have concluded that all this obfuscation happens in math articles fundamentally because (as I have held for many years), mathematics is an art like painting or sculpture or literature or dance.

    While coaching my brother (currently in 9th grade) in geometry this year, already at least twice I found myself looking up geometry proofs on Russian Wikipedia and finding statements that are so blatantly and clearly wrong that even my brother noticed. (I wouldn’t be at all surprised if there are others that we hadn’t found yet.)

    The first time, the statement was added in 2016 by someone who probably didn’t quite understand the terminology they were using, and remained in the article for (at least) five years despite occasional added (and then later removed as apparent vandalism) comments to the effect of “this is absolute nonsense”.
    The second time, it was a proof, added in 2019, apparently as a direct translation of the proof used in the English Wikipedia article. Whoever added the proof apparently did not notice that the Russian article used a diagram with slightly different labeling of points, making the proof as given nonsensical. Over the next two years several editors fixed a few small bits of the descriptions to conform with the diagram as given, but that made the proof as a whole, if anything, even more nonsensical.

    I wonder how much of that had to do with the (likely) fact that the average reader of either article probably didn’t quite understand enough to even notice anything was wrong themselves either.

  47. John Cowan says

    The 2019 version, with the immortal combinatorial explosion “What’s a monad? Why, a monad is just a monoid in the category of endofunctors. What’s the problem?”
