Vegetative Electron Microscopy.

Retraction Watch reports on a spectacular find:

The phrase was so strange it would have stood out even to a non-scientist. Yet “vegetative electron microscopy” had already made it past reviewers and editors at several journals when a Russian chemist and scientific sleuth noticed the odd wording in a now-retracted paper in Springer Nature’s Environmental Science and Pollution Research.

The ludicrous phrase is what sleuths call a “fingerprint”: an offbeat characteristic found in one or more publications that suggests paper-mill involvement. Today, a Google Scholar search turns up nearly two dozen articles that refer to “vegetative electron microscopy” or “vegetative electron microscope,” including a paper from 2024 whose senior author is an editor at Elsevier, Retraction Watch has learned. The publisher told us it was “content” with the wording.

Searching for such clues is just one way to identify the hundreds of thousands of fake papers analysts say are polluting the scientific literature, as we reported in an investigation published last month in The Conversation. And the tale of “vegetative electron microscopy” shows how nonsense phrases can enter the vocabulary of researchers and proliferate in the literature.

After spotting the term, the Russian chemist, who goes by the pseudonym “Paralabrax clathratus” on PubPeer, left a comment about it in November 2022 on the online forum. He also mentioned the finding to fellow fraud buster Alexander Magazinov, a software engineer in Kazakhstan, who ran a Google Scholar search on the term and got several hits, some of which he flagged on PubPeer. Most of the articles included authors from Iran.

In one of his PubPeer comments, Magazinov speculated the phrase could have originated through faulty digital processing of a two-column article from 1959 in which the word “vegetative” appeared in the left column directly opposite “electron microscopy” in the right (the paper shows up in a Google search for “vegetative electron microscopy”). Perhaps an AI model had picked it up and spit it back into machine-generated text that was since plagiarized in other papers by the same Iranian network of fraudsters, Magazinov elaborated in an interview.
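
The column-merging mechanism Magazinov describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two columns of text are invented for the example and are not the wording of the 1959 article, and the helper names `read_across` and `read_by_column` are hypothetical.

```python
# Invented two-column page (NOT the text of the 1959 article).
# Each list holds the printed lines of one column, top to bottom.
LEFT_COLUMN = [
    "walls were studied in the",
    "spores and in vegetative",
    "cells of the organism.",
]
RIGHT_COLUMN = [
    "Sections were prepared for",
    "electron microscopy of the",
    "fixed material.",
]

PHRASE = "vegetative electron microscopy"


def read_across(left, right):
    """Faulty extraction: read each physical line straight across the gutter."""
    return " ".join(f"{l} {r}" for l, r in zip(left, right))


def read_by_column(left, right):
    """Correct extraction: finish the left column before starting the right."""
    return " ".join(left + right)


faulty_text = read_across(LEFT_COLUMN, RIGHT_COLUMN)
correct_text = read_by_column(LEFT_COLUMN, RIGHT_COLUMN)

# The nonsense phrase exists only in the faulty reading.
print(PHRASE in faulty_text)
print(PHRASE in correct_text)
```

Reading straight across the gutter joins the last word of a left-column line to the first words of the right-column line beside it, which is all it takes to manufacture a phrase that neither column contains.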

You can see the image of the 1959 article at the link, where of course there are further details (and links in the quoted text above — I only carried one over). The best part:

Meanwhile, Elsevier is defending the use of the odd wording, which appeared in a paper from February 2024 in the journal Industrial Crops and Products. The work, “Agricultural wastes: A practical and potential source for the isolation and preparation of cellulose and application in agriculture and different industries,” purportedly used “vegetative electron microscopy” to study the structure of bacterial cellulose derived from date syrup.

As Magazinov said, “So, we are learning that bacterial cellulose is a kind of ‘vegetative structure’. They are taking a piss without even pulling their pants down, aren’t they?”

Comments

  1. David Eddyshaw says

    Reputational damage is not a concern for Elsevier unless it is so severe that it might possibly affect profits. They are a tapeworm in the gut of academia.

  2. I can see the paper in question here. The sentence is “Date syrup (as one of the agricultural wastes) was used to produce bacterial cellulose using Gluconastobacter xylinus. Fourier transform infrared spectroscopy (FTIR), vegetative electron microscopy, and X-ray diffraction were used to determine the structure of bacterial cellulose, cellulose fibers, and crystallinity of the samples (Moosavi and Yousefi, 2011).” So the authors aren’t claiming to have used any kind of electron microscopy. They or their APE are reviewing a lot of research on cellulose derived from agricultural waste.

    The abstract of the Moosavi and Yousefi paper says they used “low quality date syrup, a rich and available source of nutrient in Iran”.

    “…the structure of bacterial cellulose, cellulose fibers, and…” looks odd too.

  3. Yes, Sabine Hossenfelder has been banging on about this for several years [junk phrases around 6:00]; relatedly, last week: science funding is solely to keep researchers off the streets, allegedly. She’s particularly scathing about Elsevier and Wiley.

    Another of her hobby-horses is the pointlessness of the increasingly energy-demanding research programme at CERN/LHC. It does strike me that if humanity is succeeding in melting the icecaps, and Europe is continuing to spend more on Russian oil than on arms for Ukraine, shutting down CERN would be an easy cost-saving.

    ” … They are taking a piss without even pulling their pants down, …”

    The English idiom is ‘taking the piss’, which has become rather shorn of its literal sense. But ‘taking the piss without pulling their pants down’ doesn’t work either (for me).

  4. J.W. Brewer says

    It’s a delightful phrase. The widely-alleged wickedness of the publisher aside, as more and more academic-paper “content” is generated in ESL by academics (whether fraudulent or otherwise) who lack native-speaker-level fluency in English but have professional motivations to publish in English, and as more and more of their drafts are peer-reviewed (if at all …) and edited (if at all …) by others who likewise lack native-speaker-level fluency, one should predictably expect a decline in instinctual “okay that sounds totally bonkers, can it really be accepted jargon in the relevant field?” reactions to peculiar phrasing.

    This is a systematic problem without an obvious solution other than, you know, stricter editorial gatekeeping or a wider range of languages for scholarly publishing. One partial solution might be for there to be funding via which e.g. hypothetical non-fraudulent Iranian academics with understandably limited fluency in English (but an understandable desire not to publish solely in Farsi) can obtain access to editorial assistance from people with better English skills even if without specialized research skills. When my now-wife was a penniless grad student in English literature, she made a little money on the side from a U.S. university with a fair number of foreign-origin LEP faculty (not the one where she was studying) to do copy-editing on their ESLish draft scholarly papers to try to make them less ESLish before the papers were submitted to journals for consideration. She didn’t know anything substantive about the technical fields in which the articles were written by those LEP faculty but it was still felt that she could add value. But it may have been relevant that since this was a U.S. university even their most ESLish faculty were generally trying to get published in US-based journals not completely run by people with English-fluency deficits of their own.

  5. But ‘taking the piss without pulling their pants down’ doesn’t work either (for me).

    I, on the other hand, love it either way. Vigorous self-expression trumps idiomatic naturalness every time!

  6. Some disciplines and their journals don’t care too much anyway, as long as the contents are good. I’m thinking of the IEEE and its journals, which publish a lot of research from E very much SL speakers.

  7. J.W. Brewer says

    @Y: disciplines presumably vary in how important the prose is: maybe some of the IEEE stuff you basically can get the gist of by reading the equations and the tables/graphs and the prose in between is not very relevant to understanding the point of the article. (My maternal grandfather, who had gotten his academic training as an electrical engineer back in the 1920’s when transistors hadn’t been invented and even vacuum tubes were a weird experimental thing not yet in common use, regularly read IEEE publications as a very elderly man in the 1980’s and 90’s, claiming he wanted to stay current, although I don’t know that anyone ever quizzed him to assess how well he understood the cutting-edge stuff.)

  8. David Eddyshaw says

    I don’t think “vegetative electron microscopy” can really be explained as an ESLism.

    These things arise from mechanical substitution of (supposed) synonyms to evade automated plagiarism checkers. It’s a kind of autoreplace error, really.

  9. Q. Pheevr mentions “vegetative electron microscopy”, apropos of having similarly found “shake my booty” in an 1863 English edition of Don Quixote.

  10. David Eddyshaw says

    Interesting to speculate on how the original plagiarised document must have read. “Low-energy electron microscopy”, perhaps?

  11. I don’t think “vegetative electron microscopy” can really be explained as an ESLism.

    Exactly. This isn’t about ESLisms, this is about complacently accepting meaningless bullshit as sensible and normal. (I leave the extrapolation beyond scholarly publications as an exercise for the reader.)

  12. J.W. Brewer says

    I may need a more nuanced vocabulary. I do not suggest that VEM is an ESLism in terms of being a string an LEP writer is particularly likely to inadvertently produce. It’s rather something an LEP reader is (so my hypothesis runs) less likely than a native-speaker-fluency reader to immediately notice as jarringly “off.” This may depend in part on a related hypothesis (which I think is plausible but am not necessarily in a position to prove with evidence) that even baffling-sounding professional jargon used by some niche subsets of L1 Anglophones tends to follow certain English-based patterns of cromulence such that an Anglophone of native-speaker fluency who doesn’t know the jargon very well may still have the subconscious capacity to register peculiar deviations from the expected patterns. Of course, in the example at hand, it’s probably not so much an abstract patterning thing as knowing that the semantics of “vegetative” and the semantics of “electron microscopy” are unlikely to combine harmoniously.

    And here it sounds from one of the comments upthread that the problem in the article at hand may have been in uncritically taking the nonsense-phrase over from a prior published article without stopping to notice that it made no sense. Which again is a problem (if one experiments with the charitable assumption of lack of conscious fraudulent intent) with the reading fluency of these latest Iranian authors rather than their writing fluency.

  13. David Eddyshaw says

    I don’t think unfamiliarity with the nuances of English can really explain (let alone justify) citing a plagiarised paper. Moreover, any actual knowledge of electron microscopy is incompatible with actually spontaneously creating the expression “vegetative electron microscopy” no matter how poor one’s English. How could that even happen?

    The point about this phrase is that it demonstrates plagiarism, not bad English. And plagiarism combined with a systematic and conscious attempt to conceal the plagiarism.

    In fact, the phrase is perfectly good English …

  14. David Eddyshaw says

    On the question of citing plagiarised papers:

    Judging by the spam I get from academia.edu, papers I have published (not on academia.edu, where I have never uploaded anything medical) have been cited in a vast range of fields of which I know nothing whatever and to which I have never contributed anything of the least value.* Evidently the quaint convention that one should at least have glanced at the summary of a paper one cites as a reference is not being strictly adhered to …

    I think that JF’s suggestion that “AI” may have been significantly involved in the offending paper, particularly in the literature review, is a plausible one. If so, this would not be colluding in plagiarism, but merely a manifestation of the junk status of the paper itself.

    * However, one does wonder about the sophistication of their algorithm. I also get regular enquiries about whether I am the “David Eddyshaw” mentioned in the publications of a certain “David Eddyshaw” … I suppose I really should reply to these, confirming that I am not …

  15. J.W. Brewer says

    The phrase is “perfectly good English” in a specifically Chomskyan way (“syntactically well-formed” but semantically incoherent a la colorless green ideas) that I should have thought David E. eschewed. There are a bunch of overlapping phenomena here, with scholarly slovenliness-or-worse covering one cluster and poor language skills covering another. I don’t want to deny the relevance of the former but I don’t think the latter is irrelevant. It is ceteris paribus just *harder* to have a well-functioning-in-context bullshit detector when the bullshit you’re theoretically trying to detect is popping up intermittently in texts in a language you don’t fully understand to a native-speaker level of fluency. If we uncharitably-yet-not-implausibly assume the actual authors of the piece to be consciously malevolent actors not merely hapless foreigners with malfunctioning bullshit detectors, that just shifts the focus back to the limited-English-fluency peer reviewers or journal editors whose bullshit detectors apparently did not successfully beep in response to the nonsensical phrase.

    Again, this is not necessarily any specific individual’s *fault* beyond consciously malevolent actors. It’s just a systemic consequence of making English the dominant language of international scholarly discourse without the level of English fluency of all of the relevant international scholarly folks who are heavily incentivized to participate in that discourse being (quite understandably!) quite as high-level as would be optimal. As the proverb has it:
    don’t hate the player, hate the game.

  16. David Eddyshaw says

    I see what you mean, and unenthusiastically concede your point with ill grace. Some muttering darkly into my beard may ensue.

  17. Did people look at the Retraction Watch post? The source of “vegetative electron microscopy” is right there. It did not arise from an attempt to paraphrase or anything linguistic. It just came about from an AI scraping a two-column document incorrectly.

  18. Exactly.

  19. Likewise “shake my booty”. And not necessarily AI. VEM came from a book from the 1950s, which was probably digitized a while back, using old-style OCR.

  20. regular enquiries about whether I am the “David Eddyshaw” mentioned in the publications of …

    Curiously, although I get plenty of enquiries asking whether I am the < elided > mentioned in the publications of Uncle Tom Cobbley and all, never am I mentioned by this “David Eddyshaw” geezer, prolific though he apparently is. I’m impressed by the wide diversity of fields < elided > is contributing to, not remotely related to Linguisticianism. For all I know, < elided > might have single-handedly invented the field of “vegetative electron microscopy”.

  21. David Eddyshaw says

    Clearly we underestimate the seminal character of our work.

  22. DE: vast range of fields of which I know nothing whatever and to which I have never contributed anything of the least value

    There are no such fields. Don’t deny people the light of your knowledge.

    Using bad English as the international language of science is nothing new. If Cicero travelled to the Middle Ages he would be surprised at the Latin they used. I understand that people with EFL are not all long dead like in the case of classical Latin and have a right to complain. Do complain!

    My modest exposure to scientific literature suggests that “prose” is used as much to obscure and confuse as to illuminate. In terms of understanding “what the author wants to say” badness of English is neither here nor there.

    They are taking a piss without even pulling their pants down, aren’t they?
    I don’t know what an EFL person makes out of it, but for Russian it is a description of a terminal state of drunkenness, which really doesn’t fit at all. Mixed idioms? A nice touch in describing the origin of “vegetative … electron microscopy”.

  23. I think that JF’s suggestion that “AI” may have been significantly involved in the offending paper, particularly in the literature review, is a plausible one.

    The paper that I read was entirely a literature review, as far as I could tell by skimming.

    My phrase “the paper in question” in my previous comment was misleading. Two papers with “vegetative electron microscopy” are mentioned in the Hat’s post: one, now retracted, in Springer Nature’s Environmental Science and Pollution Research (I plagiarized those last eight words from the Retraction Watch excerpt), and one, still defended, in Elsevier’s Industrial Crops and Products. The second one is the one I read and linked to.

  24. Further down in the Retraction Watch thread is a suggestion that the problem comes from a Persian-to-English translation error, as “scanning electron microscopy” and “vegetative electron microscopy” differ in only one character.

  25. J.W. Brewer says

    David W.’s point is interesting (although it would be a remarkable coincidence considering the other plausible causal pathway already referenced by Brett) but of course that’s exactly the sort of easy-to-commit translation mishap that could have and should have been caught before publication if the translation had been reviewed by someone with an appropriate level of fluency in English rather than someone who couldn’t accurately assess whether the English words being spit out by translation software actually made any sense in context.

  26. David Marjanović says

    Evidently the quaint convention that one should at least have glanced at the summary of a paper one cites as a reference is not being strictly adhered to …

    Oh no. Oh fuck no. As a reviewer I routinely find papers cited for things they don’t say, sometimes even the opposite of what they say, and in most cases bad writing in the cited papers cannot be blamed.

    Perhaps relatedly, I routinely get manuscripts in very distinctive ESL from sets of authors who include at least one person who is known to write much better English than that, often a native speaker. Evidently the quaint convention that one should at least have glanced at the final draft of a manuscript one’s name is on before the first/corresponding author submits the manuscript is not being strictly adhered to either.

    These two issues occur in combination: recently I got a manuscript with ten authors that showed unfamiliarity with some of the third and fourth (or so) authors’ work.

  27. David Eddyshaw says

    Eh, I often forget my own past work. It is better so … every day a new beginning, I say!

  28. Don’t they say that 90% of everything is shit? The reason LLMs are so attractive is that they don’t compete with Shakespeare or Einstein but with the shoddy work mass-produced by humans who have to churn out texts for their jobs and don’t know or care about quality.

  29. David Eddyshaw says

    APEs save us from the labour of producing our own crap manually. A good point.

    Perhaps, as the UK government is even now teetering on the brink of making gigascale automated plagiarism Perfectly Legal and Respectable, I shall retire the hurtful primatist acronym APE in favour of the snappier “autocrapper.”

  30. In this specific case, I think the mistranslation explanation is more plausible than the LLM one. For the general case, I would suggest that maybe it’s a bad idea to create, or even to allow, jobs that consist of churning out shoddy texts whose content even their own writers don’t care about. And when the texts in question are presenting themselves as research publications, Gresham’s Law seems more relevant than Sturgeon’s…

  31. I believe the original descriptor in Sturgeon’s Law was “crud,” which works better than a comparison to poop. The ninety percent is bad, but very little of it is aggressively noisome.

  32. Wikipedia acknowledges that the “law” exists in variant texts, sometimes “crud” and sometimes “crap” and thinks “crap” is the canonical form while acknowledging that the earliest printed references have “crud.” This would make sense if you thought Sturgeon coined it as “crap” but bowdlerized that to “crud” given the standards of propriety prevailing in the 1950’s. “Crap” even as a once-tabooed word is less strong than “shit” and I think more easily used in a metaphorical sense without evoking literal feces.

  33. In this specific case, I think the mistranslation explanation is more plausible than the LLM one.

    Why not the two reinforcing each other? For instance, maybe the one-character error would have been caught by software or wetware if there hadn’t been an OCR error suggesting the phrase was valid.

  34. I don’t think that’s how OCR works: it can be constrained by a dictionary of the language, and by near-neighbor probabilities, but running together “vegetative”-“electron microscopy” was a one-off, not a common collocation.

    The mistranslation theory is supported, I think, by the presence of the acronym SEM in many of the papers that Google Scholar finds with “vegetative electron microscope/microscopy”. In fact some papers have phrases like “the vegetative electron microscope (SEM) (PHILIPS model, XL-30)”. Maybe the authors know the English acronym without knowing the English words that it stands for.

    Both “vegetative” and “electron microscopy” are so common that it’s not surprising that they happened to line up across columns at least once (or enough times that OCR flubbed the columns at least once). I think this could be a coincidence. Of course that doesn’t excuse Elsevier’s stupid defense of the phrase. In the “Agricultural wastes” paper (the one that’s a literature review), it’s used where “scanning electron microscopy” would be correct, summarizing a paper that does actually use it.

  35. Here’s the kind of thing I had in mind. Someone makes a one-character error in Farsi when writing about SEM. A carbon-based sentient being is translating it and comes up with “vegetative electron microscopy”. Can that be right? They find the phrase at GB, and being overworked and underpaid, they don’t look at the text to see that it was a two-column error. So they put “vegetative electron microscopy” into their translation.

    Or an LLM does the same thing because it has that two-column error, or an error made by a human translator, in its large language.

    I’m not trying to prove that the two-column error was responsible, but I think it could easily be involved.

  36. In any case, the salient point is, as ktschwarz said, Elsevier’s defense of the phrase. However the error arose, that’s grounds for a verdict of malicious stupidity.

  37. I can’t believe a human, however bad their English, would ever produce “vegetative electron microscopy”, because a human would notice that it doesn’t match the acronym, which is probably already in the paper and in any case familiar in any language—the Persian Wikipedia article uses the English acronym, as do Wikipedias in Chinese, Vietnamese, Turkish, etc. But not noticing that kind of discrepancy is common for machine translators.

    It also seems unlikely that an LLM would do it, because they’re biased toward regurgitating common phrases, not one-offs.

    But once it’s there, I can see how an inattentive human editor could overlook it. If you know about SEM and your eyes run over “blah-blah electron microscopy (SEM)” while skimming, you’re likely to fill in the blah-blah unconsciously, or just ignore it.

  38. I can imagine a human coming up with “vegetative”, especially if the English abbreviation isn’t in the original. You may not have my deeply personal experience of errors in translation and other things, when something is staring me in the face.

  39. This makes the infamous “stricken mass distribution” positively benign by comparison.
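
The mismatch several commenters point to, a spelled-out phrase that does not agree with the acronym beside it, as in “the vegetative electron microscope (SEM)”, is the kind of thing a crude automated check could flag. The sketch below is a hypothetical heuristic, not a tool anyone in the thread describes using; the function name `acronym_mismatches` and the rule "one word per acronym letter" are assumptions made for illustration.

```python
import re

# A parenthesised run of 2 to 6 capital letters, e.g. "(SEM)".
ACRONYM = re.compile(r"\(([A-Z]{2,6})\)")


def acronym_mismatches(text):
    """Return (phrase, acronym) pairs where the initials of the words
    immediately before a parenthesised acronym do not spell that acronym.

    Assumption: one word per acronym letter. Real acronyms often break
    this rule, so treat every hit as a prompt to look, not as proof.
    """
    hits = []
    for match in ACRONYM.finditer(text):
        acronym = match.group(1)
        # Words that precede the opening parenthesis.
        words = re.findall(r"[A-Za-z]+", text[:match.start()])
        candidate = words[-len(acronym):]
        if len(candidate) < len(acronym):
            continue  # not enough preceding words to compare
        initials = "".join(word[0].upper() for word in candidate)
        if initials != acronym:
            hits.append((" ".join(candidate), acronym))
    return hits


sample = "the vegetative electron microscope (SEM) (PHILIPS model, XL-30)"
print(acronym_mismatches(sample))
print(acronym_mismatches("imaged by scanning electron microscopy (SEM)"))
```

The heuristic is deliberately naive: it would also flag legitimate pairs such as “Fourier transform infrared spectroscopy (FTIR)”, where the acronym is not built from one initial per word, so a human still has to read whatever it turns up.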
