

How AI Resurrected a Fake Science Term

Tech | Published On: April 21, 2025
Author: Shivam Tripathi

A 1959 scanning glitch birthed the fake term “vegetative electron microscopy.” Now AI keeps spreading it across science. Here’s why it’s nearly impossible to kill.

The Phantom Phrase That Fooled Science: AI’s Role in Spreading a Nonsense Term

In an era where AI promises to revolutionize research, it's also unexpectedly spreading junk science. Case in point? The bizarre term “vegetative electron microscopy”, a phrase that sounds legitimate but means absolutely nothing.

Yet, this nonsense has made its way into academic papers, research databases, and even peer-reviewed journals. So how did something entirely made up become embedded in the world of science? Let’s dig into the strange story of how a scanning error morphed into a digital fossil and how AI helped give it eternal life.

What Is “Vegetative Electron Microscopy”?

Let’s be clear: there is no such thing.

“Vegetative electron microscopy” is a fabricated phrase. It’s a combination of two unrelated terms that somehow got stitched together during the digitization of a 1950s-era scientific paper. It has no scientific foundation, yet you’ll find it in dozens of research papers and even indexed in databases like Google Scholar.

It sounds like something you’d expect in biology or nanoscience, but the term is pure gibberish, created not by a scientist, but by a software glitch.

Where Did It Come From?

The mistake traces back to a 1959 article published in Bacteriological Reviews. During the digitization process, scanning software misread the original journal’s two-column layout. It mistakenly stitched the word “vegetative” from one column onto “electron microscopy” from the other, fusing two unrelated passages on the same page.
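
To make the failure mode concrete, here’s a tiny Python sketch of how reading straight across a two-column page, row by row, fuses unrelated fragments. The column contents are invented for illustration; this is not the actual 1959 page, just the kind of error it suffered.

    # A toy illustration of row-wise scanning of a two-column layout.
    # The column contents below are invented, not the original text.
    left_column = [
        "spores germinate into the",
        "vegetative",
        "state within a few hours",
    ]
    right_column = [
        "cell walls were examined by",
        "electron microscopy",
        "after fixation and staining",
    ]

    # Correct reading order: all of the left column, then the right.
    correct = " ".join(left_column + right_column)

    # A naive scanner reads straight across each row instead,
    # stitching unrelated fragments together.
    naive = " ".join(
        f"{left} {right}" for left, right in zip(left_column, right_column)
    )

    print("vegetative electron microscopy" in correct)  # False
    print("vegetative electron microscopy" in naive)    # True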

When digitized text is used as training data for AI models, errors like these can sneak in. These digital glitches, often called “tortured phrases”, become part of what AI learns as fact.

This particular tortured phrase is now known as a digital fossil, a type of persistent misinformation baked into machine learning models through flawed training data, as The Conversation recently reported.

How AI Helped Fossilize the Error

Once in the wild, this phantom phrase was picked up by AI language models trained on massive internet datasets like Common Crawl, a public collection of web pages used by companies like OpenAI and Anthropic.

According to researchers, newer models such as GPT-4o, Claude 3.5, and even some mid-level academic summarization tools tend to autocomplete scientific phrases with “vegetative electron microscopy” when prompted with older biology text.

Ironically, older models like GPT-2 or BERT did not make this mistake. This suggests that somewhere between 2018 and 2023, the corrupted training data became widespread enough to infect newer AI generations.
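
You can probe this behavior yourself by checking what probability a model assigns to the suspect continuation. Below is a minimal sketch using the Hugging Face transformers library; the model name and prompt are placeholders, and this is not the researchers’ actual evaluation setup (GPT-4o and Claude 3.5 are only reachable through APIs, not as local weights).

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Placeholder model: any locally runnable causal LM works for this probe.
    model_name = "gpt2"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    # Prompt ends just before the suspect word.
    prompt = "The bacterial cell walls were examined using vegetative electron"
    inputs = tokenizer(prompt, return_tensors="pt")

    with torch.no_grad():
        next_token_logits = model(**inputs).logits[0, -1]
    probs = torch.softmax(next_token_logits, dim=-1)

    # Probability of the first sub-token of " microscopy" following the prompt.
    target_id = tokenizer.encode(" microscopy")[0]
    print(f"P(' microscopy'[0] | prompt) = {probs[target_id]:.4f}")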

So why didn’t anyone catch it?

Because it sounds legit.

To non-experts and even some peer reviewers, “vegetative electron microscopy” sounds like cutting-edge bioscience. That false familiarity is part of how it snuck into real scientific publications, especially from non-English-speaking countries, where translation quirks made it more likely to go unnoticed.

The Human Error That Opened the Door

Interestingly, part of the glitch's comeback can be blamed on a small linguistic nuance. In Persian script, the words for “vegetative” and “scanning” differ by just a single dot. That likely caused an incorrect Farsi-to-English translation, reintroducing the term in Iranian research papers.

From there, it spread like wildfire through AI tools that scrape and summarize academic content.

As Retraction Watch pointed out, once a term appears multiple times in published works, AI treats it as credible, even if it’s complete nonsense. Worse still, the publisher Elsevier initially tried to defend the term before issuing a correction, further muddying the waters.

Why This Matters for the Future of Science

This bizarre error is more than just a cautionary tale: it’s a warning sign for the future of scientific credibility.

AI’s ability to amplify digital fossils like this has serious implications:

  • Misleading Researchers: Scientists using AI tools to summarize studies could unknowingly cite incorrect terms.
  • Polluting Academic Databases: Junk phrases that enter systems like PubMed or Scopus may remain undetected for years.
  • Weakening Peer Review: Journals relying on automated tools for review might fail to flag such nonsense (a simple phrase-screening sketch follows this list).
  • Fueling Misinformation: General public AI tools (like chatbots or academic assistants) could misinform curious readers or students.
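
One mitigation is the kind of blocklist screening that projects such as the Problematic Paper Screener already apply: scan manuscript text against a list of known tortured phrases before it enters a database. The sketch below is illustrative; the phrase list is a small sample and the abstract is invented.

    # Illustrative blocklist of documented tortured phrases; a real
    # screener would use a much larger, curated list.
    TORTURED_PHRASES = [
        "vegetative electron microscopy",
        "counterfeit consciousness",  # a mangling of "artificial intelligence"
        "bosom peril",                # a mangling of "breast cancer"
    ]

    def flag_tortured_phrases(text: str) -> list[str]:
        """Return every blocklisted phrase found in the text, ignoring case."""
        lowered = text.lower()
        return [phrase for phrase in TORTURED_PHRASES if phrase in lowered]

    abstract = (
        "Specimens were characterized by vegetative electron microscopy "
        "and energy-dispersive X-ray analysis."
    )
    print(flag_tortured_phrases(abstract))
    # ['vegetative electron microscopy']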

Can We Fix It?

Here’s the hard truth: not easily.

Cleaning up Common Crawl, one of the largest training datasets, is next to impossible for outsiders. It contains petabytes of scraped web data. Even large tech firms struggle to audit and sanitize it effectively.

Moreover, most AI companies, including OpenAI and Anthropic, do not publicly share which datasets they use or how they clean them. This lack of transparency makes it difficult for researchers to trace or fix mistakes once they’ve been embedded.
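
In principle, anyone assembling a training set could at least scrub the known offenders before training. The sketch below is a toy version under loud assumptions (one plain-text document per line, a hand-maintained blocklist); production pipelines layer on deduplication, quality classifiers, and format-specific readers for Common Crawl’s WARC/WET files.

    # Known phantom phrases to scrub; a real blocklist would be curated.
    BLOCKLIST = {"vegetative electron microscopy"}

    def keep_document(text: str) -> bool:
        """Keep a document only if it contains no blocklisted phrase."""
        lowered = text.lower()
        return not any(phrase in lowered for phrase in BLOCKLIST)

    def filter_corpus(in_path: str, out_path: str) -> None:
        # Assumed format: one document per line of UTF-8 text.
        with open(in_path, encoding="utf-8") as src, \
             open(out_path, "w", encoding="utf-8") as dst:
            for line in src:
                if keep_document(line):
                    dst.write(line)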

Journals too are under pressure. The rush to publish, fueled by quantity-based metrics, means many publishers are willing to overlook suspicious phrasing unless someone raises the alarm. In one infamous 2024 case, Frontiers had to retract a paper with nonsensical AI-generated images of rat anatomy and made-up chemical pathways.

A Growing “Junk Science” Crisis

This isn’t an isolated case. The Harvard Kennedy School’s Misinformation Review recently flagged a rise in AI-generated “junk science” floating around platforms like Google Scholar.

These are articles or phrases that appear scientific but are factually void, often generated or amplified by AI tools that can’t distinguish between valid content and noise.

Without stricter editorial controls and improved transparency in AI training data, the risk of misinformation becoming mainstream is rising rapidly.

Final Thoughts

The saga of “vegetative electron microscopy” shows how easily errors can creep into our collective scientific consciousness and how AI, while revolutionary, can sometimes act as a misinformation super-spreader. A small scanning glitch from a 1959 biology paper now lives on in the world’s most advanced language models.

For researchers, this story underscores the need for critical thinking, editorial vigilance, and transparency in AI development.

The future of science should be built on solid facts, not tortured phrases and digital fossils.
