Experts warn about using ChatGPT to dumb down scientific research
By Nadeem Sarwar | Published September 20, 2025
“Explain it to me like a fifth grader.” This prompt, or some variation of it, often appears in social media discussions about AI's ability to explain complex topics in the simplest possible terms. It's also held up as one of the best examples of AI's usefulness in education. But according to experts, you shouldn't rely entirely on AI tools such as ChatGPT to summarize scientific research and papers.
Research papers are notoriously loaded with technical terms and overly complex language, which makes it difficult for the average person to grasp the breakthroughs they describe. That's where science journalists come into the picture, writing condensed articles about those findings in language that is easy to understand.
Those same journalists have now detailed why using ChatGPT to summarize scientific papers is a bad idea. The press office of the journal Science and its Science Press Package team (SciPak) began testing ChatGPT Plus to see whether it could accurately convey research findings in simpler language.
After a year of testing, the team found that summaries generated by ChatGPT “sacrifice accuracy for simplicity” and required “extensive editing for hyperbole.” “ChatGPT Plus had a fondness for using the word groundbreaking,” the team noted. Interestingly, certain words are notoriously overused by AI chatbots, and they're now even shaping how we speak in our daily lives.
As part of the test, the team used the paid version of ChatGPT to write three distinct summaries of two research papers each week. Those summaries were then evaluated by human writers. ChatGPT was not an utter failure, but it fared poorly with the nuances that are critical in scientific research and communication.
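For illustration only: SciPak worked through the ChatGPT Plus interface, but a similar experiment could be scripted against the OpenAI API. The sketch below is a minimal example of that kind of setup, not SciPak's actual pipeline; the model name and prompt wording are assumptions.

# A minimal sketch of the kind of test described above, NOT SciPak's
# actual setup: ask a paid OpenAI model for a plain-language summary
# of a paper's abstract, so human editors can check it for accuracy
# and hyperbole. The model name and prompt wording are assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def summarize_for_lay_readers(abstract: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",  # stand-in for whichever ChatGPT Plus model was used
        messages=[
            {"role": "system",
             "content": "Summarize scientific abstracts for non-experts. "
                        "Avoid jargon and hype words like 'groundbreaking'."},
            {"role": "user", "content": abstract},
        ],
    )
    return response.choices[0].message.content

# Human writers would then review each generated summary by hand.
print(summarize_for_lay_readers("Paste a paper abstract here..."))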
“It cannot synthesize, or translate, findings for non-expert audiences,” says the white paper describing the test, which adds that the chatbot is prone to overhyping results, can't fully explain a study's limitations, and doesn't fare well when asked to discuss two pieces of research in the same context.
One human writer remarked that ChatGPT summaries would break a lot of trust. “It ultimately defaulted to jargon if challenged with research particularly dense in information, detail, and complexity,” writes Abigail Eisenstadt, a writer for the Science Press Package and a member of the AAAS (American Association for the Advancement of Science).
(Disclosure: Nadeem Sarwar is a member of the AAAS but does not contribute to Science.)