ChatGPT vs Human: How to spot AI-generated science writing

ChatGPT vs Human. Credit: Heather Desaire and Romana Jarosova/University of Kansas.

A study published in the journal Cell Reports Physical Science describes a tool that can distinguish human-written from AI-generated academic science writing with over 99% accuracy.

The tool, developed by researchers led by Professor Heather Desaire of the University of Kansas, responds to the growing use of AI chatbots such as ChatGPT to produce text that closely resembles human writing.

Professor Desaire emphasizes that their goal was to create a user-friendly method, accessible even to high school students, enabling them to identify AI-generated writing across different genres.

Because the method does not require a computer science background, people outside the field can apply and adapt it.

One of the major challenges posed by AI writing is its tendency to assemble text from various sources without verifying accuracy—a bit like the game “Two Truths and a Lie.”

To tackle this issue, the research team focused on a specific type of article called perspectives, which provide overviews of research topics.

They selected 64 human-written perspectives and used ChatGPT to generate 128 articles on the same topics. Analysis of the two sets revealed a key characteristic of AI writing: predictability.

Human writers, in contrast, produce more varied paragraph structures: the number of sentences per paragraph, total word counts, and sentence lengths all fluctuate more than they do in ChatGPT's output.

Additionally, certain preferences in punctuation and vocabulary usage serve as telltale signs.

For instance, scientists tend to use words like “however,” “but,” and “although,” while ChatGPT often employs terms like “others” and “researchers.” The researchers identified a total of 20 distinct features that the model could use to detect AI-generated text.
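To give a sense of what such features look like in practice, here is a minimal sketch in Python. It is not the authors' published feature set or model; the specific marker words are taken from the examples quoted above, while the feature names, thresholds, and everything else are illustrative assumptions.

```python
# Illustrative sketch only: the feature choices below are assumptions,
# not the study's actual 20-feature set or classifier.
import re
import statistics

# Marker words the article mentions as leaning human vs. ChatGPT.
HUMAN_MARKERS = {"however", "but", "although"}
AI_MARKERS = {"others", "researchers"}

def paragraph_features(paragraph: str) -> dict:
    """Compute a few stylistic features of the kind described in the article."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]
    words = re.findall(r"[A-Za-z']+", paragraph.lower())
    sentence_lengths = [len(re.findall(r"[A-Za-z']+", s)) for s in sentences]
    return {
        "n_sentences": len(sentences),
        "n_words": len(words),
        "mean_sentence_len": statistics.mean(sentence_lengths) if sentence_lengths else 0,
        # Human paragraphs tend to vary more in sentence length.
        "sentence_len_stdev": statistics.pstdev(sentence_lengths) if sentence_lengths else 0,
        "human_marker_count": sum(words.count(w) for w in HUMAN_MARKERS),
        "ai_marker_count": sum(words.count(w) for w in AI_MARKERS),
        "question_marks": paragraph.count("?"),
        "semicolons": paragraph.count(";"),
    }

if __name__ == "__main__":
    sample = ("However, the mechanism remains unclear. Although several groups "
              "have proposed models, the data are sparse; more work is needed.")
    for name, value in paragraph_features(sample).items():
        print(f"{name}: {value}")
```

In the study, 20 features of this general kind, computed over the 64 human-written and 128 ChatGPT-generated perspectives, were fed to a supervised classifier; the snippet above only illustrates the flavor of those measurements, not the trained model itself.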

The results were remarkable. The model accurately identified AI-generated full perspective articles 100% of the time.

When focusing on individual paragraphs within the articles, the accuracy rate remained high at 92%. Furthermore, the team’s model outperformed existing AI text detectors by a significant margin in similar tests.

Excitingly, the researchers plan to expand the model’s applicability by testing it on larger datasets and different types of academic science writing.

They are eager to explore how well their tool withstands the advancements and increased sophistication of AI chatbots in the future.

While the model developed by Professor Desaire and her team excels at distinguishing between AI and human scientists’ writing, it was not designed to detect AI-generated student essays. However, she emphasizes that their methods can be easily replicated by others for their specific purposes.

This study marks a significant step forward in the ongoing battle to differentiate between AI and human-generated content.

With this newfound tool, scientists and educators can better navigate the world of AI-generated text, understanding its potential limitations and ensuring the integrity of academic writing.