AI text detectors: Helpful tools or unreliable judges?


AI to Detect AI?

Following ChatGPT’s successful debut, a wave of developers and companies has released AI detectors.

These tools claim to identify AI-generated content, making them potentially valuable to educators, journalists, and others combating cheating, plagiarism, and misinformation.

However, a study conducted by Stanford University scholars revealed a considerable flaw in these detectors: they’re not very reliable.

The flaw becomes more pronounced when the author of the text is a non-native English speaker.

Do AI Detectors Work?

The study’s results were less than promising. When assessing essays written by U.S.-born eighth-graders, the seven AI detectors tested appeared nearly flawless.

However, they falsely classified over half of the TOEFL (Test of English as a Foreign Language) essays written by non-native English speakers as AI-generated: a staggering 61.22%.

Furthermore, 18 of the 91 TOEFL student essays (nearly 20%) were unanimously identified as AI-generated by all seven detectors. An astonishing 89 of the 91 essays (roughly 98%) were flagged by at least one detector.

Stanford University’s Professor James Zou, a senior author of the study, explains the bias: “AI detectors generally score based on ‘perplexity,’ a metric associated with the complexity of the writing. Non-native English speakers naturally have less sophisticated writing styles compared to their U.S.-born counterparts.”
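Perplexity measures how predictable a passage is to a language model: text the model finds easy to guess scores low, and perplexity-based detectors treat low scores as a sign of machine generation. As a minimal sketch, assuming the Hugging Face transformers library and the small GPT-2 model (neither is named in the study’s reporting), such a score can be computed like this:

```python
# Minimal perplexity sketch; illustrative only, not the study's detectors.
# Assumes: pip install torch transformers
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Exponential of the model's average per-token cross-entropy."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # Passing labels=input_ids makes the model return the average
        # next-token cross-entropy loss over the sequence.
        loss = model(enc.input_ids, labels=enc.input_ids).loss
    return torch.exp(loss).item()

# Simple, predictable phrasing scores lower (looks more "machine-like"
# to a perplexity-based detector) than ornate phrasing.
print(perplexity("The cat sat on the mat."))
print(perplexity("The feline reposed languidly upon a threadbare kilim."))
```

By this measure, plainly worded but entirely human writing scores low, which is how essays by non-native speakers end up flagged.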

The Ethics of AI Detectors

Zou’s team expressed ethical concerns over the potential misuse of these detectors. They fear that foreign-born students and workers might be unjustly accused of cheating, leading to harsh penalties.

There’s also the issue of “prompt engineering”, a practice where generative AI is instructed to “rewrite” essays, incorporating more complex language to cheat the detectors.

A student could, for instance, give ChatGPT the prompt: “Elevate the provided text by employing literary language.”
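As a sketch of how simple this bypass is (assuming the OpenAI Python client; the model name below is a placeholder, and this is not the study’s setup), the request might look like:

```python
# Illustrative sketch of the "rewrite" bypass described above.
# Assumes the OpenAI Python client (pip install openai) with an API key
# in the OPENAI_API_KEY environment variable; the model name is a placeholder.
from openai import OpenAI

client = OpenAI()

def elevate(text: str) -> str:
    """Ask the model to rewrite text in more elaborate language."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{
            "role": "user",
            "content": f"Elevate the provided text by employing literary "
                       f"language:\n\n{text}",
        }],
    )
    return response.choices[0].message.content

# The rewrite swaps in rarer words and longer sentences, raising the
# text's perplexity and making a perplexity-based detector less likely
# to flag it.
print(elevate("My summer was fun. I went to the beach with my family."))
```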

Zou warns, “The current detectors are easily fooled and unreliable. We should be very careful about using them as a solution to the AI cheating problem.”

What’s the Solution?

In the face of these findings, Zou suggests a few solutions. First, educators should refrain from relying on AI detectors, particularly when assessing work from non-native English speakers.

Second, developers need to rethink their primary metrics. Rather than relying on “perplexity”, they should consider more sophisticated techniques or even watermarks, subtle clues embedded by the AI in the text it generates (a toy sketch of the idea appears below).

Finally, detection models should be built to resist prompt engineering and other bypass techniques.
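The watermarking idea can be made concrete with a toy sketch. Assuming a hypothetical “green list” scheme (greatly simplified, and not the method of any particular model), the generator would favor a pseudorandom subset of words, and a detector would simply check whether that subset appears more often than chance:

```python
# Toy "green list" watermark detector; hypothetical and greatly simplified,
# not any specific vendor's scheme. The idea: during generation, a hash of
# the previous word selects a pseudorandom "green" half of the vocabulary
# and the model is nudged toward green words. Detection then just counts
# how often consecutive word pairs land in the green set.
import hashlib

def is_green(prev_word: str, word: str) -> bool:
    # Deterministically assign about half of all words to the green list,
    # keyed on the preceding word.
    digest = hashlib.sha256(f"{prev_word}|{word}".encode()).digest()
    return digest[0] % 2 == 0

def green_fraction(text: str) -> float:
    words = text.lower().split()
    if len(words) < 2:
        return 0.0
    hits = sum(is_green(words[i - 1], words[i]) for i in range(1, len(words)))
    return hits / (len(words) - 1)

# Unwatermarked text hovers near 0.5; watermarked generations would score
# well above it, which is statistically hard to produce by accident.
print(green_fraction("the quick brown fox jumps over the lazy dog"))
```

Unlike perplexity scoring, such a check does not depend on how sophisticated the writing is, which is part of what makes it attractive here.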

Zou concludes, “At this point, the detectors are too unreliable, and the potential consequences for students are too severe to blindly trust these technologies. Rigorous evaluation and significant refinement are needed.”

These findings are available for review on the arXiv preprint server.

Copyright © 2023 Knowridge Science Report. All rights reserved.