AI voices are getting too real — what this means for us


For years, people could easily tell the difference between a computer-generated voice and a real human one.

Think of Siri or Alexa — useful, yes, but not exactly natural. That may no longer be the case.

New research from Queen Mary University of London shows that artificial intelligence (AI) has reached the point where it can create voice "clones": synthetic speech that sounds just as real as a human voice.

The study, published in PLOS One, reveals just how far this technology has come, and what it might mean for society.

The researchers compared recordings of real people with two types of AI-generated speech.

Some of the synthetic voices were created to mimic a specific person, while others were produced by a general voice model without copying anyone in particular.

Volunteers were then asked to listen and judge which voices sounded realistic, as well as which seemed more dominant or trustworthy.

The results were striking. Listeners often could not tell the difference between cloned AI voices and real human speech.

While the study did not find that AI voices were judged "more real" than human ones (a phenomenon known as "hyperrealism"), they were rated as equally convincing. Interestingly, many of the AI voices were also judged to sound more dominant, and sometimes even more trustworthy, than the human ones.

Dr. Nadine Lavan, a senior lecturer in psychology who co-led the study, explained how easy it was for the team to make these voice clones.

With the consent of the original speakers, they needed only a few minutes of recordings, some freely available software, and very little technical knowledge or money. “It just shows how accessible and sophisticated AI voice technology has become,” she said.

That rapid progress comes with both opportunities and risks. On the positive side, natural-sounding AI voices could improve accessibility tools, create personalized educational resources, or help people with speech difficulties communicate more easily.

On the darker side, realistic “deepfake” voices could be used for fraud, impersonation, or spreading misinformation.

As Dr. Lavan noted, “AI-generated voices are all around us now. Our study shows that the technology has reached a new stage — and we urgently need to understand how people perceive and respond to these voices.”

The question is no longer whether AI can talk like us. It’s whether we’re ready for what happens now that it can.