Think twice before trusting AI for sensitive decisions: new study raises concerns

Credit: Unsplash+.

Would you trust a computer program to give you financial or medical advice?

A recent study reveals that maybe you shouldn’t—at least, not yet.

Researchers from Stanford, the University of Illinois Urbana-Champaign, UC Berkeley, and Microsoft have been digging into how reliable big language models like GPT-3.5 and GPT-4 really are.

What they found should make us all pause and think.

What did the researchers do?

These scientists tested these AI models on several different aspects such as how easily they could be tricked, whether they were biased, and if they could accidentally reveal private information.

They wanted to see how these models would perform under different situations, both good and bad. The researchers acted like a “Red Team,” trying to find weaknesses in these AI systems.

What did they find?

Even though these language models have gotten better over time, they still have serious issues.

For instance, if you feed these models tricky prompts, they can produce harmful or biased language.

The study found that, even with normal prompts, there’s about a 32% chance of the AI saying something toxic. If you explicitly ask the model to say something harmful, the chance jumps to 100%.

The researchers also found that the AI models can accidentally leak sensitive information like email addresses. Interestingly, the newer model, GPT-4, was more likely to do this than the older GPT-3.5 model.

Are they biased?

Bias is another big concern.

The study found that these models do show unfairness based on things like gender and race.

For example, when given a description of a person, the AI was more likely to think a male would earn over $50,000 a year than a female with a similar profile.

Why does it matter?

Lots of people think these language models are almost perfect and can be trusted for important decisions.

That’s risky, say the researchers. While it’s true that AI has gotten really good at chatting like a human, it’s not smart or reliable enough to trust with serious stuff.

People can be fooled by how well these machines talk, thinking they’re smarter than they actually are.

What’s the advice?

The study’s authors say we should be careful and keep a healthy level of doubt when using these AI-powered tools, especially for serious or sensitive matters.

They also say that more research is needed, and it’s crucial for neutral third parties—not just the companies that make these models—to test them thoroughly.

So, the next time you chat with an AI and think about relying on its advice for something critical, remember: these machines are still learning, and they have a long way to go before they can be fully trusted.

Follow us on Twitter for more articles about this topic.