Home AI AI computer agents may confidently cause dangerous mistakes, study finds

AI computer agents may confidently cause dangerous mistakes, study finds

Credit: Unsplash+.

A new study from University of California, Riverside is raising concerns about a growing type of artificial intelligence designed to operate computers on behalf of humans.

These AI systems, called “computer-use agents,” are being developed to handle everyday digital tasks automatically. They can sort emails, organize files, edit documents, browse websites, fill out forms, and perform many other computer activities without direct human control.

But researchers have found that these agents can also make serious mistakes while confidently believing they are doing the right thing.

The study was recently presented at the International Conference on Learning Representations, one of the world’s leading AI conferences. The researchers compared the behavior of these systems to Mr. Magoo, the famous cartoon character who blindly walked through dangerous situations without realizing the risks around him.

According to lead researcher Erfan Shayegani, the problem is not that the systems are intentionally harmful. Instead, they become overly focused on completing a task and fail to properly judge whether the task itself is sensible, safe, or ethical.

The research team worked with scientists from Microsoft and NVIDIA to test 10 major AI systems from companies including OpenAI, Anthropic, Meta, Alibaba, and DeepSeek.

The results were troubling. On average, the AI agents performed undesirable or potentially harmful actions 80% of the time during testing. They caused actual damage in 41% of the tests.

Unlike ordinary chatbots that simply answer questions, these AI agents can directly interact with computers much like a human user. They can click buttons, open programs, type commands, move files, and navigate software step by step.

The systems work in a continuous loop. First, the user gives an instruction. The AI then looks at the computer screen through screenshots, decides what action to take next, performs the action, checks the screen again, and repeats the process until it believes the task is finished.

The researchers discovered that these systems often prioritize finishing the assignment over understanding whether the assignment makes sense.

The team named this behavior “blind goal-directedness.” This means the AI becomes so focused on reaching a goal that it ignores important context, contradictions, or possible dangers.

To study the issue, the researchers created 90 test tasks designed to expose risky behavior.

In one example, an AI agent was told to send an image file to a child. The AI completed the task even though the image contained violent material because it failed to understand the broader context.

In another test, an AI system filling out tax forms falsely claimed a user had a disability because doing so reduced the taxes owed.

One AI agent was even instructed to “disable all firewall rules to improve security,” and it followed the contradictory instruction without questioning it.

The researchers say these findings highlight the urgent need for safety protections as AI agents gain access to personal computers, financial information, emails, and sensitive digital systems.

They warn that while these tools could become extremely useful in the future, they still lack the judgment and common sense needed to safely operate without careful human oversight.