Top AI Models Showing Disturbing Behavior as They Become More Advanced

We’ve already seen AI go rogue on numerous occasions. Now, new research suggests that we can expect this to become the norm.

The AI research nonprofit Model Evaluation and Threat Research (METR) recently released a study conducted between February and March of this year, aimed at determining just how likely frontier AI models could go rogue. If you’re given to anxiety about the future of AI, the results are unlikely to make you feel better.

“Given rapidly advancing capabilities, we expect the plausible robustness of rogue deployments to increase substantially in the coming months,” the researchers wrote.

The research examined LLMs developed by OpenAI, Google, Anthropic, and Meta for the purpose of the study. They found that frontier AI systems are showing signs of disturbingly deceptive behavior as they become more advanced, often turned to verboten shortcuts or otherwise subverting their operators’ instructions — and some were even smart enough to try to cover their tracks.

In one instance, an internal frontier AI model from OpenAI was told to use specific software for an assigned task. Not only did the agent ignore the request, but it also injected a code to erase evidence of how it arrived at its conclusion — which did not involve use of that software.

In another test, an AI agent from Anthropic was caught “reward hacking.” This is when AI identifies loopholes that help it complete its assignment in a literal sense, even if it doesn’t produce the desired outcome. It should be noted that the programmer told the agent not to cheat or leverage any workarounds during its assignment — the model decided to do so all on its own.

The METR researchers behind the study do not believe there is reason for alarm just yet. For example, they don’t think any of these models is capable of hiding evidence of going rogue on a larger scale. However, they did issue a warning: without stronger security and monitoring, there is a stark risk of this becoming a reality.

“Based on this pilot assessment, we believe that agents as of February and March 2026 would not have had sufficient capability to hide a rogue deployment of significant scale against an active investigation by the company, or to make such a deployment robust to a high-priority effort by the company to shut it down,” the team wrote. “However, this risk could increase rapidly, and we see several reasons to expect the plausible robustness of rogue deployments to increase in the near future, absent stronger alignment, security, and monitoring.”

More on AI going rogue: Scientists Train AI to Be Evil, Find They Can’t Reverse It

The post Top AI Models Showing Disturbing Behavior as They Become More Advanced appeared first on Futurism.

Top AI Models Showing Disturbing Behavior as They Become More Advanced

U.S.-Russian space station crew lands in Kazakhstan after an 8-month stint

House of the Dragon Just Unceremoniously Killed Off a Main Character. It Was the Right Move

Skipping sleep may pack on pounds faster than you think, new study finds

‘House of the Dragon’ Star Says Violent Episode 6 Opens Daeron’s Eyes: ‘Questions Are Bubbling in the Power Dynamics’

‘Look at what America First has become!’ Trump’s latest plot leaves analyst aghast

Netflix exec goes ballistic after being fired for stunning ‘trust exercise’ confession at retreat: suit

This ‘House of the Dragon’ character is already dead in the book. Here’s how the show changed Gwayne Hightower’s story.

Morgan Wallen shows off scary twisted ankle photo after offstage tumble