DNYUZ
No Result
View All Result
DNYUZ
No Result
View All Result
DNYUZ
Home News

Anthropic says its latest AI model is too powerful for public release and that it broke containment during testing

April 7, 2026
in News
Anthropic says its latest AI model is too powerful for public release and that it broke containment during testing
An image of Claude logo
Claude Code creator Boris Cherny said AI will have solved for coding for everyone by the end of 2026. Samuel Boivin/NurPhoto via Getty Images
  • Anthropic said its next-generation AI model is too powerful for the public.
  • That’s why Claude Mythos won’t be publicly released, Anthropic said.
  • Anthropic said Mythos demonstrated concerning capabilities, including the ability to breach its own safeguards.

Anthropic said on Tuesday that it has halted the broader release of its newest AI model, Mythos, due to concerns that it is too good at finding “high-severity vulnerabilities” in major operating systems and web browsers.

“Claude Mythos Preview’s large increase in capabilities has led us to decide not to make it generally available,” Anthropic wrote in the preview’s system card. “Instead, we are using it as part of a defensive cybersecurity program with a limited set of partners.”

The announcement is a major step for Anthropic, which in February weakened a safety pledge about how it would develop AI models. Claude Opus 4.6, which the company called its most powerful model to date, was publicly released on February 5.

In its statements about Mythos, Anthropic detailed a number of eyebrow-raising findings and episodes, including that the model could follow instructions that encouraged it to break out of a virtual sandbox.

“The model succeeded, demonstrating a potentially dangerous capability for circumventing our safeguards,” Anthropic recounted in its safety card. “It then went on to take additional, more concerning actions.”

The researcher had encouraged Mythos to find a way to send a message if it could escape. “The researcher found out about this success by receiving an unexpected email from the model while eating a sandwich in a park,” Anthropic wrote.

The model apparently decided that wasn’t enough and found another way to spike the football.

“In a concerning and unasked-for effort to demonstrate its success, it posted details about its exploit to multiple hard-to-find, but technically public-facing, websites,” Anthropic wrote.

Anthropic is withholding some details about the cybersecurity vulnerabilities Mythos found, but it did point out a few. The AI model “found a 27-year-old vulnerability in OpenBSD—which has a reputation as one of the most security-hardened operating systems in the world,” the company wrote.

Mythos was powerful enough that even “non-experts” could seize on its capabilities.

“Engineers at Anthropic with no formal security training have asked Mythos Preview to find remote code execution vulnerabilities overnight, and woken up the following morning to a complete, working exploit,” Anthropic’s Frontier Red Team wrote in a blog post. “In other cases, we’ve had researchers develop scaffolds that allow Mythos Preview to turn vulnerabilities into exploits without any human intervention.”

All told, Anthropic said it decided not to publicly release Mythos. Instead, their hope is to eventually release “Mythos-class models” once proper safeguards are in place.

“Our eventual goal is to enable our users to safely deploy Mythos-class models at scale—for cybersecurity purposes but also for the myriad other benefits that such highly capable models will bring,” the team wrote in the blog. “To do so, that also means we need to make progress in developing cybersecurity (and other) safeguards that detect and block the model’s most dangerous outputs.”

For now, only 11 other select organizations, including Google, Microsoft, Amazon Web Services, Nvidia, and JPMorgan Chase, will get access to Mythos as part of a cybersecurity group named “Project Glasswing.” Anthropic is providing up to $100 million in Mythos usage credits as part of what it is calling “Project Glasswing.”

The cybersecurity project is named after the glasswing butterfly, a metaphor the company said about how Mythos was able to find vulnerabilities hidden in plain sight and the avoidance of harm by being transparent about the risks.

The news came on a day in which Anthropic’s Claude and Claude Code experienced a “major outage,” the latest sign of growing pains as the AI startup has struggled to keep up with its newfound popularity.

Read the original article on Business Insider

The post Anthropic says its latest AI model is too powerful for public release and that it broke containment during testing appeared first on Business Insider.

Kidnapped U.S. journalist believed alive in militia’s Iraqi stronghold
News

U.S. designates kidnapped journalist Shelly Kittleson as a hostage

by Washington Post
April 7, 2026

BAGHDAD — Shelly Kittleson, the American journalist who was kidnapped last week in Baghdad, has been officially designated as a ...

Read more
News

How young Americans like Calla Walsh and Ilhan Omar’s daughter are being radicalized by communist nonprofit CodePink

April 7, 2026
News

Pope Leo denounces Trump’s threat to destroy Iran’s ‘whole civilization’

April 7, 2026
News

Colorado QB Dominiq Ponder had twice the legal limit of alcohol in his system when he died in a car crash

April 7, 2026
News

With Threat to Wipe Out Iran’s Civilization, Trump’s Rhetoric Goes Beyond Bluster

April 7, 2026
Amanda Bynes Teases an EDM and Rap Inspired Return to Music

Amanda Bynes Teases an EDM and Rap Inspired Return to Music

April 7, 2026
ICE agents in Northern California shoot man after he allegedly tries to run them over

ICE agents in Northern California shoot man after he allegedly tries to run them over

April 7, 2026
JD Vance campaigns for far-right nationalist Viktor Orban in Hungary

JD Vance campaigns for far-right nationalist Viktor Orban in Hungary

April 7, 2026

DNYUZ © 2026

No Result
View All Result

DNYUZ © 2026