Companies like Anthropic and OpenAI continue to sharpen the coding skills of their artificial intelligence technologies, with no ceiling in sight.
On Thursday, Anthropic unveiled a new flagship model, called Claude Opus 4.8, that is significantly better than its predecessor, Claude Opus 4.7, at generating computer code, according to independent benchmark tests.
It is particularly good at vibe coding, in which A.I. technologies create software in response to prompts written in conversational English.
The new model outperforms all other publicly available A.I. technologies on the vibe coding benchmark from Vals AI, a company that tracks the performance of the latest A.I. technologies. Opus 4.8 scored 10 percent higher on this benchmark test than the previous Anthropic model, said Rayan Krishnan, Vals AI’s chief executive.
The new model also outperformed its predecessor in mathematics, another area where the leading A.I. technologies continue to improve by leaps and bounds.
In a blog post on Thursday, Anthropic said Opus 4.8 topped its predecessor when operating as an A.I. agent, a mode in which A.I. technologies use other software programs to automate various tasks.
Last year, Anthropic and OpenAI released technologies that took an unexpected leap in their ability to write code. This accelerated the development of new software and suddenly unleashed a new wave of A.I. agents.
As A.I. systems have improved at writing computer code, they have also become better at identifying security vulnerabilities in software — a skill that is fundamentally changing cybersecurity.
Last month, Anthropic unveiled a new technology called Claude Mythos and said the technology was too dangerous to be released publicly because it could be used to identify security vulnerabilities in the computer software that underpins the internet. It shared Mythos with a small group of tech companies and organizations.
OpenAI unveiled similar technology soon after and took a different tack, sharing the technology with a much larger group of companies and cybersecurity experts. It also upgraded its consumer chatbot with the technology.
Anthropic still has not released Mythos publicly, but has indicated it will eventually share it with a wider group of companies and government agencies. Mythos is essentially a more powerful version of Opus 4.8, the technology that Anthropic released on Thursday.
(The New York Times has sued OpenAI and its partner, Microsoft, accusing them of copyright infringement of news content related to A.I. systems. OpenAI and Microsoft have denied those claims.)
Cade Metz is a Times reporter who writes about artificial intelligence, driverless cars, robotics, virtual reality and other emerging areas of technology.