In January, a team of researchers at Carnegie Mellon University published a study analyzing the use of artificial intelligence technologies that can generate their own computer code.
The study found that while these increasingly popular A.I. systems could speed up software development, they could also degrade the quality of computer code, which can slow projects over time. In other words, they may generate buggy code.
Now a new wave of Silicon Valley start-ups is trying to solve that problem.
These start-ups, including Axiom Math and Harmonic, both in Palo Alto, Calif., and Logical Intelligence in San Francisco, hope to create A.I. systems that can automatically verify computer code in much the same way that mathematicians prove elaborate math problems.
“Code verification is probably the next frontier,” said Carina Hong, chief executive and founder of Axiom.
Axiom said on Thursday that it had raised $200 million in new funding from venture capital firms like Menlo Ventures, Greycroft and Madrona. With headquarters next door to the downtown Palo Alto offices where Mark Zuckerberg built Facebook, the start-up is a year old, employs about 20 people but is now valued at $1.6 billion.
Venture capitalists are betting big on this new kind of company because they see it as a path toward improving the code generated by A.I. systems like OpenAI’s Codex or Anthropic’s Claude Code.
“Right now, the biggest problem with using A.I. to write code is that you don’t know when the code contains a bug,” said Matt Kraning, a partner with Menlo Ventures. “There are early signs that technology like Axiom’s can help with this.”
Like its rival Harmonic — which is valued at $1.45 billion after its latest funding round in the fall — Axiom began as an effort to create technology that solves math problems. In December, its technology, called AxiomProver, achieved a perfect score on the Putnam Exam, an annual competition that tests the math skills of top college students.
The A.I. systems that drive chatbots like ChatGPT often make mistakes. Sometimes, they even make stuff up. But when the subject is mathematics, technologies like AxiomProver can eliminate those mistakes.
Axiom has built technology that can formally prove whether an answer is right or wrong. It does this using a computer programming language called Lean, which was created more than decade ago as a way of proving mathematical statements.
Lean was originally a tool for mathematicians. Now, systems like AxiomProver are using Lean to prove math problems.
The hope is that these systems can use the same technique to verify the quality of computer code.
Although Axiom’s technology learned its skills by analyzing proofs of math problems, the company recently said it had achieved high scores on a standard benchmark test that judges whether A.I. systems can verify computer code. A.I. researchers called this “transfer learning” — when a system learns one skill (like proving math problem) and can successfully transfer that skill to a different task (like verifying computer code).
As they begin to train their systems for code verification, Ms. Hong and her colleagues said, they can further improve the quality of A.I.-generated code.
But experts warn that these methods have limits.
In mathematics, there is a clear distinction between correct and incorrect. But with computer programming, the distinction is harder to pin down, especially as programmers build things like social media services, which must handle the chaos created by millions of users across the globe.
“You can’t always specify what it means for computer code to be correct,” said Bogdan Vasilescu, a Carnegie Mellon computer science professor. “There are places where A.I. can verify code. But that does not mean that all problems in the code will suddenly go away.”
Cade Metz is a Times reporter who writes about artificial intelligence, driverless cars, robotics, virtual reality and other emerging areas of technology.
The post A.I. Writes Buggy Code. A Silicon Valley Start-Up Wants to Fix It. appeared first on New York Times.




