Software engineering is among the many fields being transformed by the rapid progress in large language models (LLMs). In a few years, LLMs have evolved from advanced code autocomplete tools into AI agents that can design software, implement and correct entire modules, and help software engineers become more productive.
Like many other things surrounding LLMs, some of the excitement around AI-powered software engineering agents is unsubstantiated hype. But there is also true value to be captured, and developers who learn to use the new generation of AI tools will be able to do much more in less time.
AI coding assistants
There are three main ways that LLMs are changing the coding experience. The first is the direct use of frontier models as assistants. Developers are using ChatGPT, Claude and other chatbot interfaces as coding assistants. The models are becoming increasingly good at generating code from text descriptions, improving code snippets that you provide, and helping you debug code.
Recognizing software development as a key use case, model providers are adding new features to enhance the developer experience in the chatbot interface. For example, Claude’s new Artifacts feature enables you to view and run the code as you iterate over it with the model.
The second way, and a more advanced use of AI coding assistants, is LLMs that are added to integrated development environments (IDEs) as plugins. These tools can use your project files and codebase as context to provide more accurate responses and accomplish more complex tasks.
Microsoft was the first company to enter the field with GitHub Copilot, which launched more than a year before ChatGPT. It began as a tool for writing code snippets inside your code editor. It has since evolved into a full assistant that can help you with various tasks in the development environment.
Amazon’s coding assistant Q provides similar features inside the coding environment, including code autocomplete, design agents, and migrating code across different programming languages.
A few startups have also entered the space, including Tabnine, which says it has millions of users and that developers use it to write 30% to 40% of their code. Other players include Replit, which provides a coding environment powered by its own LLM, and Codeium, an AI coding assistant that can integrate with dozens of IDEs.
Software engineering agents
The third way that LLMs are changing software development is through agentic frameworks. In these frameworks, multiple LLM instances are given different system prompts and instructed to work together to complete a project. For example, one agent can be a designer that provides a high-level plan for completing a task, such as searching for resources that provide information, creating modules and then running them on a cloud platform. Another agent can provide a more detailed breakdown of each of those steps. A third agent can be assigned to write code for a specific task and send it to another agent that reviews the code for quality and sends it back for corrections. Finally, another agent can bring all the pieces together, compile them, test them and approve them for launch.
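The division of labor described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular framework is implemented: the `Agent` class, the `build_project` function, the role prompts and the `APPROVED` convention are all hypothetical, and `fake_llm` is a stand-in for a real model call so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable

# A function that takes (system_prompt, message) and returns the model's reply.
# Hypothetical interface: replace it with a call to a real model client.
LLM = Callable[[str, str], str]


@dataclass
class Agent:
    """One role in the pipeline: the same model with a different system prompt."""
    name: str
    system_prompt: str
    llm: LLM

    def run(self, message: str) -> str:
        return self.llm(self.system_prompt, message)


def build_project(task: str, llm: LLM, max_review_rounds: int = 2) -> str:
    """Plan, detail, code, review and integrate a task with cooperating agents."""
    planner = Agent("planner", "Produce a high-level plan, one step per line.", llm)
    detailer = Agent("detailer", "Break this step into concrete instructions.", llm)
    coder = Agent("coder", "Write code for these instructions.", llm)
    reviewer = Agent("reviewer", "Review this code. Reply APPROVED or list problems.", llm)
    integrator = Agent("integrator", "Combine these pieces into one deliverable.", llm)

    pieces = []
    for step in planner.run(task).splitlines():
        if not step.strip():
            continue  # skip blank lines in the plan
        instructions = detailer.run(step)
        code = coder.run(instructions)
        # Send the code back for corrections until approved or out of rounds.
        for _ in range(max_review_rounds):
            verdict = reviewer.run(code)
            if verdict.strip().startswith("APPROVED"):
                break
            code = coder.run(
                f"{instructions}\n\nFix these problems:\n{verdict}\n\nCode:\n{code}"
            )
        pieces.append(code)
    return integrator.run("\n\n".join(pieces))


def fake_llm(system_prompt: str, message: str) -> str:
    """Canned replies keyed on the role prompt (assumption: no real model)."""
    if "high-level plan" in system_prompt:
        return "search for resources\ncreate modules"
    if "Break this step" in system_prompt:
        return f"instructions for: {message}"
    if "Write code" in system_prompt:
        return f"# code for: {message.splitlines()[0]}"
    if "Review this code" in system_prompt:
        return "APPROVED"
    return message  # integrator: pass the combined pieces through


if __name__ == "__main__":
    print(build_project("Build a small web scraper", fake_llm))
```

The cap on review rounds is a design choice in this sketch: without it, a reviewer that never approves would keep the coder and reviewer exchanging messages indefinitely.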
In theory, software engineering agents can receive a description of a project and complete it end-to-end. For example, in March 2024, AI startup Cognition announced Devin, branded as “the first AI software engineer.” Devin uses LLM agents and multiple tools such as a browser, IDE, and compiler to gather resources, reason about the task, write code and evaluate the result. The user can follow the reasoning process and watch as Devin progresses through its work. Multiple demos posted by Cognition showed Devin completing different tasks, including an Upwork job for a computer vision project. This created the impression that AI agents might soon replace software engineers.
Devin is not open source and is not yet available to the public. But it has inspired other projects, such as OpenDevin, an open-source software engineering agent with similar capabilities. Other software development agents such as GPT-engineer have been around for several months with impressive demos.
Hype or reality?
Multiple studies show that AI assistants such as GitHub Copilot increase the productivity of developers and help them stay focused on their tasks instead of searching around the web for solutions to their problems. ChatGPT and Claude have also become regular tools for developers to draft software design ideas, prepare initial versions of code, and learn new coding skills.
However, some of the excitement around AI software development assistants is unwarranted and has drawn scrutiny from seasoned engineers. For example, multiple videos show that the canned demonstrations of Devin are not what they have been marketed as, and AI agents are far from performing the complete set of tasks of a mid-level or senior software engineer.
There are also concerns that tools such as Copilot can reproduce unsafe code that appeared in their training data or the user’s codebase. The providers of the tools are constantly working to add safeguards that prevent the models from generating insecure code. There is also the risk of “automation blindness,” where developers become too accustomed to accepting the code generated by the AI without reviewing it. This can result in unpredictable code that then takes additional time to debug.
What’s for sure is that AI is nowhere near replacing software developers. However, we are still in the early stages of AI coding assistants, and there is no denying that there is much value in using LLMs in software development. As AI enters more domains, demand for software developers is also increasing. As the tools and models mature, we can expect more productivity gains in software engineering.
The post How AI Agents are changing software development appeared first on Venture Beat.