March 17, 2024: US-based startup Cognition has launched Devin, an AI-based tool that it claims is “the world’s first fully autonomous AI software engineer.”
Devin is designed to independently solve engineering tasks using its own shell, code editor, and web browser.
According to a demo provided by Cognition, Devin can connect to various APIs by using its web browser to read and learn from API documentation.
When Devin encounters an error, it automatically adds a debugging print statement to the code in its editor and reruns the code.
Cognition showcased a range of Devin’s capabilities, including building and deploying apps, identifying and fixing bugs in a codebase, and even fine-tuning AI models.
To evaluate Devin’s accuracy, Cognition tested its AI agent on SWE-bench, a benchmark that requires agents to resolve real-world issues found in open-source projects on GitHub.
Devin successfully solved 13.86% of the problems, surpassing GPT-4 (1.74%) and the previous best score, held by Anthropic’s Claude 2 (4.80%).
Notably, Devin achieved this without any help finding the relevant files within the repository.
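To put those figures in proportion, the short Python sketch below computes how many times higher Devin's reported score is than each of the other two. It uses only the percentages quoted above; the `scores` dictionary and the `relative_to` helper are illustrative names, not part of SWE-bench or any Cognition tooling, and the ratios say nothing about differences in how each model was evaluated.

```python
# Percent of SWE-bench problems solved, as reported in this article.
scores = {"Devin": 13.86, "Claude 2": 4.80, "GPT-4": 1.74}


def relative_to(baseline: str, model: str = "Devin") -> float:
    """Return how many times higher `model` scores than `baseline`."""
    return scores[model] / scores[baseline]


for name in ("Claude 2", "GPT-4"):
    ratio = relative_to(name)
    gap = scores["Devin"] - scores[name]  # difference in percentage points
    print(f"Devin vs {name}: {ratio:.1f}x ({gap:.2f} points higher)")
```

On these numbers, Devin's score is a little under three times Claude 2's and roughly eight times GPT-4's.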
Microsoft offers AI-based developer tools such as GitHub Copilot, which provide code completion and assistance for programmers, but these tools cannot complete coding tasks end to end without human intervention.
In contrast, Devin can complete coding tasks autonomously.
Today we’re excited to introduce you to Devin, our first AI software engineer.
Devin is on the new cutting edge of the SWE-Bench coding benchmark, has successfully passed practical engineering interviews at leading AI companies, and has also completed real work on Upwork.
Devin… pic.twitter.com/ladBicxEat
— Cognition (@cognition_labs) March 12, 2024
Cognition is currently offering early access to Devin for companies looking to leverage AI agents for their engineering tasks. Interested customers can request early access through the company’s website.
With its performance on SWE-bench and its ability to operate independently, Devin represents a significant step forward in the development of AI-based software engineering tools.