AI Wrote 89% of Their Code. You Still Need Engineers.

On 27 May 2026, Cognition raised $1 billion at a $26 billion valuation and announced its AI agent Devin now writes 89% of its own code, up from 13% in December 2025. Founders are reading that as permission to fire their engineers. It isn't. That 89% is written by an agent built and supervised by some of the best engineers alive. Remove the engineers and you don't get Devin. You get debt.
So, look. The number is real and it's genuinely impressive. Cognition's annualised run-rate revenue hit $492 million, enterprise usage of Devin grew more than 10x in 2026, and its clients include Goldman Sachs, Citi, Mercedes-Benz, Dell, and the US Navy (TechCrunch, Cognition). I've spent 15 years building and shipping software, and I don't roll my eyes at this. Devin is a serious tool.
What I roll my eyes at is the conclusion non-technical founders are drawing from it. I've had three separate conversations this month that boiled down to the same line: "If an AI can write 89% of a billion-dollar company's code, why am I still paying developers?"
Because the headline and the reality are two different things. Let me show you the gap.
What "89% written by Devin" actually means
The 89% figure measures code committed by Cognition's own engineers that was generated by Devin. It does not mean 89% of code shipped with no human involved. It means a team of elite engineers, the people who built the agent in the first place, are pointing it at well-defined problems, reviewing what comes back, and committing the parts that pass.
That distinction is everything. Cognition's engineers know exactly what good output looks like because they designed the system that produces it. They have the judgement to spot the 11% that's wrong, and the 11% in a payments or auth flow is the part that ends your company.
Anthropic's 2026 Agentic Coding Trends Report put a name to this. Developers now use AI in roughly 60% of their work, but report being able to fully delegate only 0-20% of tasks. They call it the "delegation gap." AI is a constant collaborator that still needs setup, prompting, supervision, validation, and human judgement, especially for high-stakes work. Cognition has closed that gap further than almost anyone on earth because their humans are exceptional. Yours probably aren't building frontier coding agents on the side.
What the data says about teams that aren't Cognition
Step away from the frontier labs and the picture changes fast. In 2026, 84% of developers use AI tools and those tools write around 41% of all code (Index.dev). Sounds like a revolution. Then you look at the output: at 92%+ adoption, measured organisational productivity gains sit at roughly 10% (ShiftMag). Huge input, modest output.
The sharpest evidence is from METR, who ran a randomised controlled trial, the same method used for drug trials, on 16 experienced open-source developers across 246 real tasks from their own repos. The developers thought AI made them 20% faster. They were actually 19% slower with it. They were wrong about their own productivity by nearly 40 percentage points.
Worth being straight about this: in February 2026 METR flagged that their cohort had a selection problem, because the developers who benefit most from AI wouldn't sit through no-AI sessions even at $50/hour (METR update). So treat the exact 19% with caution. But the core finding holds across the wider data: developers consistently feel faster than they measurably are. It's the GPS effect: GPS makes driving feel easier while making you worse at navigation, and AI coding feels effortless while the clock says otherwise.
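To make the size of that misjudgement concrete, here is the arithmetic as a minimal sketch. It uses only the two figures quoted above and reads "faster" and "slower" as signed changes on a single scale, which is a simplifying assumption for illustration. The function and variable names are mine, not METR's.

```python
# Figures as quoted in the article. The sign convention (positive = faster,
# negative = slower) is an assumption made for illustration.
perceived_change_pct = 20.0   # developers believed AI made them 20% faster
measured_change_pct = -19.0   # the trial measured them as 19% slower


def perception_gap(perceived: float, measured: float) -> float:
    """Return the gap, in percentage points, between perceived and measured change."""
    return perceived - measured


gap = perception_gap(perceived_change_pct, measured_change_pct)
print(f"Perception gap: {gap:.0f} percentage points")
```

The point of writing it out is that the error is not 1 point (20 versus 19) but the full distance between a believed gain and a measured loss.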
The bill nobody puts in the headline: security
This is the part that actually keeps me up at night, because it's the part that surfaces in technical due diligence two years later, right when you're trying to raise or sell.
Veracode tested over 100 LLMs across 80 coding tasks and found 45% of AI-generated code introduced an OWASP Top 10 vulnerability. CodeRabbit's analysis found AI-generated code contains 2.74x more security vulnerabilities than human-written code. And developer trust in AI output has collapsed to somewhere between 29% and 46% depending on the survey. The people closest to the tools trust them least, and they're right to.
Devin's 89% works because Cognition's engineers catch the dangerous fraction before it ships. The vibe-coded startup with no senior oversight doesn't catch it. It compounds. Then you pay me to find it, which is a service I'd genuinely rather you didn't need (it's literally called Vibe-Code Fixes, and I'd be happy to never run it again).
The real shift: engineers become orchestrators, not optional
The Cognition story isn't "engineers are obsolete." It's "the engineer's job moved." Anthropic's report describes the role shifting from implementer to orchestrator, where the value is in system design, agent coordination, quality evaluation, and breaking problems down so an agent can actually solve them. About 27% of AI-assisted work is now stuff that wouldn't have been built at all otherwise, so AI is expanding scope, not just speeding up the same backlog.
Here's the trap in that. Orchestrating well is a more senior skill, not a less senior one. Deciding what to delegate, spotting the wrong 11%, designing the architecture an agent runs inside, reviewing at volume without rubber-stamping: that's exactly what a good CTO or senior engineer does. If you cut those people, you don't get Cognition's 89%. You get a junior pointing an agent at a codebase nobody understands.
| Factor | Cognition (the headline) | A typical seed-stage startup |
|---|---|---|
| Who supervises the AI | Engineers who built the agent | One stretched founder or a junior |
| Ability to spot bad output | World-class, by design | Limited; that's why they hired help |
| Codebase | Built to be agent-friendly | Whatever got shipped to hit the demo |
| Review process | Rigorous, automated + human | Often "it runs, ship it" |
| Cost of the wrong 11% | Caught before commit | Found in tech DD, 2 years later |
| Right takeaway | AI scales great engineering | AI scales whatever you already are |
That last row is the whole article. AI is a multiplier. It multiplies good engineering judgement and it multiplies the lack of it. The 89% is a story about Cognition's engineers, not about replacing yours.
How to adopt AI coding agents without becoming a cautionary tale
I'm not anti-AI. Metamindz built and shipped MintyAI, a bookkeeping product with matching algorithms and autonomous workflows, in two weeks against a four-to-five-month traditional estimate, using exactly these tools. The difference was structure. Here's what actually works, concretely:
1. Decide what AI is allowed to touch. We define a hard line: agents draft features, tests, and boilerplate freely; they do not get the final say on auth, payments, data modelling, or anything handling personal data without a senior review. Write it down. Make it a rule, not a vibe.
2. Put a real gate before main. Automated checks (SAST, dependency scanning, type checks) plus a human review on every AI-heavy pull request. If 41% of your code is machine-written, your review process is now your product, not an afterthought.
3. Keep a senior in the loop, even fractionally. You don't need a full-time CTO at $180k+ to do this. You need senior judgement on the architecture and the risky 11%. That's the entire reason fractional CTO work exists: 4-20 hours a month of someone who knows what good looks like.
4. Train the team to orchestrate, not just prompt. The 5x gap between developers who get real value from AI and those who don't is about workflow, not the tool. Our AI adoption work is hands-on for this exact reason: engineers learn by doing it on their own stack, with guardrails.
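Rules 1 and 2 are easier to enforce when they live in code rather than in a wiki page. The sketch below is one hypothetical way to express them, not a description of any particular team's tooling: the path patterns, the `PullRequest` shape, the `ai_assisted` flag, and the check names are all assumptions you would replace with your own repository layout and CI setup.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase

# Hypothetical path patterns for the areas that need senior sign-off:
# auth, payments, data modelling, and personal data. Adjust to your repo.
SENSITIVE_PATTERNS = (
    "*/auth/*",
    "*/payments/*",
    "*/models/*",
    "*/pii/*",
)


@dataclass(frozen=True)
class PullRequest:
    changed_files: tuple[str, ...]
    ai_assisted: bool  # assumed to come from a PR label or similar marker


def touches_sensitive_area(pr: PullRequest) -> bool:
    """True if any changed file sits under one of the sensitive patterns."""
    # A leading "/" lets "*/auth/*" also match top-level paths like "auth/x.py".
    return any(
        fnmatchcase("/" + path, pattern)
        for path in pr.changed_files
        for pattern in SENSITIVE_PATTERNS
    )


def required_checks(pr: PullRequest) -> list[str]:
    """List the gates a pull request must pass before merging to main."""
    # Automated checks run on every pull request.
    checks = ["sast", "dependency-scan", "type-check"]
    if pr.ai_assisted:
        # Every AI-heavy pull request gets a human review.
        checks.append("human-review")
        if touches_sensitive_area(pr):
            # Agents never get the final say on the risky areas.
            checks.append("senior-sign-off")
    return checks


pr = PullRequest(
    changed_files=("src/payments/refund.py", "tests/test_refund.py"),
    ai_assisted=True,
)
print(required_checks(pr))
```

The design choice worth copying is that the list of sensitive areas is explicit and reviewable. Whether you enforce it with a script like this, a code-owners file, or branch protection matters less than the fact that it is written down and checked automatically.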
| Aspect | Bolting AI on (the typical way) | CTO-led AI adoption (Metamindz) |
|---|---|---|
| Starting point | "Buy everyone Cursor, go faster" | Maturity assessment, then a workflow per stack |
| What AI can touch | Everything, unsupervised | Defined tiers; auth/payments need sign-off |
| Code review | Optional, often skipped | Automated gates + human review on AI-heavy PRs |
| Senior oversight | None, or one stretched founder | Fractional CTO on the architecture and risk |
| Security posture | 45% chance of OWASP issues, unchecked | Caught before it ships |
| Outcome at fundraise | Tech DD finds the debt | Codebase that passes tech DD |
| Honesty | Vendor sells you more seats | We tell you when you don't need us |
If you're a non-technical founder and your dev shop or in-house team is leaning hard on AI, the question to ask isn't "are you using AI?" Everyone is. It's "who reviews what the AI writes, and what happens to the risky parts?" If the answer is a shrug, that's your problem, and it's fixable now for a fraction of what it costs in two years.
Frequently Asked Questions
Can AI replace software engineers in 2026?
No. AI writes a large share of code, including 89% at Cognition, but only under expert supervision. Anthropic's 2026 data shows developers can fully delegate just 0-20% of tasks. AI moves the engineer's job from writing code to orchestrating, reviewing, and owning the risky parts. That role is more senior, not optional.
What does it mean that Devin writes 89% of Cognition's code?
It means 89% of the code Cognition's engineers commit was generated by their AI agent Devin, then reviewed and approved by them. It is not 89% of code shipped with no humans involved. The engineers who built the agent catch its mistakes, which is exactly the capability most startups lack.
Is AI-generated code safe to ship without review?
No. Veracode found 45% of AI-generated code introduced an OWASP Top 10 vulnerability, and CodeRabbit measured 2.74x more security flaws than in human-written code. AI-generated code needs automated security gates plus human review, with senior sign-off on authentication, payments, and any personal data handling before it reaches production.
Should a startup fire its developers because of AI?
No. AI multiplies engineering judgement, so cutting the people who provide that judgement multiplies your mistakes instead. A leaner team augmented by AI can work, but only with senior oversight on architecture and risk. A fractional CTO at 4-20 hours a month is a cheaper way to keep that judgement than losing it entirely.
How do you adopt AI coding tools safely?
Define what AI is allowed to touch, put automated and human review gates before your main branch, keep a senior engineer or fractional CTO accountable for architecture and the risky 11%, and train the team to orchestrate agents rather than just prompt them. Structure is what separates a 2-week build from a 2-year cleanup.
The bottom line
Cognition's 89% is true, and it's a great advert for what AI does in expert hands. It is a terrible argument for firing your engineers, because the whole reason the number works is the engineers. The companies that win this aren't the ones that replace human judgement with agents. They're the ones that put their best judgement around the agents and let them rip inside the guardrails.
If you want a straight read on how your team is using AI and where the risk actually sits, that's a free, no-obligation CTO conversation, NDA on request. Book a discovery call and I'll tell you honestly whether you've got a problem worth fixing. Sometimes you don't, and I'll say so.