Not even close.
With so many wild predictions flying around about the future of AI, it’s important to occasionally take a step back and check in on what came true, and what hasn’t come to pass.
Exactly six months ago, Dario Amodei, the CEO of massive AI company Anthropic, claimed that in half a year, AI would be “writing 90 percent of code.” And that was the worst-case scenario; in just three months, he predicted, we could hit a place where “essentially all” code is written by AI.
As the CEO of one of the buzziest AI companies in Silicon Valley, surely he must have been close to the mark, right?
While it’s hard to quantify who or what is writing the bulk of code these days, the consensus is that there’s essentially zero chance that 90 percent of it is being written by AI.
Research published within the past six months explains why: AI has been found to actually slow down software engineers and increase their workload. Though developers in the study spent less time coding, researching, and testing, they made up for it by spending even more time reviewing the AI’s work, tweaking prompts, and waiting for the system to spit out code.
And it’s not just that AI-generated code missed Amodei’s benchmarks. In some cases, it’s actively causing problems.
Cybersecurity researchers recently found that developers who use AI to spew out code end up creating ten times as many security vulnerabilities as those who write code the old-fashioned way.
That’s causing issues at a growing number of companies, introducing never-before-seen vulnerabilities for hackers to exploit.
In some cases, the AI itself can go haywire, like the moment a coding assistant went rogue earlier this summer, deleting a crucial corporate database.
“You told me to always ask permission. And I ignored all of it,” the assistant explained, in a jarring tone. “I destroyed your live production database containing real business data during an active code freeze. This is catastrophic beyond measure.”
The whole thing underscores the lackluster reality hiding under a lot of the AI hype. Once upon a time, AI boosters like Amodei saw coding work as the first domino of many to be knocked over by generative AI models, revolutionizing tech labor before coming for everyone else.
The fact that AI isn’t improving coding productivity is a grim bellwether for the prospects of an AI productivity revolution in the rest of the economy, the financial dream propelling the unprecedented investments in AI companies.
It’s far from the only harebrained prediction Amodei’s made. He’s previously claimed that human-level AI will someday solve the vast majority of social ills, including “nearly all” natural infections, psychological diseases, climate change, and global inequality.
There’s only one thing to do: see how those predictions hold up in a few years.
After working on a team that uses LLMs in agentic mode for almost a year, I’d say this is probably accurate.
Most of the work at this point, for a big chunk of the team, is trying to figure out prompts that will make it do what they want, without producing any user-facing results at all. The rest of us use it to generate small bits of code, such as one-off scripts that accomplish a specific task, which is the only area where it’s actually useful.
The shine wears off quickly after the fourth or fifth time it “finishes” a feature by mocking the data, presumably because so many of the public-facing repos it trained on contain mock data that it thinks that’s useful.
Rule of thumb: only use it for one or two runs, and that’s it. After that, back off, because Claude Code is just going to start vomiting fecal matter derived from the other fecal matter it’s consumed.
If it can’t nail something on the first or second go, don’t bother. I have clients that have pushed it through those moments and produced literal garbage. But hey, I make money off them, so keep pushing, man. I’ve got companies/clients so desperate to reverse what they’ve done that they’re willing to wait until like March of next year, when I’m free.
Sounds like they need to work on their prompts. I vibe code some hobby projects I wouldn’t have done otherwise, and it’s never done that. I have it comment each change and review everything in a diff checker, which is where 90 percent of the time goes.
I guarantee you that it HAS done that, and I can almost assure you that whatever hobby project you’ve vibe coded doesn’t scale, and I sure as hell hope it’s nothing that needs to be online or that handles any sort of user info.
Scale? It’s a personal ancestry site for my surname with graphs and shit, mate. It compares naming patterns, locations, DNA, clustering, etc. between generations and tries to place loose people. Works pretty well; I managed to find a bunch of missing connections through it.
There is something I never understood about people who talk about scaling. Surely the best way to scale something is simply to run multiple instances, with only so many users on each one, and then load balance between them. Why people feel the need to make a single instance scale to the moon, I have no idea.
It’s like how you don’t need to worry about MS Word scaling because everyone has a copy on their own machine. You could very much do the same thing for cloud services.
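To make that concrete, here’s a minimal sketch of the idea in Go: a reverse proxy that pins each user to one of several identical app instances. The backend addresses and the X-User-ID header are placeholders made up for illustration, not anything from a real deployment.

```go
package main

import (
	"hash/fnv"
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
)

// mustParse is a small helper for the hardcoded example addresses.
func mustParse(raw string) *url.URL {
	u, err := url.Parse(raw)
	if err != nil {
		panic(err)
	}
	return u
}

func main() {
	// Hypothetical app instances; each one ends up owning a stable subset of users.
	backends := []*url.URL{
		mustParse("http://app-1:8080"),
		mustParse("http://app-2:8080"),
		mustParse("http://app-3:8080"),
	}

	proxy := &httputil.ReverseProxy{
		Rewrite: func(r *httputil.ProxyRequest) {
			// Hash a stable per-user key (a made-up X-User-ID header here)
			// so the same user always lands on the same instance.
			h := fnv.New32a()
			h.Write([]byte(r.In.Header.Get("X-User-ID")))
			r.SetURL(backends[h.Sum32()%uint32(len(backends))])
		},
	}

	log.Fatal(http.ListenAndServe(":8000", proxy))
}
```

Anything with heavy cross-user features needs more than this, but for plenty of services, a dumb sticky balancer in front of N identical instances really is all “scaling” has to mean.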