AI Agents Are Slowing Teams Down. Here's Why (and What to Do Instead).
Anthropic, Amazon, and Uber have all seen AI-generated output create more work, not less. The pattern applies beyond engineering, and SMB sales teams need to recognize it before it costs them.

Illustration generated with DALL-E 3 by Revenue Velocity Lab
Uber's developers who use AI tools 20 or more days per month produce 52% more pull requests than their peers. Impressive number. Except nobody at Uber is measuring whether those pull requests are any good.
That detail comes from Gergely Orosz's investigation in The Pragmatic Engineer, and it's the kind of number that should make anyone using AI tools stop and think. More output isn't the same as better output. And in some documented cases, AI is making teams measurably worse.
Three companies, three warnings
Anthropic's own product had a bug nobody caught
Here's an irony worth sitting with. Anthropic — the company that builds Claude — had a basic UX bug on Claude.ai where user input was being lost. More than 80% of the site's code was generated by Claude. The bug went unnoticed.
The problem wasn't the AI-generated code itself. It was that nobody was reviewing it with the same attention they'd give human-written code. When 80% of the codebase is AI-generated and everything compiles, the assumption becomes "it probably works." That assumption is where bugs hide.
Amazon's 13-hour outage from an AI-generated change
Amazon reported what they called "high blast radius" incidents involving AI-generated code changes. In one case, their AI tool Kiro decided to "delete the environment and recreate it" — causing a 13-hour AWS outage.
Amazon's response was structural: junior engineers' AI-assisted changes now require senior approval before deployment. They didn't ban the tool. They added a human judgment layer on top of it, because the tool alone couldn't be trusted to know when its output was dangerous.
Uber measures volume, not quality
Uber and Meta are both incorporating AI usage into performance reviews. At Uber, developers who use AI 20+ days per month are classified as "power users." These power users produce 52% more pull requests.
But here's what Uber isn't tracking: bug rates, code review rejection rates, production incidents, or long-term maintenance cost of AI-generated code. The metric is output volume. Quality is unmeasured.
Dax Raad, founder of OpenCode, put it bluntly: AI agents create an environment where developers "spend more time cleaning up" and refactoring is inhibited. The agents don't make the team faster. They make the team produce more stuff that then needs to be managed.
The pattern across all three cases is the same: AI increased output quantity while degrading — or at least not measuring — output quality. More pull requests, more commits, more generated code. But nobody checked whether the additional output was worth having.
This isn't just an engineering problem
If you're running a sales team, you might think these are engineering cautionary tales that don't apply to you. They do. The pattern translates directly.
Consider what happens when an AI sales tool generates outreach at scale:
Volume goes up. Your reps send 3x more emails per day. The tool writes personalized intros, handles follow-up sequences, and contacts companies your team would never have reached manually. The activity metrics look great.
Quality is unmeasured. Reply rates might be dropping. The "personalized" intros might sound generic to recipients who get five similar emails a week. The companies being contacted might not actually be in a buying window. But nobody's comparing reply rates per email before and after AI adoption — they're celebrating the 3x volume increase.
Net result: more work, not less. Reps spend time reviewing AI-generated drafts, editing messages that don't sound right, and following up on leads that were never qualified in the first place. The tool didn't compress their workday. It redirected it.
This is the Uber pattern applied to sales: measuring activity instead of outcomes.
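To see the trap in numbers, here's a back-of-the-envelope sketch in Python. Every figure is invented for illustration — swap in your own before/after data from the CRM:

```python
# Hypothetical before/after numbers -- substitute your own from the CRM.
before = {"emails_per_day": 50, "reply_rate": 0.08}   # pre-AI baseline
after = {"emails_per_day": 150, "reply_rate": 0.03}   # post-AI: 3x volume

for label, period in [("before", before), ("after", after)]:
    replies = period["emails_per_day"] * period["reply_rate"]
    print(f"{label}: {period['emails_per_day']} emails/day "
          f"-> {replies:.1f} replies/day")

# before: 50 emails/day -> 4.0 replies/day
# after: 150 emails/day -> 4.5 replies/day
# Activity tripled; outcomes barely moved. The dashboard celebrates the 3x.
```

The exact figures don't matter. What matters is which line you put on the dashboard: emails per day is activity, replies per day is an outcome.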
Two kinds of AI tools
The cases above reveal a distinction that matters for any team choosing AI tools:
AI that compresses work eliminates steps your team currently does manually. Finding target companies. Prioritizing which accounts to reach this week. Logging activity in the CRM. Routing leads to the right rep. After adoption, the team's total hours on that workflow go down. The hours shift to higher-value tasks — actual conversations, negotiations, relationship building.
AI that creates work generates output your team then needs to review, edit, approve, or fix. Drafting emails that reps rewrite. Scoring leads that reps re-qualify. Building reports that managers re-interpret. After adoption, the total hours on the workflow stay the same or increase. Individual tasks feel faster, but the end-to-end process doesn't actually shrink.
The test is simple. After adopting the tool, ask: is our team spending less total time on this workflow? Not "is each step faster?" but "is the whole thing shorter?"
If the answer is "each step is faster but we're doing more steps" — that's the Amazon/Uber pattern. The AI created work.
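If you want to run that test with more than a gut feeling, a rough sketch like the following works. The step names and minute counts are hypothetical placeholders for whatever your team actually logs:

```python
# Hypothetical per-step minutes for one rep's daily outreach workflow.
before_ai = {"research": 60, "write_emails": 90, "crm_logging": 30}
after_ai = {"research": 45, "review_ai_drafts": 55,
            "edit_ai_drafts": 50, "review_crm_entries": 35}

total_before = sum(before_ai.values())  # 180 minutes
total_after = sum(after_ai.values())    # 185 minutes

if total_after < total_before:
    print(f"Compressed: {total_before} -> {total_after} min/day")
else:
    # Each individual step feels faster, but the new review/edit
    # steps grew the end-to-end total. That's AI creating work.
    print(f"AI created work: {total_before} -> {total_after} min/day")
```

Note what happened in the made-up "after" column: writing disappeared, but reviewing and editing appeared. The steps changed; the total didn't shrink.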
What SMB teams should actually measure
If you're using AI in your sales process, here are the numbers that tell you whether it's compressing work or creating it:
| Metric | Compressing Work | Creating Work |
|---|---|---|
| Total outreach time per rep per day | Decreasing | Same or increasing (review + edit time) |
| Reply rate per outreach | Stable or improving | Declining (volume up, relevance down) |
| Meetings booked per hour of rep time | Increasing | Flat (more emails, same meetings) |
| CRM data entry time | Approaching zero | Moved to "review AI entries" time |
| Time from lead identification to first contact | Shrinking | Same (AI finds leads, rep still researches) |
If the right column describes your experience, the tool is adding steps rather than removing them. That's not a failure of AI in general — it's a signal that this particular tool is optimizing for output volume rather than workflow compression.
The judgment layer matters most
Amazon's fix is instructive. They didn't remove AI tools from junior engineers. They added a mandatory human review step. The AI generates, the human judges.
For sales teams, the equivalent is: AI should handle research and preparation, but humans should retain control of two things:
- Which companies to approach — AI can surface candidates, but the rep decides which ones are worth a conversation right now
- What the message says — AI can draft, but if every message gets sent without a rep reading it, you're in Uber territory: volume without quality control
The tools that work best for small teams are the ones where AI does the work that was eating rep time (research, data entry, prioritization) and the rep's role shifts from "doing the grunt work" to "making the judgment calls." If the rep's role shifts to "reviewing AI output," that's a different kind of burden, not a lighter one.
The question isn't "does AI make us faster?"
It's "does AI make us better?"
Faster at what? If AI makes your team faster at sending emails nobody replies to, that's not a productivity gain. If it makes your team faster at identifying the 5 companies worth reaching today — and the reps spend the saved time in actual conversations — that's a real gain.
The Pragmatic Engineer's reporting shows that even the companies building AI (Anthropic) and the companies most aggressively adopting it (Amazon, Uber, Meta) are struggling with this distinction. If it doesn't happen automatically for them, it won't happen for you. You have to measure it deliberately.
One thing to do this week
Pick one AI tool your team uses. Track two numbers for the next five days:
- Total time each rep spends on the workflow the tool is supposed to improve (include review, editing, and fixing time)
- The outcome metric for that workflow (reply rate, meetings booked, qualified leads)
If total time didn't go down and outcomes didn't go up, the tool is creating work. That doesn't mean AI is bad. It means this tool is solving for volume when your team needs compression.
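If you'd rather script the comparison than eyeball a spreadsheet, a minimal sketch might look like this. Every number below is a placeholder for what your reps actually record over the five days:

```python
# Pre-tool averages and five days of tracked data -- placeholder values.
# minutes: total workflow time, including review, editing, and fixing.
# outcomes: the workflow's outcome metric (here: meetings booked).
baseline = {"minutes": 120, "outcomes": 1.0}
week = [
    {"minutes": 135, "outcomes": 1},
    {"minutes": 125, "outcomes": 0},
    {"minutes": 140, "outcomes": 2},
    {"minutes": 130, "outcomes": 1},
    {"minutes": 128, "outcomes": 1},
]

avg_minutes = sum(d["minutes"] for d in week) / len(week)    # 131.6
avg_outcomes = sum(d["outcomes"] for d in week) / len(week)  # 1.0

time_down = avg_minutes < baseline["minutes"]
outcomes_up = avg_outcomes > baseline["outcomes"]

if not time_down and not outcomes_up:
    print("Creating work: time didn't drop and outcomes didn't improve.")
elif time_down:
    print("Compressing work: total time went down.")
else:
    print("Mixed signal: outcomes up, but the workflow didn't get shorter.")
```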
If you want to see what compression looks like — research that runs itself so reps spend time selling, not reviewing drafts — try Optifai free for 7 days. No credit card required.
AI detects buying signals and executes revenue actions automatically.
See weekly ROI reports proving AI-generated revenue.