In June, I gave a lecture to a company about coding agents. At the end, someone from the team told me something I won't forget: "The history of this company is split in two — until today, and starting from today."
I thought they were being dramatic. Turns out they weren't.
Four months later, this team at Sensos — roughly 15 developers led by CTO Yair Ripshtos — has mastered Cline and Claude Sonnet. They've passed the learning curve. They're shipping fast enough that it's changed the team dynamics. The acceleration from coding agents created unexpected throughput — the kind of problem you want to have.
The Haiku Question
When I posted about Haiku 4.5 a few weeks ago, Yair reached out. His team was running Cline through GitHub Copilot's API, using Claude Sonnet for most tasks. Last month's bill showed base allocation for 15 seats, plus overage charges that nearly doubled that base cost.
Not catastrophic for a 15-person team, but noticeable. And if Haiku really delivered Sonnet 4-level quality at one-third the cost and 2-3x the speed, the math suddenly got interesting.
He asked his team to try it.
The Real Test
This wasn't a controlled experiment. No benchmarks, no SWE-bench scores, no carefully designed tasks. Just real work — the kind these developers do every day.
Yair started with their heaviest user, the developer who burns through the most tokens each month. If anyone would notice a quality drop, it would be him.
First impression: "Significantly faster than 4.5."
A few days later: "I like it. I'm trying to use it for most tasks when I want to use AI. Loving how fast it is. Makes it easier to stay on the task and not go work on something else while the AI works."
That speed feedback kept coming up. Not just from the power user — from everyone who tried it. Yair and several other developers all tested Haiku over the course of a week. The verdict was unanimous: it's good enough for what we need, and it's really, really fast.
All of them noticed it. All of them got addicted to it.
Speed As a Feature
The speed isn't just about saving time. It's about staying in flow.
When Sonnet takes long enough that you start checking Slack or switching tabs, you lose context. You fragment your attention. When Haiku responds fast enough that you stay engaged with the agent, you think better. You plan better. You iterate faster.
That's what "2-3x faster" actually means in practice. It's not about the raw API latency. It's about whether the tool keeps you focused or pushes you away.
The Economics Work
Here's the part that matters for larger organizations: the economics actually pencil out.
If this team adopts Haiku for most of their work, the overage disappears entirely. Their bill could drop by roughly 45%, with room to spare. That's cutting costs in half while keeping developers happier.
At scale, those numbers get bigger fast. For teams with 50, 100, 200 developers running coding agents, Haiku isn't just a marginal improvement — it's a budget line that becomes sustainable.
And if something doesn't work? They can always fall back to Sonnet. The models coexist. You don't have to commit.
What This Actually Proves
This team didn't run a benchmark. They ran a production test with real developers doing real work. And Haiku passed.
That's the signal that matters. Not because Haiku is "better" than Sonnet, but because it showed that for most tasks, the smaller, faster, cheaper model is good enough.
Six months ago, you needed Sonnet for serious coding work. Today, you don't. The gap between the flagship and the small model is closing faster than anyone expected.
I switched to Haiku after trying it myself. Watching a real team adopt it and stick with it confirms what I suspected: this isn't just a personal preference. It's a shift in what's actually required to get work done.
You should try it. Your team probably should too.

