AI agents are overhyped but also inevitable

Every startup is an “AI agent” company now. Most of them are chatbots that call an API and pretend that’s autonomy. It’s not.

A real agent makes decisions, takes actions, handles errors, and knows when to stop. Most of the demos we’ve seen fail the “what happens when something goes wrong” test. They either crash, loop forever, or confidently do the wrong thing.
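To make those four properties concrete, here is a minimal sketch of an agent loop in Python. It is an illustration under stated assumptions, not how any particular product works: `Action`, `RunResult`, `run_agent`, `policy`, and the `max_steps` / `max_errors` budgets are all hypothetical names, and `policy` stands in for whatever model call decides the next move.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Action:
    """Hypothetical decision record; tool=None means the policy chose to stop."""
    tool: Optional[str]
    argument: str = ""


@dataclass
class RunResult:
    status: str  # "done", "step_limit", or "too_many_errors"
    history: List[str] = field(default_factory=list)


def run_agent(
    policy: Callable[[List[str]], Action],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 10,
    max_errors: int = 3,
) -> RunResult:
    """Decide, act, handle errors, and stop within fixed budgets."""
    history: List[str] = []
    errors = 0
    for _ in range(max_steps):
        action = policy(history)  # makes a decision
        if action.tool is None:  # knows when to stop
            return RunResult("done", history)
        try:
            # Takes an action; an unknown tool name raises KeyError here.
            output = tools[action.tool](action.argument)
            history.append(f"{action.tool} -> {output}")
        except Exception as exc:
            # Handles errors: record the failure so the policy can see it,
            # and give up instead of retrying forever.
            errors += 1
            history.append(f"{action.tool} failed: {exc!r}")
            if errors >= max_errors:
                return RunResult("too_many_errors", history)
    # The step budget is what prevents the "loop forever" failure mode.
    return RunResult("step_limit", history)
```

The point of the sketch is that every exit path is explicit. The loop cannot crash on a tool failure, cannot run past its step budget, and reports why it stopped. It does nothing about the third failure mode, confidently doing the wrong thing, which is the hard part.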

But

The underlying capability is real. Claude with tool use can genuinely reason about multi-step tasks. Give it the right tools and clear constraints and it’ll work through problems methodically. We’ve built agent loops that surprised us with how well they handled edge cases.

The gap

What’s missing is reliability. An agent that works 90% of the time is a toy. An agent that works 99.9% of the time is a product. We’re somewhere around 85% right now, depending on task complexity.
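One rough way to see why task complexity matters so much is a toy model: suppose a task takes several steps and each step succeeds independently with the same probability. That independence is an assumption made for illustration, not something we measured, and `task_success_rate` is a hypothetical helper. Under it, the end-to-end success rate is the per-step rate raised to the number of steps.

```python
def task_success_rate(per_step: float, steps: int) -> float:
    """End-to-end success rate if every step must succeed independently."""
    return per_step ** steps


# Compare per-step reliability levels across short and long tasks.
for per_step in (0.90, 0.99, 0.999):
    rates = [round(task_success_rate(per_step, n), 3) for n in (1, 5, 20)]
    print(per_step, rates)
```

Under this model a 90% per-step rate falls to roughly 12% over a 20-step task, while 99.9% per step still finishes about 98% of the time. Real failures are not independent, so treat the numbers as intuition, not prediction.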

The hype cycle will crash. Some startups will fold. And then, quietly, agents will start actually working. That’s usually how these things go.