
Agentic coding models just made software engineering a spectator sport

When models hit 70% on complex coding benchmarks, we're not optimising development anymore; we're just watching machines do our jobs.

The latest coding models are hitting 70% success rates on real-world software engineering tasks. We’re not talking about FizzBuzz or LeetCode problems. These are actual GitHub issues that require understanding existing codebases, making architectural decisions, and implementing multi-file changes. The engineering workflow just became a review process.

The benchmark tells the real story

SWE-bench Verified isn’t synthetic. It’s pulled from actual open source repositories where humans struggled with real problems. When a model can resolve 7 out of 10 of those issues without human intervention, the job description fundamentally changes. We’re not writing code anymore. We’re doing prompt engineering and quality assurance.

Engineering becomes curation

The shift is already happening in teams running these systems. Senior developers spend their time defining requirements and reviewing generated solutions rather than implementing them. Junior developers are finding their entry path blocked because the learning curve now requires understanding both traditional software engineering and agent orchestration. The career ladder just grew a new bottom rung that most people can’t reach.

We built tools to make us more productive. Instead, we built replacements that make us supervisors. The question isn’t whether these models will get better. It’s whether the remaining 30% of problems they can’t solve will be enough to justify keeping humans in the loop.
