Memory is the new context window

We’re watching the death of the ephemeral AI assistant. Every major release now ships with some form of persistent memory, from Anthropic’s Claude to the new wave of agent frameworks. The industry has finally figured out what anyone who’s worked with a forgetful colleague already knew: intelligence without memory is just expensive pattern matching.

The infrastructure follows the paradigm

This shift rewrites the entire stack. Instead of optimizing for token throughput, we’re building for state persistence. Memory management becomes as critical as model weights. The economics change too, from pay-per-prompt to subscription models that can amortize the cost of maintaining context across sessions.

Look at the recent agent releases. They all solve the same core problem: how do you maintain coherent state across interactions? The technical approaches vary, but the goal is identical.
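To make "coherent state across interactions" concrete, here is a minimal sketch of one possible approach: facts are written to a JSON file and replayed into the prompt of the next session. The MemoryStore class, its method names, and the prompt layout are all hypothetical, invented for illustration; no particular product or framework is known to work this way.

```python
import json
import tempfile
from pathlib import Path


class MemoryStore:
    """Hypothetical key-value memory that survives sessions via a JSON file."""

    def __init__(self, path):
        self.path = Path(path)
        # Reload whatever an earlier session wrote; start empty otherwise.
        self.facts = json.loads(self.path.read_text()) if self.path.exists() else {}

    def remember(self, key, value):
        # Persist on every write so a crash does not lose the memory.
        self.facts[key] = value
        self.path.write_text(json.dumps(self.facts))

    def build_prompt(self, user_message):
        # Prepend remembered facts so a stateless model sees prior context.
        lines = [f"{k}: {v}" for k, v in sorted(self.facts.items())]
        return "\n".join(["Known facts:", *lines, "User: " + user_message])


def demo():
    path = Path(tempfile.mkdtemp()) / "memory.json"
    first = MemoryStore(path)
    first.remember("name", "Ada")
    second = MemoryStore(path)  # a later session: new object, same file
    return second.build_prompt("What's my name?")
```

The second MemoryStore never saw the remember call, yet its prompt still carries the fact. That is the whole trick in miniature: the model stays stateless and the state lives beside it.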

But memory creates new problems

Persistent agents introduce complications that stateless models never had. Who owns the memories? How do you version-control an agent’s evolving worldview? What happens when an agent’s accumulated context becomes poisoned or outdated?
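Two of these questions, versioning and stale or poisoned context, can at least be sketched in code. The example below is an assumption-laden illustration, not a description of any shipping system: VersionedMemory, its snapshot-per-commit scheme, and the max_age cutoff are all hypothetical. It does not touch the ownership question.

```python
import copy


class VersionedMemory:
    """Hypothetical memory with snapshots for rollback and timestamps for expiry."""

    def __init__(self):
        self.facts = {}
        self.history = []  # snapshots, oldest first

    def write(self, key, value, now):
        # Record when each fact was written so it can later be judged stale.
        self.facts[key] = {"value": value, "written_at": now}

    def commit(self):
        # Store a deep copy so later writes cannot alter the snapshot.
        self.history.append(copy.deepcopy(self.facts))
        return len(self.history) - 1  # version number of this snapshot

    def rollback(self, version):
        # Discard everything written since the chosen snapshot.
        self.facts = copy.deepcopy(self.history[version])

    def fresh(self, now, max_age):
        # Return only facts written within the last max_age time units.
        return {
            key: fact["value"]
            for key, fact in self.facts.items()
            if now - fact["written_at"] <= max_age
        }


memory = VersionedMemory()
memory.write("timezone", "UTC", now=0)
good = memory.commit()
memory.write("timezone", "poisoned value", now=5)  # a bad write arrives
memory.rollback(good)                               # restore the known-good state
```

Full snapshots are the simplest thing that works for a sketch. A real system would more likely store diffs, and would still have to decide who is allowed to trigger a rollback.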

We’re trading the simplicity of clean-slate interactions for the messiness of ongoing relationships. That’s probably the right trade, but we’re still learning how to manage it at scale.
