AI digest: Agents get serious

OpenAI ships GPT-5.5 for autonomous work, Google fixes distributed training, and agents start learning from their mistakes.

Three big moves this week show AI systems getting more autonomous and reliable.

OpenAI releases GPT-5.5 as fully agentic model

OpenAI’s new GPT-5.5 targets the full stack of computer work without human supervision. The model scores 82.7% on Terminal-Bench 2.0 and 84.9% on GDPval, suggesting it can actually handle coding, research, and data analysis end-to-end. It feels like the first serious attempt at agents that don’t need babysitting at every step.

Google’s DiLoCo architecture survives hardware failures

DeepMind’s new training system sustains 88% goodput even under frequent hardware failures. Training massive models has long been a coordination nightmare: a single slow or failed chip can stall the entire run. DiLoCo’s asynchronous approach could be the breakthrough that makes distributed training actually reliable at scale.
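To make the idea concrete, here is a toy sketch of the DiLoCo-style recipe: each worker runs many local optimization steps independently, and communication happens only at rare "outer" steps that average the workers' parameter deltas. Everything below (the 1-D least-squares problem, the learning rates, the worker count) is illustrative, not Google's implementation.

```python
import random

def grad(w, x, y):
    # gradient of (w*x - y)^2 with respect to w, for a 1-D toy problem
    return 2 * (w * x - y) * x

def local_steps(w, data, lr=0.01, steps=50):
    # inner loop: a worker trains on its own, with no communication at all
    for _ in range(steps):
        x, y = random.choice(data)
        w -= lr * grad(w, x, y)
    return w

def diloco_round(global_w, workers_data, outer_lr=0.7):
    # each worker starts from the shared weights and drifts locally
    deltas = [local_steps(global_w, d) - global_w for d in workers_data]
    # outer step: apply the averaged delta. Communication happens only here,
    # so a slow or failed worker delays one sync, not every gradient step.
    return global_w + outer_lr * sum(deltas) / len(deltas)

random.seed(0)
data = [(x, 3.0 * x) for x in [0.5, 1.0, 1.5, 2.0]]  # true weight is 3.0
w = 0.0
for _ in range(10):
    w = diloco_round(w, [data, data, data, data])  # four simulated workers
print(round(w, 2))  # converges toward 3.0
```

The point of the structure is the communication ratio: hundreds of local steps per one synchronization, which is what lets goodput stay high when individual chips misbehave.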

Google’s ReasoningBank teaches agents to learn from failure

The new memory framework distils reasoning strategies from both successful and failed agent experiences. Most current agents forget everything after each task. ReasoningBank gives them persistent memory to build up problem-solving patterns over time, which could be the missing piece for agents that genuinely improve.
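A minimal sketch of what a ReasoningBank-style memory could look like. This is an assumption about the mechanism: the real framework reportedly distills strategies with an LLM and retrieves by embedding similarity, while this toy version fakes both with plain strings and keyword overlap.

```python
class ReasoningBank:
    """Toy persistent memory of distilled problem-solving strategies."""

    def __init__(self):
        # each item: (task keywords, distilled strategy, learned from success?)
        self.items = []

    def distill(self, task, strategy, success):
        # store a lesson from both successful and failed experiences
        self.items.append((set(task.lower().split()), strategy, success))

    def retrieve(self, task, k=2):
        # rank stored strategies by keyword overlap with the new task
        words = set(task.lower().split())
        ranked = sorted(self.items,
                        key=lambda item: len(item[0] & words),
                        reverse=True)
        return [strategy for _, strategy, _ in ranked[:k]]

bank = ReasoningBank()
bank.distill("book flight with airline site",
             "check date format before submitting", success=True)
bank.distill("scrape product prices",
             "avoid retrying on 403 responses", success=False)

# a new, related task pulls in the relevant past lesson
print(bank.retrieve("book hotel on travel site", k=1))
# -> ['check date format before submitting']
```

The design point is persistence across tasks: because failures are distilled too, the agent accumulates "what not to do" alongside winning strategies instead of starting from scratch each run.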

Google says 75% of its code is now AI-generated

Three-quarters of new code at Google comes from AI, then gets reviewed by humans. This isn’t just autocomplete anymore. It suggests we’re hitting a tipping point where AI writes most code and humans become editors and architects.
