AI digest: Agents get serious

OpenAI ships GPT-5.5 for autonomous work, Google fixes distributed training, and agents start learning from their mistakes.

Three big moves this week show AI systems getting more autonomous and reliable.

OpenAI releases GPT-5.5 as fully agentic model

OpenAI’s new GPT-5.5 targets the full stack of computer work without human supervision. The model scores 82.7% on Terminal-Bench 2.0 and 84.9% on GDPval, suggesting it can actually handle coding, research, and data analysis end-to-end. It feels like the first serious attempt at agents that don’t need babysitting at every step.

Google’s DiLoCo architecture survives hardware failures

DeepMind’s new training system sustains 88% goodput even under frequent hardware failures. Training massive models has long been a coordination nightmare: a single slow or failed chip can stall the entire run. DiLoCo’s asynchronous approach could be the breakthrough that makes distributed training actually reliable at scale.
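To make the idea concrete, here is a toy sketch of the DiLoCo-style recipe: each worker runs many local optimization steps independently, and communication happens only at rare "outer" steps that average the workers' parameter deltas. Everything below (the 1-D least-squares problem, the learning rates, the worker count) is illustrative, not Google's implementation.

```python
import random

def grad(w, x, y):
    # gradient of (w*x - y)^2 with respect to w, for a 1-D toy problem
    return 2 * (w * x - y) * x

def local_steps(w, data, lr=0.01, steps=50):
    # inner loop: a worker trains on its own, with no communication at all
    for _ in range(steps):
        x, y = random.choice(data)
        w -= lr * grad(w, x, y)
    return w

def diloco_round(global_w, workers_data, outer_lr=0.7):
    # each worker starts from the shared weights and drifts locally
    deltas = [local_steps(global_w, d) - global_w for d in workers_data]
    # outer step: apply the averaged delta. Communication happens only here,
    # so a slow or failed worker delays one sync, not every gradient step.
    return global_w + outer_lr * sum(deltas) / len(deltas)

random.seed(0)
data = [(x, 3.0 * x) for x in [0.5, 1.0, 1.5, 2.0]]  # true weight is 3.0
w = 0.0
for _ in range(10):
    w = diloco_round(w, [data, data, data, data])  # four simulated workers
print(round(w, 2))  # converges toward 3.0
```

The point of the structure is the communication ratio: hundreds of local steps per one synchronization, which is what lets goodput stay high when individual chips misbehave.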

Google’s ReasoningBank teaches agents to learn from failure

The new memory framework distils reasoning strategies from both successful and failed agent experiences. Most current agents forget everything after each task. ReasoningBank gives them persistent memory to build up problem-solving patterns over time, which could be the missing piece for agents that genuinely improve.
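A minimal sketch of what a ReasoningBank-style memory could look like. This is an assumption about the mechanism: the real framework reportedly distills strategies with an LLM and retrieves by embedding similarity, while this toy version fakes both with plain strings and keyword overlap.

```python
class ReasoningBank:
    """Toy persistent memory of distilled problem-solving strategies."""

    def __init__(self):
        # each item: (task keywords, distilled strategy, learned from success?)
        self.items = []

    def distill(self, task, strategy, success):
        # store a lesson from both successful and failed experiences
        self.items.append((set(task.lower().split()), strategy, success))

    def retrieve(self, task, k=2):
        # rank stored strategies by keyword overlap with the new task
        words = set(task.lower().split())
        ranked = sorted(self.items,
                        key=lambda item: len(item[0] & words),
                        reverse=True)
        return [strategy for _, strategy, _ in ranked[:k]]

bank = ReasoningBank()
bank.distill("book flight with airline site",
             "check date format before submitting", success=True)
bank.distill("scrape product prices",
             "avoid retrying on 403 responses", success=False)

# a new, related task pulls in the relevant past lesson
print(bank.retrieve("book hotel on travel site", k=1))
# -> ['check date format before submitting']
```

The design point is persistence across tasks: because failures are distilled too, the agent accumulates "what not to do" alongside winning strategies instead of starting from scratch each run.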

Google says 75% of its code is now AI-generated

Three-quarters of new code at Google comes from AI, then gets reviewed by humans. This isn’t just autocomplete anymore. It suggests we’re hitting a tipping point where AI writes most code and humans become editors and architects.
