AI digest: Military contracts and model decay
OpenAI signs Pentagon deals while Anthropic fights back, plus new research shows even frontier models decay in long conversations.
Big week for AI politics and some sobering technical findings.
OpenAI embraces military contracts as Anthropic fights Pentagon ban
OpenAI signed a deal with the Pentagon for classified AI networks just hours after Anthropic was banned from federal agencies. Anthropic got labelled a “supply chain risk” after refusing to build autonomous weapons and surveillance tools, and they’re taking it to court. The timing feels deliberate, and it’s fascinating to watch these companies take such different stances on military applications.
Even GPT-5 gets worse the longer you chat
New research shows frontier models lose up to 33% accuracy in extended conversations, including GPT-5.2 and Claude 4.6. This isn't just about context windows; the models genuinely degrade as chats go on. Anyone who's had a long coding session with Claude will recognise this, but seeing it quantified across the latest models is sobering.
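The paper's exact methodology isn't described here, but the general shape of this kind of measurement is simple: ask questions one per turn, carry the full history forward, and track correctness by turn index. A minimal sketch, with a stubbed model standing in for a real chat API (the stub and its cutoff are purely illustrative, not the study's setup):

```python
# Hedged sketch: measuring per-turn accuracy in a long conversation.
# answer_fn is a stand-in for a chat-model call that receives the
# growing message history; here it is a toy stub.

def run_long_conversation_eval(questions, answer_fn):
    """Ask questions one per turn, carrying the full history forward,
    and record whether each turn's answer was correct."""
    history = []
    per_turn_correct = []
    for question, expected in questions:
        history.append({"role": "user", "content": question})
        answer = answer_fn(history)  # real setup: send history to a model
        history.append({"role": "assistant", "content": answer})
        per_turn_correct.append(answer == expected)
    return per_turn_correct

# Toy stub: answers correctly until the history grows past a cutoff,
# mimicking the degradation pattern the research describes.
def stub_model(history):
    question = history[-1]["content"]
    return question.upper() if len(history) < 6 else "???"

questions = [(f"q{i}", f"Q{i}") for i in range(5)]
print(run_long_conversation_eval(questions, stub_model))
```

Plotting that list of booleans against turn index is all it takes to see whether accuracy falls off as the conversation lengthens.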
DeepCoder claims o3-mini performance at 14B parameters
A new open-source coding model called DeepCoder supposedly matches o3-mini's performance with just 14 billion parameters. If true, this could be huge for running capable coding assistants locally. The claims need proper verification, but the trend toward smaller, more efficient models continues.
Banks test agentic AI for trade surveillance
Goldman Sachs and Deutsche Bank are piloting AI agents that reason through trading patterns in real time rather than just following preset rules. This feels like where agentic AI might actually prove its worth, handling complex pattern recognition in regulated environments where the stakes matter.