AI digest: Architecture breakthroughs and quantum leaps
Cross-datacenter LLM serving, quantum AI models, and the real cost of model upgrades.
This week brought some proper technical advances alongside the usual enterprise noise. Infrastructure innovation is finally catching up to model capabilities.
Cross-datacenter LLM serving breaks the box
Moonshot AI and Tsinghua researchers released PrfaaS, a system that splits prefill and decode across different datacenters. This breaks the assumption that LLM inference has to happen in a single location connected by high-bandwidth networks. Smart move that could dramatically change how we think about serving models at scale.
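To see why co-location was assumed in the first place, consider the KV cache that prefill produces and decode consumes: it has to move between the two stages. A minimal back-of-envelope sketch, using illustrative model dimensions (roughly Llama-2-7B-like, not taken from the PrfaaS paper) and an assumed inter-datacenter link speed:

```python
# Hedged sketch: estimate how big a prefill KV cache is and how long it
# takes to ship between datacenters. All dimensions and link speeds are
# illustrative assumptions, not figures from the PrfaaS work.

def kv_cache_bytes(seq_len, n_layers=32, n_kv_heads=32, head_dim=128,
                   bytes_per_elem=2):  # fp16 elements
    # Each layer stores one key and one value vector per token.
    return seq_len * n_layers * 2 * n_kv_heads * head_dim * bytes_per_elem

def transfer_seconds(seq_len, gbps=10):
    """Time to move the prefill KV cache over an inter-DC link."""
    bits = kv_cache_bytes(seq_len) * 8
    return bits / (gbps * 1e9)

cache_mb = kv_cache_bytes(4096) / 1e6
print(f"KV cache for a 4096-token prompt: {cache_mb:.0f} MB")
print(f"Transfer over an assumed 10 Gbps WAN: {transfer_seconds(4096):.2f} s")
```

Under these assumptions the cache for a single long prompt runs to gigabytes, which is why cross-datacenter disaggregation is a non-obvious systems result rather than a trivial deployment choice.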
NVIDIA launches quantum AI models
NVIDIA dropped NVIDIA Ising, their first open quantum AI model family for hybrid quantum-classical systems. This isn't a lab curiosity anymore. They're positioning quantum computing as a practical tool rather than future tech, which suggests the hardware gap might be closing faster than expected.
Opus 4.7’s hidden tokeniser tax
Early analysis shows Anthropic's Opus 4.7 costs significantly more to run than 4.6 despite flat per-token pricing: the new tokeniser breaks the same text into up to 47% more tokens. Classic enterprise pricing strategy: keep the unit price the same but change what counts as a unit.
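The arithmetic is simple but easy to miss on an invoice. A minimal sketch of the effect, where the 47% inflation figure comes from the analysis above but the per-million-token price is an illustrative placeholder, not Anthropic's actual rate:

```python
# Hedged sketch: same per-token price, more tokens per request.
# The 1.47 inflation factor is from the article; the $15/Mtok price
# is a placeholder assumption for illustration.

def effective_cost(tokens_old, price_per_mtok, inflation=1.47):
    """Return (old_cost, new_cost) for the same text under both tokenisers."""
    old_cost = tokens_old / 1e6 * price_per_mtok
    new_cost = (tokens_old * inflation) / 1e6 * price_per_mtok
    return old_cost, new_cost

old, new = effective_cost(2_000_000, price_per_mtok=15.0)
print(f"old tokeniser: ${old:.2f}, new tokeniser: ${new:.2f} "
      f"(+{new / old - 1:.0%})")
```

The unit price never moves, but the bill scales directly with the token inflation factor.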
OpenMythos reverse engineers Claude
An open-source project called OpenMythos attempts to reconstruct the Claude Mythos architecture from first principles. The 770M-parameter model allegedly matches the performance of a 1.3B-parameter transformer. Interesting if true, but reverse engineering without access to training details is mostly educated guessing.