AI digest: Architecture breakthroughs and quantum leaps
Cross-datacenter LLM serving, quantum AI models, and the real cost of model upgrades.
This week brought some proper technical advances alongside the usual enterprise noise. Infrastructure innovation is finally catching up to model capabilities.
Cross-datacenter LLM serving breaks the box
Moonshot AI and Tsinghua researchers released PrfaaS, a system that splits prefill and decode across different datacenters. This breaks the assumption that LLM inference has to happen in a single location connected by high-bandwidth networks. Smart move that could dramatically change how we think about serving models at scale.
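To see why co-location was assumed in the first place, consider the KV cache that prefill produces and decode consumes: it has to move between the two stages. A minimal back-of-envelope sketch, using illustrative model dimensions (roughly Llama-2-7B-like, not taken from the PrfaaS paper) and an assumed inter-datacenter link speed:

```python
# Hedged sketch: estimate how big a prefill KV cache is and how long it
# takes to ship between datacenters. All dimensions and link speeds are
# illustrative assumptions, not figures from the PrfaaS work.

def kv_cache_bytes(seq_len, n_layers=32, n_kv_heads=32, head_dim=128,
                   bytes_per_elem=2):  # fp16 elements
    # Each layer stores one key and one value vector per token.
    return seq_len * n_layers * 2 * n_kv_heads * head_dim * bytes_per_elem

def transfer_seconds(seq_len, gbps=10):
    """Time to move the prefill KV cache over an inter-DC link."""
    bits = kv_cache_bytes(seq_len) * 8
    return bits / (gbps * 1e9)

cache_mb = kv_cache_bytes(4096) / 1e6
print(f"KV cache for a 4096-token prompt: {cache_mb:.0f} MB")
print(f"Transfer over an assumed 10 Gbps WAN: {transfer_seconds(4096):.2f} s")
```

Under these assumptions the cache for a single long prompt runs to gigabytes, which is why cross-datacenter disaggregation is a non-obvious systems result rather than a trivial deployment choice.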
NVIDIA launches quantum AI models
NVIDIA dropped NVIDIA Ising, their first open quantum AI model family for hybrid quantum-classical systems. This isn't a lab curiosity anymore. They're positioning quantum computing as a practical tool rather than future tech, which suggests the hardware gap might be closing faster than expected.
Opus 4.7’s hidden tokeniser tax
Early analysis shows Anthropic's Opus 4.7 costs significantly more to run than 4.6 despite flat per-token pricing: the new tokeniser breaks the same text into up to 47% more tokens. Classic enterprise pricing strategy: keep the unit price the same but change what counts as a unit.
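The arithmetic is simple but easy to miss on an invoice. A minimal sketch of the effect, where the 47% inflation figure comes from the analysis above but the per-million-token price is an illustrative placeholder, not Anthropic's actual rate:

```python
# Hedged sketch: same per-token price, more tokens per request.
# The 1.47 inflation factor is from the article; the $15/Mtok price
# is a placeholder assumption for illustration.

def effective_cost(tokens_old, price_per_mtok, inflation=1.47):
    """Return (old_cost, new_cost) for the same text under both tokenisers."""
    old_cost = tokens_old / 1e6 * price_per_mtok
    new_cost = (tokens_old * inflation) / 1e6 * price_per_mtok
    return old_cost, new_cost

old, new = effective_cost(2_000_000, price_per_mtok=15.0)
print(f"old tokeniser: ${old:.2f}, new tokeniser: ${new:.2f} "
      f"(+{new / old - 1:.0%})")
```

The unit price never moves, but the bill scales directly with the token inflation factor.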
OpenMythos reverse engineers Claude
An open-source project called OpenMythos attempts to reconstruct the Claude Mythos architecture from first principles. The 770M-parameter model allegedly matches the performance of a 1.3B-parameter transformer. Interesting if true, but reverse engineering without access to training details is mostly educated guessing.