AI digest: Open models flex muscle

Big week for open models, with NVIDIA leading the charge. Meanwhile, the industry’s wrestling with spiralling costs and figuring out what comes after chatbots.

NVIDIA releases 550B open model for agents

NVIDIA dropped Nemotron 3 Ultra, a 550B-parameter mixture-of-experts model designed for long-running agents. It’s got a 1-million-token context window and runs 6x faster than comparable models. The fact that they’re releasing weights, training data, and recipes shows NVIDIA’s betting hard on open ecosystems rather than keeping everything locked up.

Stanford builds truly local AI with OpenJarvis

Stanford researchers released OpenJarvis, a framework that runs everything on-device, including inference, memory, and learning. It performs within 3.2 points of cloud models at 800x lower cost. This matters because it shows you don’t need massive server farms for decent AI agents anymore.

Google squeezes multimodal AI onto laptops

Google’s Gemma 4 12B can process text, images, and audio whilst running on just 16GB of RAM. It nearly matches their 26B model despite being less than half the size. It shows the race isn’t just about bigger models anymore; it’s about efficiency.

Companies missing AI savings targets

A Bain study found 40% of companies achieved less than 10% cost savings from AI, despite targeting much higher. Turns out humans keep getting in the way of automation. Not surprising, but puts a damper on all the AI ROI hype we’ve been hearing.

