AI digest: Open models flex muscle
NVIDIA drops a massive open model whilst everyone else scrambles to keep costs under control.
Big week for open models, with NVIDIA leading the charge. Meanwhile, the industry’s wrestling with spiralling costs and figuring out what comes after chatbots.
NVIDIA releases 550B open model for agents
NVIDIA dropped Nemotron 3 Ultra, a 550B-parameter mixture-of-experts model designed for long-running agents. It’s got a 1-million-token context window and runs 6x faster than comparable models. By releasing the weights, training data, and recipes, NVIDIA is betting hard on open ecosystems rather than keeping everything locked up.
Stanford builds truly local AI with OpenJarvis
Stanford researchers released OpenJarvis, a framework that runs everything on-device, including inference, memory, and learning. It performs within 3.2 points of cloud models at 800x lower cost. This matters because it shows you don’t need massive server farms for decent AI agents anymore.
Google squeezes multimodal AI onto laptops
Google’s Gemma 4 12B can process text, images, and audio whilst running on just 16GB of RAM. It nearly matches their 26B model despite being less than half the size. It shows the race isn’t just about bigger models anymore; it’s about efficiency.
Companies missing AI savings targets
A Bain study found 40% of companies achieved less than 10% cost savings from AI, despite targeting much higher figures. Turns out humans keep getting in the way of automation. Not surprising, but it puts a damper on all the AI ROI hype we’ve been hearing.