AI digest: Models go local, agents go mainstream

Powerful models shrink to laptop size while AI agents get proper desktop apps and enterprise monitoring.

The week’s big theme is AI going from lab curiosity to everyday tool. Models are getting small enough for your laptop, agents are getting proper UIs, and companies are figuring out how to monitor it all.

Google’s Gemma 4 fits proper multimodal AI on a 16GB laptop

Google DeepMind released Gemma 4 12B, an encoder-free model that handles text, images and audio natively and runs locally. It ships under the Apache 2.0 licence. The clever bit is feeding vision and audio straight into the LLM backbone without separate encoders. This matters because it’s the first time we’ve seen properly capable multimodal AI that doesn’t need cloud compute or massive hardware.

Nvidia drops Cosmos 3 for physical AI

Nvidia’s Cosmos 3 combines an autoregressive VLM with a diffusion generator to handle physical reasoning, world generation and action planning in one model. It’s their bid to unify all the messy bits of robotics AI into a single foundation model. Whether this actually works better than specialised models remains to be seen, but the ambition is impressive.

Hermes Desktop brings AI agents to the masses

Nous Research launched Hermes Desktop, a proper GUI for their Hermes Agent that shares the same core, skills and memory as the CLI version. Finally, someone’s built an AI agent app that doesn’t require terminal wizardry. The streaming tool output is particularly neat for watching what the agent is actually doing.

Coralogix raises $200M to watch AI agents

Coralogix secured $200M, betting that companies will need serious monitoring tools as AI systems move into production. Smart timing, because enterprises are starting to deploy agents at scale and realising they have no idea what those agents are actually doing. Someone needs to watch the watchers.
