
AI digest: agents get serious, speed breaks records

AI agents show they can work autonomously for extended sessions, a trillion-parameter model passes 1000 tokens per second, and OpenAI files for an IPO.

The gap between AI demos and reality keeps shrinking. This week brought proper evidence that agents can do meaningful work, not just chat.

AI agents actually work for 26 minutes straight

Harvard and Perplexity tested autonomous agents against search assistants and found that agents work for 26 minutes per session, versus 33 seconds for search. That’s roughly 47 times longer: not a marginal improvement but a completely different category of capability. We’re finally seeing agents that can tackle proper multi-step tasks without constant hand-holding.

Xiaomi pushes trillion-parameter models past 1000 tokens per second

Xiaomi’s MiMo team achieved over 1000 tokens per second on a trillion-parameter model using just one 8-GPU node. That’s properly fast inference on commodity hardware rather than an exotic setup. Speed like this makes real-time applications with massive models actually feasible.

OpenAI files for IPO as the AI race heats up

OpenAI filed confidentially for an IPO, following Anthropic’s filing last week. The timing feels significant, with both companies rushing to go public. They are either confident about sustained growth or keen to raise capital before the market gets saturated. Whichever it is, the filings signal that the AI industry is maturing fast.

Google upgrades RAG with persistent context agents

Google Research added agentic retrieval-augmented generation (RAG) to Gemini Enterprise, with agents that keep searching until they have enough context for complex queries. The 34% accuracy improvement over standard RAG shows how much better AI gets when it can determine its own information needs rather than accepting whatever the first search returns.
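
For readers who want to picture the "keep searching until there is enough context" idea, here is a minimal Python sketch of that loop. It is not Google's implementation, whose details aren't given here. The function `agentic_retrieve`, its `search`, `has_enough` and `refine` callbacks, the `max_rounds` cap, and the toy corpus are all hypothetical names invented for illustration.

```python
from typing import Callable, List


def agentic_retrieve(
    query: str,
    search: Callable[[str], List[str]],
    has_enough: Callable[[str, List[str]], bool],
    refine: Callable[[str, List[str]], str],
    max_rounds: int = 5,
) -> List[str]:
    """Gather context over several searches instead of trusting the first one.

    All callbacks are assumptions for this sketch: `search` returns passages,
    `has_enough` judges whether the context suffices, and `refine` proposes
    the next query from what has been found so far.
    """
    context: List[str] = []
    current_query = query
    for _ in range(max_rounds):
        # Add any passages we have not seen yet.
        for passage in search(current_query):
            if passage not in context:
                context.append(passage)
        # Stop as soon as the agent judges the context sufficient.
        if has_enough(query, context):
            break
        # Otherwise ask a follow-up query and search again.
        current_query = refine(query, context)
    return context


# Toy stand-ins so the sketch runs end to end (purely illustrative data).
TOY_CORPUS = {
    "topic": ["passage A"],
    "topic details": ["passage B"],
}


def toy_search(q: str) -> List[str]:
    return TOY_CORPUS.get(q, [])


def toy_has_enough(q: str, ctx: List[str]) -> bool:
    return len(ctx) >= 2


def toy_refine(q: str, ctx: List[str]) -> str:
    return q + " details"


multi_round = agentic_retrieve("topic", toy_search, toy_has_enough, toy_refine)
single_round = agentic_retrieve(
    "topic", toy_search, toy_has_enough, toy_refine, max_rounds=1
)
```

With `max_rounds=1` the loop behaves like standard single-pass retrieval, so `single_round` holds only the first passage, while `multi_round` also picks up the follow-up result. In a real system the sufficiency check and query refinement would presumably be done by the model itself; the cap on rounds is there so the loop always terminates.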
