The Radar
Sunday, 12 April 2026
Today's picks
MiniMax M2.7
AI Agents
Self-evolving agent model that scores 56.22% on SWE-Pro and 57.0% on Terminal Bench.
MiniMax just open-sourced their most capable model yet, and it's the first to actively participate in its own development cycle. The benchmarks are solid but the self-evolution angle is what makes this interesting. We're seeing models that can improve themselves rather than just follow instructions.
LFM2.5-VL-450M
AI Infrastructure
450M-parameter vision-language model with sub-250ms edge inference and bounding box prediction.
Liquid AI has cracked the code on proper edge vision models with this 450M-parameter update. Sub-250ms inference on Jetson Orin hardware, with bounding box prediction and function calling, is properly useful. Most vision models are still too heavy for real edge deployment; this one actually works where it matters.
Also on the radar
Hacker News
How We Broke Top AI Agent Benchmarks: And What Comes Next
371 pts · 94 comments
Berkeley researchers expose fundamental flaws in current AI agent benchmarks, showing how they can be gamed. This matters because the entire industry is optimising for metrics that don't reflect real-world agent capability. The post outlines what trustworthy agent evaluation actually looks like.
Ask HN: Do you trust AI agents with API keys / private keys?
5 pts · 5 comments
Community discussion about the security implications of giving AI agents access to sensitive credentials. A practical question that every team building with agents faces. The answers reveal how early we still are in figuring out agent security patterns.
Show HN: MCP is for tools. A2A is for agents. What's for websites?
4 pts · 0 comments
A proposal for standardising how AI agents interact with websites, building on existing protocols like MCP and A2A. The creator is trying to solve the missing piece in agent-web interaction. Early stage, but it addresses a real gap in the agent ecosystem.