AI digest: Edge models and security reality checks
Small models get smarter, researchers tackle memory bottlenecks, and AI security moves from theory to practice.
This week brought practical advances in edge AI and some sobering reminders about security gaps that need fixing now.
Liquid AI shrinks vision-language models down to 450M parameters
Liquid AI’s new LFM2.5-VL-450M packs vision understanding, bounding box detection, and multilingual support into a model that runs inference locally in under 250ms. This matters because it’s small enough for real edge deployment whilst still handling genuinely useful multimodal tasks. We’re finally seeing the payoff from all that efficiency research.
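For a sense of what on-device inference with a model like this looks like, here’s a minimal sketch using Hugging Face transformers. The repo id, processor usage, and prompt are assumptions based on the model name in the announcement, not confirmed details of Liquid AI’s release; check the model card for the actual loading instructions.

```python
# A minimal sketch of local inference with a small vision-language model via
# Hugging Face transformers. The repo id below is an assumption inferred from
# the model name, not a confirmed identifier.
from transformers import AutoProcessor, AutoModelForImageTextToText
from PIL import Image

MODEL_ID = "LiquidAI/LFM2.5-VL-450M"  # hypothetical repo id

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForImageTextToText.from_pretrained(MODEL_ID)

image = Image.open("photo.jpg")  # any local image
conversation = [
    {"role": "user", "content": [
        {"type": "image", "image": image},
        {"type": "text", "text": "Describe this image and list visible objects."},
    ]},
]

# Build multimodal inputs from the chat template, then generate locally.
inputs = processor.apply_chat_template(
    conversation, add_generation_prompt=True,
    tokenize=True, return_dict=True, return_tensors="pt",
)
output = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(output, skip_special_tokens=True)[0])
```

At 450M parameters the whole thing fits comfortably in a few hundred megabytes of memory at reduced precision, which is what makes the sub-250ms edge latency claim plausible in the first place.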
TriAttention tackles the KV cache memory wall
Researchers from MIT, NVIDIA, and Zhejiang University developed TriAttention, a KV cache compression method that maintains full-attention quality whilst achieving 2.5x higher throughput. The KV cache grows linearly with sequence length, so long reasoning chains saturate GPU memory long before compute runs out; easing that bottleneck could make reasoning models like DeepSeek-R1 far more practical to deploy.
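The arithmetic makes the memory wall concrete. A plain multi-head attention cache stores one key and one value vector per layer, per head, per token. A back-of-the-envelope sketch, using made-up but plausible dimensions rather than any published model’s specs:

```python
# Back-of-the-envelope KV cache sizing for plain multi-head attention.
# The dimensions below are illustrative, not any published model's specs.

def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, dtype_bytes: int = 2) -> int:
    """Cache size: one key and one value vector per layer, per head, per token."""
    return 2 * layers * kv_heads * head_dim * seq_len * dtype_bytes

# A hypothetical large model: 61 layers, 128 KV heads of dim 128, fp16 cache.
per_token = kv_cache_bytes(layers=61, kv_heads=128, head_dim=128, seq_len=1)
at_32k = kv_cache_bytes(layers=61, kv_heads=128, head_dim=128, seq_len=32_768)

print(f"{per_token / 1e6:.1f} MB per token")   # ~4.0 MB
print(f"{at_32k / 1e9:.0f} GB at 32k tokens")  # ~131 GB -- hence the memory wall
```

Grouped-query attention, quantisation, and compression methods like TriAttention all attack that multiplier from different angles; what’s notable here is the claim of doing it without an attention-quality trade-off.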
AI models prefer guessing to asking for help
New research shows that when AI models can’t see visual information clearly, they’ll confidently make things up rather than ask users for clarification. Out of 22 models tested on ProactiveBench, almost none asked for missing information. This is exactly the kind of behaviour that breaks trust in production systems, though the researchers hint that reinforcement learning might fix it.
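A toy version of the test is easy to picture: send a prompt the model cannot actually answer and check whether the response asks for help. The sketch below is a simplification in the spirit of the finding, not ProactiveBench’s actual protocol, and ask_model is a hypothetical stand-in for a real inference call.

```python
import re

# Crude heuristic: does the response ask for the missing information?
CLARIFY_PATTERNS = [
    r"\bcould you (provide|share|clarify)\b",
    r"\bwhich .* do you mean\b",
    r"\bi (can't|cannot) see\b",
    r"\bplease (upload|attach)\b",
]

def asks_for_clarification(response: str) -> bool:
    return any(re.search(p, response, re.IGNORECASE) for p in CLARIFY_PATTERNS)

def ask_model(prompt: str) -> str:
    # Hypothetical stand-in for a real inference call; swap in your own API.
    return "The car is red."

# Deliberately unanswerable: the prompt references an image that was never sent.
response = ask_model("What colour is the car in the image?")
print("asked for help" if asks_for_clarification(response) else "guessed anyway")
```

The paper’s finding, in these terms: on ambiguous inputs, nearly all 22 models land in the "guessed anyway" branch.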
Violence escalates around AI leadership
Someone threw a Molotov cocktail at Sam Altman’s home at 3:45am, prompting a reflective blog post from the OpenAI CEO. Meanwhile, Anthropic is keeping its most capable model private after the model uncovered thousands of cybersecurity vulnerabilities across major operating systems. The AI race is getting messier by the week.