AI digest: Agent invasion accelerates
Major players ship agent tools while Google doubles down on multimodal AI.
This week feels like the agent wars properly kicking off. Everyone’s shipping tools to build AI workers.
Google launches Gemini 3.1 Flash TTS with 70+ languages
Google dropped Gemini 3.1 Flash TTS, a text-to-speech model supporting over 70 languages with natural audio tags for controlling style and pace. This isn’t just about better voices, it’s about making AI interfaces feel properly conversational. The multi-speaker dialogue support suggests Google’s thinking about AI agents that can handle complex interactions.
OpenAI updates its Agents SDK for enterprise safety
OpenAI expanded its agent toolkit with sandbox support and new safety features for enterprise developers. Smart move given how nervous companies are about AI agents running wild in their systems. The timing suggests they’re racing to get enterprise-grade agent infrastructure out before competitors lock up the market.
Adobe turns Creative Suite into an AI agent
Adobe launched Firefly AI Assistant, which can orchestrate tasks across Photoshop, Premiere, and other Creative Cloud apps through chat. This feels like the first proper attempt at turning professional software into an AI-driven workflow. If it works well, it could completely change how creatives interact with their tools.
Google ships robotics model with enhanced reasoning
Gemini Robotics-ER 1.6 focuses on embodied reasoning for robots, including visual understanding and task planning. Google’s clearly betting that the next AI breakthrough won’t just be in chat, but in physical world interaction. The instrument reading capability suggests they’re targeting industrial applications first.