
AI digest: agents get real, search gets lazy

Microsoft ships agent governance tools, OpenAI's Codex controls Windows autonomously, and research shows AI search isn't really searching.

The week’s biggest stories show AI agents moving from demos to deployment, but with mixed results on whether they actually work as advertised.

Microsoft releases agent governance toolkit for enterprise

Microsoft shipped an Agent Governance Toolkit that adds approval workflows, audit logs, and risk controls before agents can execute any tools. This matters because enterprises have been hesitant to deploy agents that can actually do things, not just chat. The toolkit includes identity checks, trust scores, and sensitivity levels for every action.
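To make the idea concrete, here is a minimal sketch of what a pre-execution gate could look like. It is not Microsoft's API: the `Gate`, `ToolCall`, and `Sensitivity` names, the 0.5 trust threshold, and the three-way allow / needs_approval / deny outcome are all assumptions made up for illustration. It only shows the general pattern of checking identity, trust, and sensitivity before a tool runs, and logging every decision.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Sensitivity(IntEnum):
    """Hypothetical sensitivity levels attached to each action."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class ToolCall:
    agent_id: str
    tool: str
    sensitivity: Sensitivity


@dataclass
class Gate:
    # Hypothetical per-agent trust scores in [0, 1].
    # An agent missing from this table fails the identity check.
    trust_scores: dict[str, float]
    min_trust: float = 0.5  # assumed threshold, not from the toolkit
    audit_log: list[tuple[str, str, str]] = field(default_factory=list)

    def decide(self, call: ToolCall) -> str:
        """Return 'allow', 'needs_approval', or 'deny' and record the decision."""
        if call.agent_id not in self.trust_scores:
            decision = "deny"  # unknown identity
        elif self.trust_scores[call.agent_id] < self.min_trust:
            decision = "deny"  # known agent, but not trusted enough
        elif call.sensitivity == Sensitivity.HIGH:
            decision = "needs_approval"  # route to a human approval workflow
        else:
            decision = "allow"
        # Every decision is logged, including denials.
        self.audit_log.append((call.agent_id, call.tool, decision))
        return decision
```

In this sketch a trusted agent reading an invoice would be allowed straight through, the same agent issuing a refund marked `HIGH` would be held for human approval, and an unknown or low-trust agent would be denied, with all three outcomes landing in `audit_log`.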

OpenAI’s Codex now controls Windows PCs autonomously

OpenAI’s Codex can now operate Windows 11 directly, clicking buttons, testing apps, and hunting bugs without human supervision. You can even start tasks remotely from your phone when you’re away from your PC. This is arguably the first mainstream AI tool that can use a computer the way a human would, which feels like a genuine breakthrough rather than another chatbot upgrade.

AI search agents fake research instead of actually searching

Research shows that leading AI search tools like GPT-5.4 and Kimi K2.6 mostly confirm what they already know rather than genuinely researching the web. They’re essentially performing theatre, making it look like they’re searching whilst just drawing from their training data. This explains why AI search often feels confident but wrong about recent events.

Anthropic bans AI tools in job interviews

Anthropic now prohibits candidates from using AI during its hiring process and runs up to five interview rounds to test actual thinking rather than prompt engineering skills. With salaries hitting £680,000 and candidates paying £3,700 for interview prep coaching, it’s clear the AI talent market has become absurdly competitive.
