AI digest: Agent protocols and model reality checks

This week brought fresh infrastructure for AI agents alongside some humbling moments for the big model makers.

Meta delays Avocado model after failing to match rivals

Meta has postponed its next AI model, “Avocado”, because internal tests show it can’t keep up with rivals from Google and OpenAI. This is a proper reality check for Meta’s AI ambitions: falling behind with resources on that scale suggests the frontier is getting harder to reach. The Decoder has the details.

Grok 4.20 sets new record for not hallucinating

xAI’s Grok 4.20 trails GPT and Gemini on benchmarks but achieves the lowest hallucination rate of any tested model. That’s quite interesting: trading raw performance for reliability could be the right move for many practical applications. Sometimes being consistently correct beats being occasionally brilliant.

ChatGPT’s market dominance shrinks as Gemini gains ground

OpenAI’s market share has dropped from 75.7% to 61.7% over twelve months, with Google Gemini making the biggest gains, according to Similarweb data. OpenAI is still leading but losing ground fast, and the chatbot wars are properly heating up.
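
For a sense of scale, the two figures quoted above can be turned into a drop in percentage points and a relative decline. This is only arithmetic on the 75.7% and 61.7% numbers; the variable names are made up for the sketch.

```python
# Market share figures quoted above (percent of the market).
before = 75.7
after = 61.7

# Absolute change, in percentage points.
point_drop = round(before - after, 1)

# Relative change: how much of the original share was lost.
relative_drop = round(point_drop / before * 100, 1)

print(f"Drop: {point_drop} points, or {relative_drop}% of the earlier share")
```

So the 14-point fall amounts to roughly 18.5% of the share held a year earlier.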

MCP versus AI agent skills compared

A deep dive comparing Model Context Protocol (MCP) with traditional AI agent skills shows how the two differ on tool integration and behavioural guidance for LLMs. MCP offers more structured access to external tools, while skills provide embedded behavioural patterns. Worth reading if you’re building agent systems.
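
To make the distinction concrete, here is a rough sketch in plain Python. It does not use any real MCP SDK: the `Tool` and `Skill` classes, `build_system_prompt`, `call_tool` and the `get_weather` example are all hypothetical names invented for illustration. The only thing borrowed from MCP is the general idea that a tool is described by a name, a description and an input schema.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Tool:
    """External capability described by a schema (MCP-style, heavily simplified)."""
    name: str
    description: str
    input_schema: dict
    handler: Callable[[dict], str]


@dataclass
class Skill:
    """Behavioural guidance that gets embedded in the model's instructions."""
    name: str
    instructions: str


def build_system_prompt(base: str, skills: list[Skill]) -> str:
    # Skills shape behaviour by being folded into the prompt text.
    parts = [base] + [f"[{s.name}] {s.instructions}" for s in skills]
    return "\n".join(parts)


def call_tool(tools: dict[str, Tool], name: str, args: dict) -> str:
    # Tools are invoked by name; required arguments are checked against the schema.
    tool = tools[name]
    missing = [k for k in tool.input_schema.get("required", []) if k not in args]
    if missing:
        raise ValueError(f"missing arguments: {missing}")
    return tool.handler(args)


# Hypothetical tool: returns a canned answer instead of calling a real service.
weather = Tool(
    name="get_weather",
    description="Return a canned forecast for a city.",
    input_schema={
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
    handler=lambda a: f"Sunny in {a['city']}",
)

# Hypothetical skill: pure guidance, no external call involved.
tone = Skill(name="concise", instructions="Answer in two sentences or fewer.")

prompt = build_system_prompt("You are a helpful agent.", [tone])
result = call_tool({weather.name: weather}, "get_weather", {"city": "Singapore"})
```

In this toy version the skill only ever changes the text the model sees, while the tool is a separately described function that is called with validated arguments. Real agent frameworks will differ in the details.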

Mastercard completes first live AI agent payment

Mastercard pulled off authenticated agent-based payments in Singapore with DBS and UOB banks, moving autonomous commerce from demo to reality. This transaction shows the infrastructure is finally catching up to the hype around AI agents handling real money.
