AI digest: production reality bites

The gap between AI demos and production keeps widening. This week brought sobering reality checks alongside the usual model releases.

LoRA fine-tuning breaks when tasks get complex

MarkTechPost’s deep dive explains why LoRA works brilliantly for style changes but fails spectacularly on complex reasoning tasks. The technique assumes the weight update a task needs is low-rank, meaning it can be captured in a small number of dimensions, and that simply isn’t true when you’re trying to teach new skills rather than tweak tone. This explains why so many production fine-tuning projects hit walls that nobody saw coming.
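
To make the low-rank assumption concrete, here is a minimal sketch of the standard LoRA formulation, in which the weight update is the product of two thin matrices scaled by alpha / rank. It is not taken from the MarkTechPost piece; the function names and the layer sizes are illustrative assumptions.

```python
# Sketch of the LoRA idea in pure Python. All names here are hypothetical,
# chosen for illustration; this is not code from any particular library.

def full_trainable_params(d_out: int, d_in: int) -> int:
    """Parameters updated when fine-tuning a whole d_out x d_in weight matrix."""
    return d_out * d_in


def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the two LoRA factors: B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


def matmul(x, y):
    """Plain nested-list matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def lora_update(B, A, alpha: float, rank: int):
    """Low-rank weight update: delta_W = (alpha / rank) * B @ A."""
    scale = alpha / rank
    return [[scale * v for v in row] for row in matmul(B, A)]


# Assumed example layer: 4096 x 4096 with rank 8.
full = full_trainable_params(4096, 4096)     # 16,777,216
lora = lora_trainable_params(4096, 4096, 8)  # 65,536
print(f"LoRA trains {lora / full:.4%} of the layer's parameters")

# A rank-1 update on a 2 x 2 layer: every row is a multiple of the same vector,
# so the update can only move the weights along a single direction.
delta = lora_update(B=[[1.0], [2.0]], A=[[3.0, 4.0]], alpha=1.0, rank=1)
det = delta[0][0] * delta[1][1] - delta[0][1] * delta[1][0]
print(delta, "determinant:", det)
```

The parameter saving is the appeal, and the zero determinant is the catch: however the two factors are trained, the update can never exceed the chosen rank. If a new skill needs changes spread across more directions than that, the adapter has no capacity to express them.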

Investment bankers reject all AI outputs as client-ready

A brutal new benchmark tested top models on actual junior banker tasks and found zero outputs ready for client delivery. Even GPT-5.4 and Claude Opus 4.6 couldn’t meet professional standards for financial analysis work. This isn’t about the models being bad. It’s about the massive gap between “impressive demo” and “I’ll stake my reputation on this output.”

GPT-5.5 tops benchmarks but costs 20% more

OpenAI’s latest model dominates the leaderboards but still hallucinates frequently and costs 20% more to run. They’ve also killed Codex again, folding it back into the main model, and warn developers to start prompting from scratch rather than carrying over old techniques.

Programming job growth halves since ChatGPT

A Federal Reserve study found US programmer job growth nearly halved since ChatGPT launched. The data’s stark but unsurprising given how much coding work has shifted to AI assistance. The question now is whether this stabilises or accelerates further.
