AI digest: production reality bites
This week's stories show AI hitting the messy realities of production work, from LoRA's hidden assumptions to investment bankers rejecting every AI output.
The gap between AI demos and production keeps widening. This week brought sobering reality checks alongside the usual model releases.
LoRA fine-tuning breaks when tasks get complex
MarkTechPost’s deep dive explains why LoRA works brilliantly for style changes but fails spectacularly on complex reasoning tasks. The technique treats every model update the same way, as a low-rank change that can be captured in a small number of directions in weight space. That assumption simply isn’t true when you’re trying to teach new skills rather than tweaking tone. This explains why so many production fine-tuning projects hit walls that nobody saw coming.
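To make the low-rank assumption concrete, here is a minimal sketch in plain Python. It is not taken from the MarkTechPost piece. It relies only on the standard LoRA formulation, where the weight update is the product of two thin matrices B (d x r) and A (r x k). The helper names (`matmul`, `rank`, `lora_param_count`) and the toy matrices are hypothetical, chosen purely for illustration.

```python
from fractions import Fraction


def matmul(B, A):
    """Multiply a d x r matrix B by an r x k matrix A (lists of lists)."""
    return [[sum(b * a for b, a in zip(row, col)) for col in zip(*A)] for row in B]


def rank(M):
    """Matrix rank via Gaussian elimination, using exact fractions."""
    rows = [[Fraction(x) for x in row] for row in M]
    found = 0  # number of pivot rows found so far
    for col in range(len(rows[0])):
        pivot = next((i for i in range(found, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for i in range(len(rows)):
            if i != found and rows[i][col] != 0:
                factor = rows[i][col] / rows[found][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[found])]
        found += 1
    return found


def lora_param_count(d, k, r):
    """Trainable parameters in a rank-r LoRA adapter for a d x k weight matrix."""
    return r * (d + k)


# Toy example (hypothetical values): a rank-1 adapter on a 4 x 4 layer.
B = [[1], [2], [3], [4]]   # d x r
A = [[1, 0, -1, 2]]        # r x k
delta = matmul(B, A)       # the only kind of update this adapter can express

# A stand-in for an update that needs every direction: the 4 x 4 identity.
full_rank_target = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

print(rank(delta))             # never exceeds r
print(rank(full_rank_target))  # needs all 4 directions
print(lora_param_count(4096, 4096, 8), "trainable vs", 4096 * 4096, "in the full matrix")
```

The product B·A can never have rank above r, however long you train. That is where the parameter savings come from, and it is also the ceiling: if the update a task needs is spread across many directions, a small-r adapter cannot represent it.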
Investment bankers reject all AI outputs as client-ready
A brutal new benchmark tested top models on actual junior banker tasks and found zero outputs ready for client delivery. Even GPT-5.4 and Claude Opus 4.6 couldn’t meet professional standards for financial analysis work. This isn’t about the models being bad; it’s about the massive gap between “impressive demo” and “I’ll stake my reputation on this output.”
GPT-5.5 tops benchmarks but costs 20% more
OpenAI’s latest model dominates the leaderboards but still hallucinates frequently and costs 20% more to run. The company has also killed Codex again, folding it back into the main model, and warns developers to start prompting from scratch rather than carrying over old techniques.
Programming job growth nearly halves since ChatGPT
A Federal Reserve study found that US programmer job growth has nearly halved since ChatGPT launched. The data’s stark but unsurprising given how much coding work has shifted to AI assistance. The question now is whether the slowdown stabilises or accelerates further.