#inference
9 posts tagged inference.
News & Updates
Thoughts
On-device inference just turned the cloud into expensive nostalgia
Local AI frameworks are making cloud APIs look like dial-up modems in the broadband era.
Modular training just turned neural networks into LEGO blocks
Block-wise training frameworks are breaking monolithic models into independently trainable components, and it's about to change everything.
Quantisation just turned model deployment into digital archaeology
We're compressing models so aggressively that deployment has become an exercise in reconstructing what the original model was supposed to do.
Memory bandwidth just became the new CUDA bottleneck
Whilst everyone obsessed over compute cores, memory bandwidth quietly became the real constraint choking AI performance.
Reasoning phases are just expensive preprocessing with delusions of intelligence
Adding a 'thinking step' before generation is just prompt engineering disguised as architectural innovation.
On-device inference just became the only game worth playing
Google killing TensorFlow Lite for LiteRT proves the industry has finally picked a side in the deployment wars.