#latency
3 posts tagged latency.
Thoughts
Cross-datacenter inference just split the monolith that never should have been one
Breaking prefill and decode across datacenters isn't innovation; it's just fixing a fundamental architectural mistake.
Latency budgets are the new Moore's law
Voice interfaces are forcing us to optimise for milliseconds instead of parameters, and it's changing everything about how we build AI systems.
Real-time voice models just made chatbots obsolete
The shift to live audio processing isn't an upgrade to existing chat interfaces; it's their complete replacement.