1 post tagged memory-optimisation.
Google's TurboQuant and the rush to compress KV caches are treating symptoms whilst ignoring the real problem.