2 posts tagged sparse-autoencoders.
We're not making AI more transparent; we're just building better debugging tools for black boxes.
Sparse autoencoders that turn LLM black-box internals into interpretable features you can actually use.