I’ve been exploring KV cache optimization beyond Top-K pruning.
Observation: Top-K pruning fails *selectively* - a handful of tokens cause large error spikes while the rest prune cheaply.
So I tried:
- entropy (selection)
- OLS (reconstruction)
- SVD (compression)
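
To make the pipeline concrete, here's a minimal NumPy sketch of how I imagine the three steps compose (this is my illustrative reconstruction, not the blog's actual code - the entropy scoring, OLS setup, and ranks are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def entropy_select(attn, k):
    """Selection: keep the k tokens whose attention columns have the
    lowest entropy (assumed heuristic: 'spiky' columns are the ones
    that cause error spikes when dropped)."""
    p = attn / attn.sum(axis=0, keepdims=True)   # normalize each column
    ent = -(p * np.log(p + 1e-12)).sum(axis=0)   # per-token entropy
    return np.sort(np.argsort(ent)[:k])          # lowest-entropy tokens

def ols_reconstruct(V, keep):
    """Reconstruction: approximate the dropped rows as a least-squares
    (OLS) combination of the kept rows, instead of zeroing them out."""
    dropped = np.setdiff1d(np.arange(V.shape[0]), keep)
    # solve V[keep].T @ W ~= V[dropped].T for the mixing weights W
    W, *_ = np.linalg.lstsq(V[keep].T, V[dropped].T, rcond=None)
    V_hat = V.copy()
    V_hat[dropped] = (V[keep].T @ W).T
    return V_hat

def svd_compress(V, rank):
    """Compression: store the cache as two low-rank SVD factors."""
    U, s, Vt = np.linalg.svd(V, full_matrices=False)
    return U[:, :rank] * s[:rank], Vt[:rank]

# toy cache: 64 tokens, head dim 32, 16 queries
V = rng.standard_normal((64, 32))
attn = rng.random((16, 64))
attn /= attn.sum(axis=1, keepdims=True)

keep = entropy_select(attn, k=48)
V_hat = ols_reconstruct(V, keep)
A, B = svd_compress(V_hat, rank=16)
err = np.linalg.norm(V - A @ B) / np.linalg.norm(V)
```

The memory win in this sketch comes from the factors `A` (64x16) and `B` (16x32) replacing the full 64x32 cache; the OLS step is what keeps the dropped tokens from becoming pure error.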
Early results:
- ~3× lower error at low memory
- avoids error spikes
- sometimes even lower memory use at equal error
Blog: https://jchandra.com/posts/hae-ols/
Still a prototype - would love feedback, especially where this might break.