TurboQuant on Apple Silicon: real benchmarks on Mac Mini M4 16GB and M3 Max 48GB
I’ve been testing TurboQuant this week on two machines and wanted to share the actual numbers. Why this matters: TurboQuant compresses the KV cache, not the model weights. On long contexts, KV cache can take several GB of memory, so reducing it c…