tl;dr: Fixes KV-cache rotation for hybrid-attention models like Gemma 4. (Not actually TurboQuant, but you can call it TurboQuant if that makes you feel better.)
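The post doesn't include any code, but for context, "KV-cache rotation" in quantization schemes of the TurboQuant/QuaRot family usually means multiplying keys and values by an orthogonal (often Hadamard) matrix along the head dimension to smear outliers before low-bit quantization. Below is a minimal PyTorch sketch of that general idea; the function names, the 4-bit setting, and the Sylvester Hadamard construction are illustrative assumptions, not the linked project's actual implementation.

```python
import torch

def hadamard_matrix(n: int) -> torch.Tensor:
    """Orthonormal Hadamard matrix via Sylvester construction.
    Assumes n (the head dimension) is a power of two."""
    H = torch.ones(1, 1)
    while H.shape[0] < n:
        H = torch.cat([torch.cat([H, H], dim=1),
                       torch.cat([H, -H], dim=1)], dim=0)
    return H / (n ** 0.5)

def rotate_and_quantize(kv: torch.Tensor, bits: int = 4):
    # kv: (..., head_dim). Rotate along head_dim to spread outliers,
    # then apply symmetric per-token integer quantization.
    H = hadamard_matrix(kv.shape[-1]).to(kv.dtype)
    rotated = kv @ H
    qmax = 2 ** (bits - 1) - 1
    scale = rotated.abs().amax(dim=-1, keepdim=True).clamp_min(1e-8) / qmax
    q = torch.clamp((rotated / scale).round(), -qmax - 1, qmax)
    return q.to(torch.int8), scale

def dequantize_and_unrotate(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    # H is orthonormal, so its inverse is its transpose.
    H = hadamard_matrix(q.shape[-1]).to(scale.dtype)
    return (q.to(scale.dtype) * scale) @ H.T
```

The hybrid-attention wrinkle the title alludes to is presumably that models in the Gemma line mix sliding-window and global-attention layers, so the rotation bookkeeping has to be applied consistently across both cache layouts; the sketch above deliberately stays layer-agnostic.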