Can we already use Google’s TurboQuant (TQ) for KV Cache in llama-server? Or are we waiting for a PR?
Hey everyone. Ever since Google announced TurboQuant, I've been following the news about its extreme compression without noticeable quality degradation. I see it mentioned constantly on this sub, but despite all the discussions…