A TurboQuant-ready llama.cpp fork with gfx906 optimizations for gfx906 users.

So this is my take on the TurboQuant trend. It's another llama.cpp fork, and it's vibe coded, but it works like a charm for me, so it may interest some of you. I'm currently adding Gemma4 architecture support, which should come soon. I'm not really aware of the benchmark standards in this community, so feel free to suggest some.

submitted by /u/Exact-Cupcake-2603
