Anyone got Gemma 4 26B-A4B running on vLLM?

If yes, which quantized model are you using, and what's your `vllm serve` command?

I've been struggling to get that model up and running on my DGX Spark GB10. I tried the Intel int4 quant of the 31B and it seems to work well, but it's way too slow.
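For context, this is roughly the shape of the command I've been using (the model path is a placeholder for whatever quantized checkpoint you're serving; the flags are standard vLLM serve options, tuned here for a single-GPU box):

```shell
# Placeholder model path -- substitute the actual quantized repo/checkpoint.
# --max-model-len caps the context window to save KV-cache memory;
# --gpu-memory-utilization leaves headroom on the GB10's unified memory.
vllm serve <quantized-model-repo>/<model-name> \
  --max-model-len 8192 \
  --gpu-memory-utilization 0.85
```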

Anyone have any luck with the 26B?

submitted by /u/toughcentaur9018