LocalLLaMA

Gemma 4 for 16 GB VRAM

I think the 26B A4B MoE model is the best fit for 16 GB of VRAM. I tested many quantizations, and if you want to keep vision support, I think the best one currently is: https://huggingface.co/unsloth/gemma-4-26B-A4B-it-GGUF/blob/main/gemma-4-26B-A4B-it-UD-IQ4_XS.gguf …
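
For anyone who wants to sanity-check a quant against their own card, here is a rough back-of-the-envelope sketch. Everything in it is an assumption rather than a measurement: it takes "26B" at face value as 26e9 total parameters, uses a nominal ~4.25 bits per weight for an IQ4_XS-style quant (dynamic "UD" quants mix tensor types, so the real file will differ), and `reserve_gib` is a hypothetical placeholder for KV cache, vision projector, and runtime overhead. The actual file size on the model page is the number to trust.

```python
# Rough VRAM budget check for a quantized model (illustrative only).

GIB = 2**30  # bytes per GiB


def estimate_weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB."""
    return n_params * bits_per_weight / 8 / GIB


def fits_in_vram(weight_gib: float, vram_gib: float, reserve_gib: float) -> bool:
    """True if weights plus a reserved margin fit in the given VRAM."""
    return weight_gib + reserve_gib <= vram_gib


if __name__ == "__main__":
    # Assumed values, not taken from the model card.
    weights = estimate_weight_gib(n_params=26e9, bits_per_weight=4.25)
    print(f"estimated weights: {weights:.2f} GiB")
    for reserve in (2.0, 4.0):  # hypothetical margins for context + overhead
        ok = fits_in_vram(weights, vram_gib=16.0, reserve_gib=reserve)
        print(f"reserve {reserve:.1f} GiB -> fits in 16 GiB: {ok}")
```

Note that all experts of an MoE model have to be resident even though only about 4B parameters are active per token, so the total parameter count is what matters for the memory estimate.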