LocalLLaMA

Gemma4 issue with winogrande bench

gemma-4-26B-A4B-it-Q4_K_M only reaches about 50% accuracy on winogrande-debiased-eval.csv when run with llama-perplexity, while qwen3.5-35B-A3B-IQ4_NL reaches 75%+ on the same benchmark. Yet in real-world tasks, the Gemma 4 model performs very well. Why does this discrepancy occur?
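For context, here is a minimal sketch of how a Winogrande-style eval typically scores one item: the blank `_` in the sentence is filled with each candidate option, each filled sentence is scored by the model's log-likelihood, and the higher-scoring option is the prediction. This is an assumption about the general scoring scheme, not llama-perplexity's actual code; `logprob_fn` and the toy "LM" below are hypothetical stand-ins for illustration. A model near 50% on this binary choice is effectively guessing, which is why the contrast with strong real-world performance is surprising.

```python
def score_sentence(sentence, logprob_fn):
    """Sum per-token log-probs; logprob_fn stands in for a real LM."""
    return sum(logprob_fn(tok) for tok in sentence.split())

def winogrande_predict(sentence, option1, option2, logprob_fn):
    """Fill the blank with each option and pick the higher-likelihood one."""
    s1 = score_sentence(sentence.replace("_", option1), logprob_fn)
    s2 = score_sentence(sentence.replace("_", option2), logprob_fn)
    return 1 if s1 >= s2 else 2

# Toy "LM" that happens to favor the token "trophy" (illustration only).
toy_lm = lambda tok: 0.0 if tok == "trophy" else -1.0

item = "The _ doesn't fit in the suitcase because it is too big."
print(winogrande_predict(item, "trophy", "suitcase", toy_lm))  # -> 1
```

Accuracy over the whole CSV is then just the fraction of items where the predicted option matches the labeled answer.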
