I spent a few hours trying out the new Gemma 4 models, and one thing stood out pretty quickly: the difference between sizes is more noticeable than I expected.
Didn’t run any formal benchmarks, just hands-on usage.
Tested:
- Gemma-4-26B-A4B-it
- Gemma-4-31B-it
Mostly used them for:
- some coding (Python + small scripts)
- general prompts
- a bit of longer / slightly more complex instructions
🧠 31B (Gemma-4-31B-it)
This one feels a lot more stable once prompts get even a little complex.
- Better at following multi-step instructions
- Less likely to drift or “lose the thread”
- Coding outputs were more consistent
For simple stuff, it doesn’t feel massively different. But as soon as you stack a few requirements together, the gap shows up pretty clearly.
Downside is just what you’d expect: slower and more expensive.
⚡ 26B (Gemma-4-26B-A4B-it)
This one actually surprised me.
- Very fast and responsive
- Totally fine for most day-to-day use
- Feels good for quick testing / iteration
It does start to break down a bit on more layered prompts or when you need tighter reasoning, but nothing unexpected.
I ran both in a hosted notebook setup just to save time on local config.
Curious if others are seeing the same kind of gap, or if this depends a lot on the setup/use case.
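For anyone who wants to compare on their own setup, a side-by-side check could look roughly like this. It's only a minimal sketch: `compare_models` and the two `fake_*` functions are made-up placeholders, not a real Gemma API, so swap in whatever call your notebook or endpoint actually uses. It runs the same prompts through each model and records the output and wall-clock time.

```python
import time
from typing import Callable, Dict, List


def compare_models(
    models: Dict[str, Callable[[str], str]],
    prompts: List[str],
) -> List[dict]:
    """Run every prompt through every model and record output plus latency."""
    rows = []
    for prompt in prompts:
        for name, generate in models.items():
            start = time.perf_counter()
            output = generate(prompt)
            elapsed = time.perf_counter() - start
            rows.append(
                {"model": name, "prompt": prompt, "output": output, "seconds": elapsed}
            )
    return rows


# Hypothetical stand-ins: replace these with real calls to your hosted models.
def fake_26b(prompt: str) -> str:
    return f"[26B] {prompt}"


def fake_31b(prompt: str) -> str:
    return f"[31B] {prompt}"


prompts = [
    "Write a Python function that reverses a string.",
    "Do the same, add type hints, handle None, and include two tests.",
]

results = compare_models(
    {"Gemma-4-26B-A4B-it": fake_26b, "Gemma-4-31B-it": fake_31b},
    prompts,
)

for row in results:
    print(f"{row['model']:<20} {row['seconds']:.4f}s  {row['output'][:60]}")
```

Mixing one simple prompt with one that stacks several requirements is the point here, since that's where the gap showed up for me.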