Comparing Qwen3.5 27B vs Gemma 4 31B for agentic stuff

Models compared:

  • Qwen3.5-27B-UD-Q5_K_XL
  • gemma-4-31B-it-UD-Q5_K_XL

Main flags for both:

```shell
--flash-attn on \
--n-gpu-layers 99 \
--no-mmap \
-c 150000 \
--temp 1 --top-p 0.9 --min-p 0.1 --top-k 20 \
--ctx-checkpoints 1 \
--jinja \
-np 1 \
--reasoning on \
--mmproj 'mmproj-BF16.gguf' \
--image-min-tokens 300 --image-max-tokens 512
```
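Assembled into a full `llama-server` invocation, the flags above would look roughly like this. The model filename, `--host`, and `--port` values are assumptions for illustration, not from the original runs — substitute your own GGUF paths:

```shell
#!/bin/sh
# Sketch of a full llama-server launch using the flags from this post.
# Model path, host, and port are placeholders/assumptions.
llama-server \
  --model 'Qwen3.5-27B-UD-Q5_K_XL.gguf' \
  --mmproj 'mmproj-BF16.gguf' \
  --flash-attn on \
  --n-gpu-layers 99 \
  --no-mmap \
  -c 150000 \
  --temp 1 --top-p 0.9 --min-p 0.1 --top-k 20 \
  --ctx-checkpoints 1 \
  --jinja \
  -np 1 \
  --reasoning on \
  --image-min-tokens 300 --image-max-tokens 512 \
  --host 127.0.0.1 --port 8080
```

Swap in `gemma-4-31B-it-UD-Q5_K_XL.gguf` (and its matching mmproj) to run the Gemma side of the comparison with identical settings.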

I know these may not be the best settings and I still need to run more experiments (thank you u/Sadman782), but I find these tests fun and interesting.

| Model | Observations |
| --- | --- |
| Qwen3.5-27B-UD-Q5_K_XL | Takes more steps, checks env vars, and corrects its failures to fully address the request, so the final result is good (in the example, the Telegram message is perfect). Sometimes creates a Python script instead of bash only. |
| gemma-4-31B-it-UD-Q5_K_XL | More direct (smarter at finding URLs) but may miss the final goal (in this example, the Telegram message was truncated). |

Please let me know if you need more tests.
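If you want to reproduce a test yourself, `llama-server` exposes an OpenAI-compatible HTTP API once it is running; a minimal smoke test might look like this (the host/port and the prompt are assumptions — adjust them to your own setup):

```shell
# Minimal smoke test against llama-server's OpenAI-compatible endpoint.
# Host and port are assumptions; match them to your --host/--port flags.
curl -s http://127.0.0.1:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
        "messages": [
          {"role": "user", "content": "List three uses of the ls command."}
        ],
        "temperature": 1,
        "top_p": 0.9,
        "max_tokens": 256
      }'
```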

https://preview.redd.it/281gn3pddzug1.png?width=1827&format=png&auto=webp&s=7ced859b3cac05ea8fddd0c2ce7a3ea54c9f046b

https://preview.redd.it/nxzhv4pddzug1.png?width=1827&format=png&auto=webp&s=b0ad50fdff1fe9615fd4794040173391ba72fe76

submitted by /u/takoulseum
