Gemma 4 – Going Mad – Help!!!

Hi All

I'm getting up to speed on LLMs and we are looking at Gemma 4.
We are using an M3 Ultra with 512 GB of unified memory, so no dangers there.

I'm using the opencode CLI for these tests. However, it doesn't appear to matter what I use; the results are the same. It's all around tool calling.

I have re-downloaded all the models this morning, after the fixes. These are the Unsloth ones.

I'm running llama.cpp, which I build on the server and which is bang up to date.

So in the opencode CLI, if I give it this prompt, it runs and does each task. All fantastic:

tell me all the background colours in use on the homepage tell me how many tests are in this system run all tests and feedback on any failures 

However if I do this:

- [ ] tell me all the background colours in use on the homepage
- [ ] tell me how many tests are in this system
- [ ] run all tests and feedback on any failures

It fails. I get the red error of doom:

~ Updating todos...

The todowrite tool was called with invalid arguments: [
  {
    "expected": "array",
    "code": "invalid_type",
    "path": ["todos"],
    "message": "Invalid input: expected array, received string"
  }
].

Please rewrite the input so it satisfies the expected schema.
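For what it's worth, the error itself says exactly what went wrong: the model emitted the `todos` argument as a JSON-encoded *string* rather than an actual array, which is a common tool-calling failure mode for quantized models. As a rough client-side sketch (a hypothetical helper, not part of opencode or llama.cpp), you could detect stringified arrays/objects in tool arguments and re-parse them before schema validation:

```python
import json

def coerce_tool_args(args: dict) -> dict:
    """Decode argument values that arrive as JSON-encoded strings.

    Some models return '["a", "b"]' (a string) where the tool schema
    expects an actual array; this re-parses such values.
    """
    fixed = {}
    for key, value in args.items():
        if isinstance(value, str):
            stripped = value.strip()
            if stripped.startswith(("[", "{")):
                try:
                    fixed[key] = json.loads(stripped)
                    continue
                except json.JSONDecodeError:
                    pass  # leave malformed strings untouched
        fixed[key] = value
    return fixed

# The shape of the failing todowrite call from the error above:
bad_call = {"todos": '[{"content": "run all tests", "status": "pending"}]'}
good_call = coerce_tool_args(bad_call)
print(type(good_call["todos"]))  # <class 'list'>
```

This obviously doesn't fix the model's output, but it can confirm whether the only problem is the string-vs-array encoding.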

The params I launched the server with are:

llama-server --model /Users/user/LLM_Models/gemma-4-31B-it-UD-Q5_K_XL.gguf \
  --port 8002 \
  --ctx-size 202752 \
  --parallel 2 \
  --n-gpu-layers 999 \
  --cache-type-k bf16 \
  --cache-type-v bf16 \
  --flash-attn on \
  --threads 16 \
  --threads-batch 16 \
  --temperature 1 \
  --top-p 0.95 \
  --top-k 64 \
  --min-p 0.01 \
  --reasoning off \
  --host 0.0.0.0 \
  --mlock
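One side note on those flags (an assumption about llama-server behaviour that's worth verifying against your build, not something I can confirm for certain): with `--parallel 2`, the total `--ctx-size` is typically divided across the parallel slots, so each request may get roughly half the window:

```python
# Hypothetical per-slot context arithmetic for the flags above,
# assuming llama-server splits --ctx-size evenly across --parallel slots.
ctx_size = 202752
parallel = 2

per_slot = ctx_size // parallel
print(per_slot)  # 101376 tokens per slot under that assumption
```

That shouldn't cause the schema error above, but it's useful to know when budgeting long agent sessions.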

I'm accessing this via Tailscale.

Please note I'm experimenting with all the Gemma models; this might not be the one we use moving forwards, so no need to highlight that!

Please can anyone tell me what on earth I'm doing wrong?!

submitted by /u/matyhaty