Does GLM-4.7 Flash Q4_K_M have problems with Claude or agent mode?

I'm brand new to local LLMs and started with GLM-4.7 Flash q4_K_M.

When I run it directly:

ollama run glm-4.7-flash:q4_K_M

it works pretty decently — nothing amazing, but usable and responsive.

The problem starts when I switch to the Claude interface with:

ollama launch claude --model glm-4.7-flash:q4_K_M

Suddenly the model feels super dumb. It has basically zero memory between messages, can't create or save files, and forgets everything from the previous turn.

Concrete example:

  • I asked it to “build a CLI Snake game in Python”. It gave me clean, working code.
  • Then I said “now create the file in the current folder”. It had **no idea** what Snake game I was talking about and started from scratch like it was a brand new chat.
  • I used the prompt (shown in the pictures) at the start of the chat to make it create files, but it never actually created a code file, even though it said "Files created successfully".
  • Another thing: if I give it a big prompt, it takes forever (10+ minutes) to respond, the response usually cuts off randomly without a full answer, and sometimes it gives no response at all.
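(Side note from my digging: one common cause of this "zero memory" behavior with agent frontends is apparently Ollama's default context window, which is small enough that an agent's long system prompt can push earlier turns out immediately. This is just a sketch of the workaround I found, assuming my model tag from above; `num_ctx` is Ollama's context-length parameter, and the 16384 value is an example, not a recommendation:

```
# Modelfile — build a variant of the model with a larger context window
FROM glm-4.7-flash:q4_K_M
PARAMETER num_ctx 16384
```

then create and run it with `ollama create glm-4.7-flash-16k -f Modelfile` and point the agent at `glm-4.7-flash-16k`. No idea yet if this fixes the file-creation part.)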

I also used the model (GLM) with Continue.dev in VS Code: it works fine in chat mode, but agent mode doesn't work.

Questions:

  1. Should I just upgrade to a stronger model? (I have 32 GB RAM, a 6 GB VRAM GPU, and run Fedora Linux.)
  2. Am I using the model wrong? I thought the “Claude” launcher was the way to get tool use / skills / file creation, but maybe that interface is not meant for this small model?
submitted by /u/Agent0o6
