Hey guys. Having a weird issue with the new DeepSeek V3.2 Unsloth GGUF via llama-server. The model starts reasoning fine, but the actual opening think tag is missing from the output stream. I just see the plain text reasoning, and then the closing tag at the end.
Because of this, Open WebUI doesn't collapse the thought block. I'm on a 512GB box, and the command is just llama-server -m model_name -t 32 --flash-attn on. I tried toggling reasoning on and off, but that didn't help.
Is the chat template broken in these specific GGUFs or am I missing a flag?
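In the meantime, the only stopgap I can think of is patching the text myself before it reaches the UI. This is just a rough sketch, not a real fix: it assumes the tags are literally `<think>` and `</think>` (I haven't confirmed what this GGUF's template uses), and `restore_think_tag` is a made-up helper name, not anything from llama.cpp or Open WebUI.

```python
# Assumed tag strings; swap in whatever the model's chat template really uses.
OPEN_TAG = "<think>"
CLOSE_TAG = "</think>"


def restore_think_tag(text: str) -> str:
    """Prepend the opening tag when only the closing tag is present.

    Works on a complete response string, not on a live token stream.
    """
    if CLOSE_TAG in text and OPEN_TAG not in text:
        return OPEN_TAG + text
    return text


if __name__ == "__main__":
    # Hypothetical response shaped like what I'm seeing: no opening tag.
    broken = "Let me work through this.</think>The answer is 4."
    print(restore_think_tag(broken))
```

It only works once the full response is in hand, so it wouldn't help with collapsing the block while tokens are still streaming.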