LocalLLaMA

GLM 5.1 tops the code arena rankings for open models

/u/Auralore / April 10, 2026

submitted by /u/Auralore [link] [comments]

Creating Pi Extension with Pi and Qwen3.5 27B

/u/FeiX7 / April 10, 2026

Following my latest post about setting up Claude Code to be used with Local Models I received a recommendation in the comments to try **Pi**. The suggestion was based on its customizability and superior harness for local models. Unlike Claude Code, whi…

LocalLLaMA

non-nvidia gpus

/u/Ok-Secret5233 / April 10, 2026

Because I'm cheap, I'm seeing if non-nvidia gpus are worth the effort. Here's the article that got me thinking: https://www.hardware-corner.net/huawei-atlas-300i-duo-96gb-llm-20250830/ Anybody want to add anything from experience? su…

LocalLLaMA

locally uncensored v2.3.0 – added glm 5.1, qwen 3.5, gemma 4 and hardware-aware model recommendations

/u/GroundbreakingMall54 / April 10, 2026

shipped v2.3.0 this week. biggest things: new models: GLM 5.1, Qwen 3.5, Gemma 4 support added. glm 5.1 was integrated on release day because i was curious how it performs and honestly its pretty solid for the size hardware-aware onboarding: the app n…

LocalLLaMA

how are people actually debugging bad outputs in agent / RAG pipelines?

/u/YouSlow6554 / April 10, 2026

been messing around with some agent / RAG pipelines running into cases where everything executes fine (tool calls return expected outputs, parsing works etc.) but final answer is still wrong / slightly off nothing crashes, just bad outputs curious how …

LocalLLaMA

Can we talk about the reasoning token format chaos?

/u/ahinkle / April 10, 2026

Qwen/DeepSeek: <think>…</think> Gemma: <|channel>…<channel|> Ok weird but sure. Gemma again, sometimes: just bare thought\n with no delimiters at all vLLM has –reasoning-parser flags per model which helps but that's b…

LocalLLaMA

VoxCPM2 is out – 2B params, 30 languages. Major upgrade over VoxCPM1.5.

/u/Downtown_Radish_8040 / April 10, 2026

OpenBMB just dropped VoxCPM2, the follow-up to their VoxCPM-0.5B. Big jump in scale and capabilities. OpenBMB just released VoxCPM2, a significant step up from VoxCPM1.5. VoxCPM1.5 → VoxCPM2: VoxCPM1.5 VoxCPM2 Params 0.5B Audio quality 44.1kHz …

LocalLLaMA

[Model Release] I trained a 9B model to be agentic Data Analyst (Qwen3.5-9B + LoRA). Base model failed 100%, this LoRA completes 89% of workflows without human intervention.

/u/Awkward_Run_9982 / April 10, 2026

Hey r/LocalLLaMA, Most of us know the struggle with local "Agentic" models. Even good ones at the 4B-14B scale are usually just glorified tool-callers. If you give them an open-ended prompt like "Analyze this dataset and give me in…

LocalLLaMA

Open-sourcing 23,759 cross-modal prompt injection payloads – splitting attacks across text, image, document, and audio

/u/BordairAPI / April 10, 2026

I've been researching what happens when you split a prompt injection across multiple input modalities instead of putting it all in one text field. The short answer: per-channel detection breaks completely. The idea is simple. Instead of sendi…

LocalLLaMA

gemma-4-26B-A4B with my coding agent Kon

/u/Weird_Search_4723 / April 10, 2026

Wanted to share my coding agent, which has been working great with these local models for simple tasks. https://github.com/0xku/kon It takes lots of inspiration from pi (simple harness), opencode (sparing little ui real state for tool calls – mos…