LocalLLaMA

Qwen 3.6 35B different quant speeds?

/u/cviperr33 / April 19, 2026

https://preview.redd.it/bixb4erga2wg1.png?width=1464&format=png&auto=webp&s=2df10ab305a5cf4c4252496ec3df34422359066b This is on an RTX 3090, llama.cpp main, Arch Linux. So what is everybody's experience so far? I've tested a few qu…

LocalLLaMA

What starts to become possible with two 3090s that wasn’t with just one?

/u/GotHereLateNameTaken / April 19, 2026

Qwen 3.6 has been working great and has got me wondering.

LocalLLaMA

Intel Arc B70 with HP Z640 workstation (PCIe 3)

/u/Serious_Rub_3674 / April 19, 2026

First-time local LLM user here! I’m running an old HP Z640 workstation with a dual Xeon E5 v4 setup (around 100GB of RAM). It used to have a Titan X Pascal GPU, but I swapped it out for an Arc B70. I’m not sure if the motherboard supports PCIe Resizable BAR (ReBAR), bu…

LocalLLaMA

Qwen 3.6 CoT issue?

/u/Confident_Ideal_5385 / April 19, 2026

So the Qwen vocab has distinct tokens for <think> and </think>. I know this because an app I wrote pushes those tokens to the cache after <|im_start|>assistant to stop CoT selectively. Great. Yesterday I was fucking around with some c…
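A rough illustration of the trick described above, working on prompt text instead of token IDs in the cache: the prompt ends with the assistant header, and an empty think block is appended when CoT should be skipped. The function name `build_prompt`, the `suppress_cot` flag, and the exact whitespace around the think tags are assumptions for illustration, not taken from the poster's app; check the model's own chat template before relying on them.

```python
# Minimal sketch, assuming a ChatML-style template and literal think tags.
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"
THINK_OPEN = "<think>"
THINK_CLOSE = "</think>"


def build_prompt(messages, suppress_cot=False):
    """Render (role, content) pairs and open the assistant turn.

    When suppress_cot is True, an empty think block is prefilled so the
    model continues straight into its answer (hypothetical helper).
    """
    parts = []
    for role, content in messages:
        parts.append(f"{IM_START}{role}\n{content}{IM_END}\n")
    # Open the assistant turn; generation continues from here.
    parts.append(f"{IM_START}assistant\n")
    if suppress_cot:
        # Assumed whitespace; the real template may differ.
        parts.append(f"{THINK_OPEN}\n\n{THINK_CLOSE}\n\n")
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt([("user", "What is 2 + 2?")], suppress_cot=True))
```

Doing this at the text level only works if the tokenizer maps the tag strings to the dedicated special tokens; pushing the token IDs directly, as the post describes, avoids that dependency.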

LocalLLaMA

I tested 8 LLMs as tabletop GMs – a 27B model beat the 405B on narrative quality

/u/Bobby_Gray / April 19, 2026

Some background: I've been working on an open source agentic tabletop GM as a leisure project intended to run on any LLM with tool support. I started it as a Claude Code skill to run D&D sessions and eventually generalized it to b…

LocalLLaMA

For chat and Q&A: Which MoE model is better: Qwen 3.6 35B or Gemma 4 26B (no coding or agents)

/u/br_web / April 19, 2026

Thanks

LocalLLaMA

Best use cases for a mismatched RTX 3090 (24GB) + RTX 3060 (12GB) setup?

/u/chucrutcito / April 19, 2026

Hey everyone, I have a system with 32GB of system RAM and two GPUs: an RTX 3090 (24GB) in the primary fast PCIe slot and an RTX 3060 (12GB) in a secondary, slower PCIe slot. I'm assuming that splitting a single large model across both cards is a bad idea b…

LocalLLaMA

Reachy Mini, amazing to build with the kid, painful experience with the applications

/u/pedroserapio / April 19, 2026

I was super curious about the Reachy Mini, so I got one, and this weekend my 12-year-old kid and I put the pieces together, just following the manual that came with the robot. Very easy to read, clear diagrams and instructions, we just did it really quic…

LocalLLaMA

I’m running qwen3.6-35b-a3b with 8 bit quant and 64k context thru OpenCode on my mbp m5 max 128gb and it’s as good as claude

/u/Medical_Lengthiness6 / April 19, 2026

Of course this is just a "trust me bro" post, but I've been testing various local models (a couple gemma4s, qwen3 coder next, nemotron) and I noticed the new qwen3.6 show up on LM Studio, so I hooked it up. VERY impressed. It's super fast to respon…

LocalLLaMA

Deep dive into LangGraph’s Pregel execution model, checkpointing internals, and DeepAgents

/u/laxmena / April 18, 2026

Wrote a long-form technical post on what’s actually happening under the LangGraph API. The main insight that surprised me: LangGraph’s StateGraph is a high-level abstraction over a Pregel runtime. The real primitives are actors (PregelNodes) and channe…
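To make the actors-and-channels idea concrete, here is a toy Pregel-style loop in plain Python. It does not use the LangGraph API; `run_supersteps`, the node table, and the channel names are all hypothetical. Each superstep picks the nodes subscribed to a channel that was just written, runs them against the same snapshot, then applies their writes together.

```python
from typing import Callable, Dict, List, Tuple

# A node is (subscribed channels, function from snapshot to channel writes).
Node = Tuple[List[str], Callable[[Dict[str, int]], Dict[str, int]]]


def run_supersteps(nodes: Dict[str, Node],
                   channels: Dict[str, int],
                   max_steps: int = 10) -> Dict[str, int]:
    """Toy superstep loop: plan, execute, update (illustrative only)."""
    channels = dict(channels)
    updated = set(channels)  # channels written in the previous step
    for _ in range(max_steps):
        # Plan: a node is active if any channel it subscribes to changed.
        active = [name for name, (subs, _) in nodes.items()
                  if updated & set(subs)]
        if not active:
            break
        # Execute: every active node reads the same frozen snapshot.
        snapshot = dict(channels)
        writes: Dict[str, int] = {}
        for name in active:
            _, fn = nodes[name]
            writes.update(fn(snapshot))
        # Update: writes become visible only at the next superstep.
        channels.update(writes)
        updated = set(writes)
    return channels


if __name__ == "__main__":
    nodes = {
        "double": (["x"], lambda s: {"y": s["x"] * 2}),
        "inc": (["y"], lambda s: {"z": s["y"] + 1}),
    }
    print(run_supersteps(nodes, {"x": 3}))
```

A real runtime would also need a rule for merging conflicting writes to one channel and a place to snapshot state between supersteps for checkpointing; this sketch leaves both out.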