The case for AI “Cooperatives”
submitted by /u/nunodonato
I'm running the raw version straight from the MiniMax release on Hugging Face (https://huggingface.co/MiniMaxAI/MiniMax-M2.7) on 3 RTX Pro 6000s with vLLM, so no quantization. And I'm not going to lie, something feels off about it. Same workloads …
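For context, a multi-GPU vLLM launch for an unquantized model like this typically looks like the following. This is a sketch, not the poster's actual command: the parallelism split and dtype are assumptions.

```shell
# Hypothetical launch across 3 GPUs. Tensor parallelism requires the
# attention head count to be divisible by the GPU count, so with an odd
# number of cards pipeline parallelism is one way to split the model.
vllm serve MiniMaxAI/MiniMax-M2.7 \
  --pipeline-parallel-size 3 \
  --dtype bfloat16            # unquantized weights, matching the post
```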
TheTom's GPU-accelerated turboquant (turbo3) has unlocked high-context gains for the 35BA3B family. I can now achieve ~40 tg/s via the following GPU-poor compilation flags and configuration: cmake -B build -DGGML_CUDA=ON -DGGML_…
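The flag list in the snippet is cut off, but the standard CUDA-enabled llama.cpp build it appears to start from is shown below. This reproduces only the documented baseline; the turbo3-specific flags from the post are truncated and not guessed at here.

```shell
# Baseline GPU build of llama.cpp with the CUDA backend enabled.
cmake -B build -DGGML_CUDA=ON
# Compile in Release mode using all available cores.
cmake --build build --config Release -j
```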
I had previously posted here about a fix to their 3.5 template to resolve the KV-cache invalidation issue the template caused. A lot of you found it useful. Qwen 3.6 now addresses this with a new preserve_thinking flag. From their model page…
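The invalidation issue is prefix-dependent: during a turn, the KV cache holds the tokens the model actually generated, thinking blocks included. If the template then strips those blocks when re-rendering history for the next turn, the new prompt no longer shares a prefix with the cache, and everything must be recomputed. A minimal Python illustration follows; the toy template strings and the `render` helper are invented for the demo and are not Qwen's actual template.

```python
import re

def render(history, preserve_thinking):
    """Render a toy chat transcript; optionally strip <think> blocks
    from assistant turns, as many chat templates do by default."""
    out = []
    for role, text in history:
        if role == "assistant" and not preserve_thinking:
            text = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)
        out.append(f"<|{role}|>{text}")
    return "".join(out)

turn1 = [("user", "hi"), ("assistant", "<think>plan</think>hello")]
turn2 = turn1 + [("user", "more")]

# What the KV cache holds after turn 1: the raw output, thinking included.
cached = render(turn1, preserve_thinking=True)

# Stripping thinking rewrites the old prefix -> cache miss, full recompute.
assert not render(turn2, preserve_thinking=False).startswith(cached)
# Preserving thinking keeps the rendered prefix stable -> cache reusable.
assert render(turn2, preserve_thinking=True).startswith(cached)
```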
OK, so Imagen (the original one from 2022, not Imagen 3/4) should be open source. The Gemini 1.0 Nano and Gemini 1.0 Pro models should be open source. xAI already open-sourced Grok-1, but Google? At this point you should open source th…
I'm very impressed with this model so far. I've been working with it for an hour or two, and it's a machine. It leaves no stone unturned, studying and digging deep into the code to find the problem. The summaries it creates are very detailed…
Follow-up to my previous post about why AI agents should not control machines through text. The idea: every AI agent today generates human text, parses it, then executes it. That's like controlling a robot arm by dictating English. Tesla FSD …
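The loop being criticized can be sketched in a few lines: the model emits text, which must be parsed back into structure before anything can act on it. The action format and function names below are invented for illustration, not any real agent framework's API.

```python
import json

def execute(action):
    """Toy executor: the 'machine' only understands structured commands."""
    if action["op"] == "move_arm":
        return ("moved", action["dx"], action["dy"])
    raise ValueError(f"unknown op: {action['op']}")

# Today's pattern, as the post describes it: generate text, parse it,
# then execute it -- a lossy round-trip through a human-readable format.
model_output_text = '{"op": "move_arm", "dx": 3, "dy": -1}'
action = json.loads(model_output_text)   # text -> structure
result = execute(action)                 # structure -> actuation
assert result == ("moved", 3, -1)
```

The post's objection is to the middle step: the structure existed inside the model, was serialized to text, and had to be recovered by a parser before anything moved.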
The data has been slowly building up and points to a likely economic conclusion: Anthropic is effectively constructively terminating its Max subscription plans, with the eventual goal of an enterprise-first (or enterprise-only) focus, planning …
submitted by /u/johnnyApplePRNG
What about the Phi series from Microsoft? It used to be really good. submitted by /u/StandardLovers