LocalLLaMA

Qwen 3.6: worse adherence?

Just swapped Qwen 3.5 for the 3.6 variant (FP8, RTX 6000 Pro) using the same recommended generation settings. My stack is vLLM (v0.19.0) + Open WebUI (v0.8.12) in a RAG setup where the model has access to several document retrieval tools. After some i…

A new transformer variant has been created to facilitate more efficient model training in distributed settings: 128x compression with no significant loss in convergence rate and no increase in memory or compute overhead

Macrocosmos has released a paper on ResBM (Residual Bottleneck Models), a new transformer-based architecture designed for low-bandwidth pipeline-parallel training (https://arxiv.org/abs/2604.11947). ResBM introduces a residual encoder-decoder bottleneck …
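To make the headline number concrete, here is a minimal sketch of what a 128x activation bottleneck between pipeline stages could look like. All names, shapes, and the linear encoder/decoder are illustrative assumptions, not the paper's actual architecture; the point is only that the tensor crossing the slow interconnect shrinks by 128x while the original hidden size is restored on the other side.

```python
import numpy as np

# Illustrative sketch only: a linear encoder-decoder bottleneck that
# compresses activations 128x before they cross a pipeline boundary.
# Dimensions and weight shapes are assumptions, not taken from the paper.

rng = np.random.default_rng(0)

d_model = 1024                  # hidden size leaving stage i
d_bottleneck = d_model // 128   # 128x narrower channel sent over the wire

# Encoder lives on the sending stage, decoder on the receiving stage.
W_enc = rng.standard_normal((d_model, d_bottleneck)) / np.sqrt(d_model)
W_dec = rng.standard_normal((d_bottleneck, d_model)) / np.sqrt(d_bottleneck)

def compress(h):
    """Project activations down before crossing the interconnect."""
    return h @ W_enc            # (batch, 8) instead of (batch, 1024)

def decompress(z):
    """Project back up to the full hidden size on the next stage."""
    return z @ W_dec            # (batch, 1024) again

h = rng.standard_normal((4, d_model))   # a batch of stage-i activations
z = compress(h)                         # what actually gets transmitted
h_hat = decompress(z)                   # what stage i+1 computes on

ratio = h.size / z.size                 # bandwidth saving factor: 128.0
```

In a real training run the encoder/decoder would be learned jointly with the model so the bottleneck preserves what downstream stages need, which is presumably where the "no significant loss in convergence" claim comes from.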
