I’ll take an open-model release over a closed SOTA any day. Who’s with me?
submitted by /u/be566
Note: the first result is from Qwen3.5 35B MoE (left) and the second from Qwen3.6 (right). Hi guys, I just did a quick comparison of Qwen3.6 35B MoE against Qwen3.5 35B MoE, with reasoning off, using llama.cpp and the same quant (Unsloth Q4_K_XL GGUF). First is the Qwen3.5 outcome and…
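For readers who want to reproduce this kind of side-by-side run, here is a minimal sketch using llama-cpp-python. The GGUF filenames, the test prompt, and the use of a "/no_think" tag as the reasoning-off switch are all assumptions for illustration, not details from the post.

```python
# Minimal sketch: run the same prompt through two GGUF quants with reasoning off.
# Assumptions: llama-cpp-python is installed, the GGUF paths are hypothetical,
# and "/no_think" is the Qwen-style soft switch for disabling reasoning.
from llama_cpp import Llama

MODELS = {
    "Qwen3.5-35B-MoE": "qwen3.5-35b-moe-Q4_K_XL.gguf",  # hypothetical filename
    "Qwen3.6-35B-MoE": "qwen3.6-35b-moe-Q4_K_XL.gguf",  # hypothetical filename
}
PROMPT = "Write a single-file HTML/JS to-do app."        # example prompt, not the author's

for name, path in MODELS.items():
    llm = Llama(model_path=path, n_ctx=8192, n_gpu_layers=-1, verbose=False)
    out = llm.create_chat_completion(
        messages=[
            {"role": "system", "content": "/no_think"},  # reasoning off (assumed switch)
            {"role": "user", "content": PROMPT},
        ],
        temperature=0.7,
        max_tokens=2048,
    )
    print(f"=== {name} ===")
    print(out["choices"][0]["message"]["content"])
```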
This is my first test with this model, and Qwen impressed me. I'd rate it a 98% usable web OS, compared to my previous best of a 70% usable result from Qwen3 Next Coder at Q2. Yes, I know they train the models on these common prompts, yet this is the best resu…
Just swapped Qwen 3.5 for the 3.6 variant (FP8, RTX 6000 Pro) using the same recommended generation settings. My stack is vLLM (v0.19.0) + Open WebUI (v0.8.12) in a RAG setup where the model has access to several document retrieval tools. After some i…
submitted by /u/Synthetic_Diva_4556
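For anyone wiring up a similar stack, here is a minimal sketch of calling a vLLM OpenAI-compatible endpoint with a document-retrieval tool exposed to the model. The base URL, model name, and "search_documents" tool schema are illustrative assumptions, not the poster's actual configuration (Open WebUI normally handles this plumbing itself).

```python
# Minimal sketch: query a vLLM OpenAI-compatible server and expose one retrieval tool.
# Assumptions: vLLM is running its OpenAI-compatible server with tool calling enabled;
# the base_url, model name, and "search_documents" tool are hypothetical.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

tools = [{
    "type": "function",
    "function": {
        "name": "search_documents",          # hypothetical retrieval tool
        "description": "Search the indexed documents and return relevant passages.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

resp = client.chat.completions.create(
    model="Qwen3.6-35B-MoE-FP8",             # hypothetical served model name
    messages=[{"role": "user", "content": "Summarize our Q3 deployment notes."}],
    tools=tools,
)

msg = resp.choices[0].message
if msg.tool_calls:
    # The model asked to call the retrieval tool; run it and feed the result back.
    print("tool call:", msg.tool_calls[0].function.name, msg.tool_calls[0].function.arguments)
else:
    print(msg.content)
```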
Macrocosmos has released a paper on ResBM (Residual Bottleneck Models), a new transformer-based architecture designed for low-bandwidth pipeline-parallel training (https://arxiv.org/abs/2604.11947). ResBM introduces a residual encoder-decoder bottleneck …
submitted by /u/AppleBottmBeans
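The snippet above only names the idea, so here is a hedged PyTorch sketch of what a residual encoder-decoder bottleneck at a pipeline-stage boundary could look like: activations are down-projected before crossing the boundary (cutting inter-node traffic), updated residually in the compressed space, and up-projected on the receiving side. All dimensions, layer choices, and the overall mechanism are assumptions for illustration, not the architecture from the paper.

```python
# Hedged sketch of a residual bottleneck between pipeline stages (NOT the paper's code).
# Assumption: the residual stream is kept at a narrow bottleneck width around the
# stage boundary, so only the narrow tensor needs to be shipped between nodes.
import torch
import torch.nn as nn

class ResidualBottleneck(nn.Module):
    def __init__(self, d_model: int = 4096, d_bottleneck: int = 512):
        super().__init__()
        self.encode = nn.Linear(d_model, d_bottleneck)   # compress before the wire
        self.norm = nn.LayerNorm(d_bottleneck)
        self.mlp = nn.Sequential(                        # residual sub-block at bottleneck width
            nn.Linear(d_bottleneck, 4 * d_bottleneck),
            nn.GELU(),
            nn.Linear(4 * d_bottleneck, d_bottleneck),
        )
        self.decode = nn.Linear(d_bottleneck, d_model)   # expand on the receiving stage

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.encode(x)              # (batch, seq, d_bottleneck): what crosses the boundary
        z = z + self.mlp(self.norm(z))  # residual update in the compressed space
        return self.decode(z)           # back to full model width for the next stage's layers

# Example: with these (assumed) sizes, the tensor crossing the boundary is 8x smaller.
x = torch.randn(2, 16, 4096)
print(ResidualBottleneck()(x).shape)    # torch.Size([2, 16, 4096])
```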