LocalLLaMA

Analysis of spilling MoE weights onto SSD: GLM-5 is surprisingly usable even with over a third of its weights left on SSD, due to caching dynamics

submitted by /u/SillyHats