I’ve updated my glorified Llama fork (LLM Inference Server) for P40’s to utilise MTP + TurboQuant + DFlash
By /u/Sakatard / May 16, 2026