LocalLLaMA

Computer build using Intel Optane Persistent Memory – Can run 1 trillion parameter model at over 4 tokens/sec

As the title states, my build can indeed run a 1-trillion-parameter model (in this case Kimi K2.5) locally at ~4 tokens/second. I thought r/LocalLLaMA would be interested in the build because of that stat, and also because of the inclusion of…