Llama-server: is it bleeding to CPU/RAM?

By /u/jopereira / May 18, 2026

Is there an easy way to tell whether a model is spilling over to CPU/RAM instead of running entirely on GPU/VRAM?
(I don't think the standard verbose output, which has gotten shorter, says anything about this, but I may be missing something.)
