Llama-server: is it bleeding to CPU/RAM?
Is there an easy way to tell whether a model is also using CPU/RAM, and not only GPU/VRAM? I think the standard verbose output, which has gotten shorter, says nothing about this, but I may be missing something.

Submitted by /u/jopereira