Not sure if others have updated yet, but I tried the MPT version of llama.cpp and it works pretty well. I'm on a shitty AMD iGPU with 64 GB of unified memory, and it's pretty fast. I'd say replies come about as fast as they do with Qwen 3.5 9B at Q4_K_M. This is pretty cool.