AMD Hipfire – a new inference engine optimized for AMD GPUs

Came across hipfire the other day. It's a brand-new inference engine focused on all AMD GPUs (not just the latest).

GitHub.

It uses a special mq4 quantization method, and the hipfire creator is pumping out models on Hugging Face.

I don't know enough about quantization to know how good these quants are in terms of quality, but as an RDNA3 aficionado I'm happy AMD is getting some attention.
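For anyone else fuzzy on quantization: the specifics of mq4 aren't described here, but most 4-bit schemes follow the same basic idea, which is to split weights into small blocks, store one float scale per block, and round each weight to a 4-bit integer. A minimal sketch of a generic symmetric 4-bit block quantizer (purely illustrative, not hipfire's actual mq4 scheme):

```python
import numpy as np

def quantize_4bit(weights, block_size=32):
    """Generic symmetric 4-bit block quantization (not mq4 itself):
    each block shares one float scale; values map to ints in [-8, 7]."""
    w = weights.reshape(-1, block_size)
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0
    scale[scale == 0] = 1.0  # avoid division by zero for all-zero blocks
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize_4bit(q, scale, shape):
    """Reconstruct approximate float weights from ints and per-block scales."""
    return (q * scale).reshape(shape).astype(np.float32)

# Measure the reconstruction error on some random "weights"
rng = np.random.default_rng(0)
w = rng.standard_normal((4, 64)).astype(np.float32)
q, s = quantize_4bit(w)
w_hat = dequantize_4bit(q, s, w.shape)
err = np.abs(w - w_hat).mean()
print(f"mean abs quantization error: {err:.4f}")
```

The "quality" question in the post comes down to how much that reconstruction error moves the model's outputs; fancier schemes mostly differ in how they pick scales and which weights get more precision.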

Localmaxxing is a new LLM benchmarking site, and it shows some pretty dramatic speedups for hipfire inference.

Edit: I should have just said hipfire - I don't think this is officially connected to AMD.

submitted by /u/Thrumpwart
