I tried running AesSedai/MiMo-2.5-GGUF:Q4-K-M under llama.cpp (main tree, compiled 36 hours ago).
Hardware: NVIDIA A6000 with 48 GB VRAM + 300 GB CPU RAM.
I had no success: error loading model: missing tensor blk.0.attn_q.weight ...
Is MiMo already supported in llama.cpp?
From what I read, I assumed it runs but is not performance-tuned yet.
Any hints on what I did wrong?
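In case it helps with debugging: here is a small Python sketch of what I could run to see which architecture the quant actually declares and how the block-0 attention tensors are named inside the file. It assumes the `gguf` package that ships with llama.cpp (pip install gguf), and the file path is just a placeholder for wherever the quant was downloaded.

```python
# Sketch: inspect a GGUF file to check whether blk.0.attn_q.weight
# (or a differently named equivalent) is actually present.
# Assumes the `gguf` Python package from the llama.cpp repo.
from gguf import GGUFReader

reader = GGUFReader("MiMo-2.5-Q4_K_M.gguf")  # placeholder path

# Print the declared architecture so it can be compared against
# what the current llama.cpp build supports.
arch_field = reader.fields.get("general.architecture")
if arch_field is not None:
    # For string fields the last part holds the raw UTF-8 bytes.
    print("architecture:", bytes(arch_field.parts[-1]).decode("utf-8"))

# List the first block's attention tensors to see how they are named.
for tensor in reader.tensors:
    if tensor.name.startswith("blk.0.attn"):
        print(tensor.name, tensor.shape)
```

If blk.0.attn_q.weight really is not in the file, I guess the quant itself would be the suspect rather than my llama.cpp build.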
We started using opencoder.
Our primary model is qwen3.6-27b-q8_0 at the moment.
Since qwen3.6-122B is not coming, I wanted to test alternatives that can run on the hardware mentioned above or on a cluster of 2x Strix or 2x DGX.
MiMo 2.5 looks like it outperforms qwen3.6-27b.
Even though we get useful code from the 27B, my naive belief is that the quality of the primary model makes a big difference. That's why I am looking for the best available model for my hardware. Speed is not that important, since the tasks can run overnight.
I am curious: what are others using as a locally hosted primary model?