Hey everyone, I just got into local LLMs about a week ago. I tried Ollama and LM Studio on my Core Ultra 9 288V, but they kept failing or hard-stopping on MoE models, so I figured I'd just try building the environment myself. I couldn't get OpenVINO to play nice with the NPU for these larger models yet, so I compiled a custom Vulkan bridge for the GPU instead. It seems to be working?

Performance Stats:
I also tried the 31B-it-i1-Q4_K_M.gguf version. It's a bit heavier but still totally usable:
Is this a normal result for integrated graphics? At first I only got it working on the CPU, which was faster but unsustainable; once the Vulkan bridge was built, performance became much more balanced. I'm using CachyOS if that makes a difference. Just wanted to see if I'm missing something or if Intel Lunar Lake is actually this cracked for local MoE.
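For anyone on similar hardware who wants to try this without rolling a custom bridge: upstream llama.cpp already ships a Vulkan backend, so a stock build might get you most of the way there (the model path below is a placeholder, and I'm not claiming this is identical to my setup):

```shell
# Build llama.cpp with the Vulkan backend enabled
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build -DGGML_VULKAN=ON
cmake --build build --config Release -j

# Sanity-check that the integrated GPU is visible to Vulkan
vulkaninfo --summary

# Run a GGUF model, offloading all layers to the GPU (-ngl 99)
./build/bin/llama-cli -m /path/to/model-Q4_K_M.gguf -ngl 99 -p "Hello"
```

The `-ngl` flag controls how many layers are offloaded to the GPU; on an iGPU with shared memory you can usually offload everything, but dropping it lower is worth testing if you hit memory pressure.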