Moonshot open-sourced FlashKDA: CUTLASS kernels for Kimi Delta Attention, up to 2.22x speedup over the Triton baseline on H20
github.com/MoonshotAI/FlashKDA

I've been comparing how different routing layers (OpenRouter, Together, Orq) handle K2.6 this week, and while digging around I came across FlashKDA, which Moonshot dropped alongside the K2.6 activity. Seems to be flying un…