Vyom Sharma, Debajyoti Datta

RaMP: Runtime-Aware Megakernel Polymorphism for Mixture-of-Experts

Vyom Sharma, Debajyoti Datta / April 30, 2026

arXiv:2604.26039v1 Announce Type: cross
Abstract: The optimal kernel configuration for Mixture-of-Experts (MoE) inference depends on both batch size and the expert routing distribution, yet production systems dispatch from batch size alone, leaving 10…
