cs.AI, cs.DC

Rocks, Pebbles and Sand: Modality-aware Scheduling for Multimodal Large Language Model Inference

arXiv:2603.26498v1 Announce Type: cross
Abstract: Multimodal Large Language Models (MLLMs) power platforms like ChatGPT, Gemini, and Copilot, enabling richer interactions with text, images, and videos. These heterogeneous workloads introduce additiona…