cs.LG

Flow-Controlled Scheduling for LLM Inference with Provable Stability Guarantees

arXiv:2604.11001v1 Announce Type: new
Abstract: Large language models (LLMs) have been widely adopted because of their strong performance across many applications. ChatGPT and Gemini now serve hundreds of millions of active users and handle bill…