SynerDiff: Synergetic Continuous Batching for Fast and Parallel Diffusion Model Inference
arXiv:2605.08835v1 Announce Type: new
Abstract: The expansion of artificial intelligence-generated content services requires diffusion model serving to achieve both high throughput and low end-to-end (E2E) task latency. However, existing cont…