Guandong Li - Provide.ai

LayerCache: Exploiting Layer-wise Velocity Heterogeneity for Efficient Flow Matching Inference

Guandong Li / April 21, 2026

arXiv:2604.16492v1
Abstract: Flow Matching models achieve state-of-the-art image generation quality but incur substantial inference cost due to iterative denoising through large Transformer networks. We observe that different layer …
