LayerCache: Exploiting Layer-wise Velocity Heterogeneity for Efficient Flow Matching Inference
arXiv:2604.16492v1 Announce Type: new
Abstract: Flow Matching models achieve state-of-the-art image generation quality but incur substantial inference cost due to iterative denoising through large Transformer networks. We observe that different layer …