cs.AI, cs.CV

LayerCache: Exploiting Layer-wise Velocity Heterogeneity for Efficient Flow Matching Inference

arXiv:2604.16492v1 Announce Type: new
Abstract: Flow Matching models achieve state-of-the-art image generation quality but incur substantial inference cost due to iterative denoising through large Transformer networks. We observe that different layer …