cs.CV, cs.DC, cs.LG

LP-GEMM: Integrating Layout Propagation into GEMM Operations

arXiv:2604.04599v1 Announce Type: cross
Abstract: In scientific computing and modern machine learning (ML) workloads, sequences of dependent general matrix multiplications (GEMMs) often dominate execution time. While state-of-the-art BLAS libraries ag…