cs.LG, cs.PL

Nautilus: An Auto-Scheduling Tensor Compiler for Efficient Tiled GPU Kernels

arXiv:2604.14825v1 Announce Type: cross
Abstract: We present Nautilus, a novel tensor compiler that moves toward fully automated math-to-kernel optimization. Nautilus compiles a high-level algebraic specification of tensor operators into efficient til…