cs.AI, cs.LG, stat.ML

Transformer Approximations from ReLUs

arXiv:2604.24878v1 Announce Type: new
Abstract: We provide a systematic recipe for translating ReLU approximation results to the softmax attention mechanism. This recipe covers many common approximation targets. Importantly, it yields target-specific, eco…