cs.AR, cs.LG

Hardware-Efficient Softmax and Layer Normalization with Guaranteed Normalization for Edge Devices

arXiv:2604.23647v1 Announce Type: cross
Abstract: In Transformer models, non-GEMM (non-General Matrix Multiplication) operations, especially Softmax and Layer Normalization (LayerNorm), often dominate hardware cost due to their nonlinear nature. T…