Exponential Approximation Rates and Parameter Efficiency of Learnable Bernstein Activations
arXiv:2602.04264v2 Announce Type: replace-cross
Abstract: The choice of activation function fundamentally shapes the representational capacity and parameter efficiency of deep neural networks, yet most widely used activations lack rigorous theoretical…