[D] Hash table aspects of ReLU neural networks
If you collect the ReLU decisions into a diagonal matrix with 0 or 1 entries then a ReLU layer is DWx, where W is the weight matrix and x the input. What then is Wₙ₊₁Dₙ where Wₙ₊₁ is the matrix of weights for the next layer? It can be seen as a (locali…