Depth Registers Unlock W4A4 on SwiGLU: A Reader/Generator Decomposition
arXiv:2604.18128v1 Announce Type: new
Abstract: We study post-training W4A4 quantization in a controlled 300M-parameter SwiGLU decoder-only language model trained on 5B tokens of FineWeb-Edu, and ask which input-activation sites dominate the error. Na…
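To make the setting concrete, here is a minimal NumPy sketch of W4A4 fake quantization applied to a SwiGLU MLP block, with the two input-activation sites (the gate/up input and the down-projection input) marked. This is an illustrative toy, not the paper's code; the symmetric per-tensor quantizer and all function names are assumptions.

```python
import numpy as np

def fake_quant(x, bits=4):
    # Symmetric per-tensor fake quantization (an assumed baseline scheme):
    # map to a signed 4-bit grid, then dequantize back to float.
    qmax = 2 ** (bits - 1) - 1          # 7 for signed 4-bit
    scale = np.abs(x).max() / qmax + 1e-12
    return np.round(x / scale).clip(-qmax - 1, qmax) * scale

def swiglu_w4a4(x, w_gate, w_up, w_down):
    # W4A4: quantize both weights (W4) and input activations (A4)
    # at each matmul. The two activation sites are the ones the
    # abstract's error analysis is concerned with.
    xq = fake_quant(x)                          # activation site 1: gate/up input
    gate = xq @ fake_quant(w_gate)
    up = xq @ fake_quant(w_up)
    h = gate / (1.0 + np.exp(-gate)) * up       # SwiGLU: SiLU(gate) * up
    hq = fake_quant(h)                          # activation site 2: down-proj input
    return hq @ fake_quant(w_down)

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 8)).astype(np.float32)
wg = rng.standard_normal((8, 16)).astype(np.float32) * 0.1
wu = rng.standard_normal((8, 16)).astype(np.float32) * 0.1
wd = rng.standard_normal((16, 8)).astype(np.float32) * 0.1
y = swiglu_w4a4(x, wg, wu, wd)
print(y.shape)
```

Comparing `y` against the unquantized forward pass, site by site, is the kind of controlled error attribution the abstract describes.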