Beyond Decodability: Reconstructing Language Model Representations with an Encoding Probe
arXiv:2605.00607v1 Announce Type: new
Abstract: Probing is widely used to study which features can be decoded from language model representations. However, the common decoding probe approach has two limitations that we aim to address with our new encodi…
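
For readers unfamiliar with the decoding-probe setup the abstract refers to, here is a minimal sketch of one common variant: a linear probe trained to predict a feature from fixed representations. It is not the paper's method, and the abstract is truncated before the proposed approach is described, so that approach is not illustrated. Everything here is an assumption made for illustration: the representations are synthetic Gaussian vectors standing in for language model activations, the binary feature is planted along one dimension, and the names `fit_linear_probe`, `probe_accuracy`, and `make_synthetic_data` are hypothetical.

```python
import numpy as np


def fit_linear_probe(reps, labels, l2=1e-3):
    """Fit a ridge-regularized linear decoding probe: representations -> feature."""
    # Append a bias column so the probe can learn an offset.
    X = np.hstack([reps, np.ones((reps.shape[0], 1))])
    # Map binary labels {0, 1} to regression targets {-1, +1}.
    y = 2.0 * labels - 1.0
    # Closed-form ridge solution: (X^T X + l2 * I)^{-1} X^T y.
    A = X.T @ X + l2 * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ y)


def probe_accuracy(weights, reps, labels):
    """Fraction of examples whose feature the probe decodes correctly."""
    X = np.hstack([reps, np.ones((reps.shape[0], 1))])
    preds = (X @ weights > 0).astype(int)
    return float((preds == labels).mean())


def make_synthetic_data(n=400, dim=16, seed=0):
    """Synthetic stand-in for model representations (an assumption, not real data)."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=n)
    reps = rng.normal(size=(n, dim))
    # Plant the binary feature along the first dimension.
    reps[:, 0] += 3.0 * (2 * labels - 1)
    return reps, labels


reps, labels = make_synthetic_data()
train, test = slice(0, 300), slice(300, 400)
w = fit_linear_probe(reps[train], labels[train])
acc = probe_accuracy(w, reps[test], labels[test])
print(f"held-out decoding accuracy: {acc:.2f}")
```

High held-out accuracy in a setup like this indicates only that the feature is linearly decodable from the representations; the sketch makes no claim about which limitations of decoding probes the paper has in mind.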