Models Know Their Shortcuts: Deployment-Time Shortcut Mitigation
arXiv:2604.12277v1 Announce Type: new
Abstract: Pretrained language models often rely on superficial features that appear predictive during training yet fail to generalize at test time, a phenomenon known as shortcut learning. Existing mitigation meth…