Authors: Ayoub Hammal, Pierre Zweigenbaum, Caio Corro

On the Rejection Criterion for Proxy-based Test-time Alignment

Ayoub Hammal, Pierre Zweigenbaum, Caio Corro / April 21, 2026

arXiv:2604.16146v2 Announce Type: replace
Abstract: Recent work has proposed test-time alignment methods that rely on a small aligned model as a proxy that guides the generation of a larger base (unaligned) model. The implicit reward approach skews the l…

cs.CL

On the Rejection Criterion for Proxy-based Test-time Alignment

Ayoub Hammal, Pierre Zweigenbaum, Caio Corro / April 20, 2026

arXiv:2604.16146v1 Announce Type: new
Abstract: Recent work has proposed test-time alignment methods that rely on a small aligned model as a proxy that guides the generation of a larger base (unaligned) model. The implicit reward approach skews the large…