Leon Eshuijs, Archie Chaudhury, Alan McBeth, Ethan Nguyen

But what is your honest answer? Aiding LLM-judges with honest alternatives using steering vectors

April 2, 2026

arXiv:2505.17760v3
Abstract: LLM-as-a-judge is widely used as a scalable substitute for human evaluation, yet current approaches rely on black-box access and struggle to detect subtle dishonesty, such as sycophancy and man…
