But what is your honest answer? Aiding LLM-judges with honest alternatives using steering vectors
arXiv:2505.17760v3 Announce Type: replace-cross
Abstract: LLM-as-a-judge is widely used as a scalable substitute for human evaluation, yet current approaches rely on black-box access and struggle to detect subtle dishonesty, such as sycophancy and man…