AI RESEARCH
But what is your honest answer? Aiding LLM-judges with honest alternatives using steering vectors
arXiv cs.AI
arXiv:2505.17760v3 Announce Type: replace-cross LLM-as-a-judge is widely used as a scalable substitute for human evaluation, yet current approaches rely on black-box access and struggle to detect subtle dishonesty, such as sycophancy and manipulation. We