AI RESEARCH
SocialReasoning-Bench: Measuring whether AI agents act in users’ best interests
Microsoft Research Blog
Using SocialReasoning-Bench, we observed a stable pattern across models: agents execute tasks competently but fail to consistently improve the user’s position, even when explicitly instructed to optimize for the user’s interest.