Beyond Offline A/B Testing: Context-Aware Agent Simulation for Recommender System Evaluation

arXiv:2604.09549v1 Announce Type: cross

Recommender systems are central to online services, helping users navigate massive amounts of content across various domains. However, their evaluation remains challenging because of the disconnect between offline metrics and online performance. The emergence of Large Language Model-powered agents offers a promising solution, yet existing studies model users in isolation, neglecting contextual factors such as time, location, and needs, which fundamentally shape human decision-making. In this paper, we