One Adapts to Any: Meta Reward Modeling for Personalized LLM Alignment

arXiv:2601.18731v2 Announce Type: replace Alignment of Large Language Models (LLMs) aims to align outputs with human preferences, and personalized alignment further adapts models to individual users. This relies on personalized reward models that capture user-specific preferences and automatically provide individualized feedback. However, developing these models faces two critical challenges: the scarcity of feedback from individual users and the need for efficient adaptation to unseen users.