Personalized Group Relative Policy Optimization for Heterogenous Preference Alignment

Apple Machine Learning Research

Despite their sophisticated general-purpose capabilities, Large Language Models (LLMs) often fail to align with diverse individual preferences because standard post-