AI RESEARCH
Towards Disentangled Preference Optimization Dynamics Beyond Likelihood Displacement
arXiv cs.LG
arXiv:2604.18239v1. Announce type: new.
Preference optimization is widely used to align large language models (LLMs) with human preferences. However, many margin-based objectives suppress the chosen response along with the rejected one, a phenomenon known as likelihood displacement, and no general mechanism currently prevents this across objectives. We address this gap with a unified incentive-score decomposition of preference optimization, which reveals that diverse objectives share identical local update directions and differ only in their scalar weighting coefficients.
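The claim about shared update directions can be illustrated with a small sketch. It is not the paper's incentive-score decomposition, whose exact form the abstract does not give. It only applies the chain rule to a generic margin-based loss f(m) with margin m = log pi(y_w) - log pi(y_l), assuming toy two-dimensional gradient vectors. All function names, the three example losses, and the numbers are hypothetical choices made for illustration.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


# Scalar weights df/dm for three illustrative margin-based losses f(m).
def dpo_weight(m: float) -> float:
    # f(m) = -log(sigmoid(m))  =>  f'(m) = -sigmoid(-m)
    return -sigmoid(-m)


def hinge_weight(m: float, delta: float = 1.0) -> float:
    # f(m) = max(0, delta - m)  =>  f'(m) = -1 if m < delta, else 0
    return -1.0 if m < delta else 0.0


def squared_weight(m: float, target: float = 1.0) -> float:
    # f(m) = (m - target)^2  =>  f'(m) = 2 * (m - target)
    return 2.0 * (m - target)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def loss_gradient(weight, grad_chosen, grad_rejected):
    # Chain rule: d f(m) / d theta = f'(m) * (grad_chosen - grad_rejected).
    # The direction (grad_chosen - grad_rejected) does not depend on f.
    return [weight * (gc - gr) for gc, gr in zip(grad_chosen, grad_rejected)]


def chosen_logprob_change(weight, grad_chosen, grad_rejected, lr):
    # First-order change in log pi(y_w) after one gradient-descent step:
    # grad_chosen . (-lr * loss_gradient)
    step = [-lr * g for g in loss_gradient(weight, grad_chosen, grad_rejected)]
    return dot(grad_chosen, step)


# Toy gradients (assumed values): the rejected gradient is strongly
# aligned with, and larger than, the chosen gradient.
g_chosen = [1.0, 0.0]
g_rejected = [2.0, 0.0]

for name, w in [("dpo", dpo_weight(0.0)),
                ("hinge", hinge_weight(0.0)),
                ("squared", squared_weight(0.0))]:
    print(name, loss_gradient(w, g_chosen, g_rejected),
          chosen_logprob_change(w, g_chosen, g_rejected, lr=0.1))
```

In this toy setting all three gradients are scalar multiples of the same vector, and the first-order change in the chosen log-probability is negative whenever the inner product of the two gradients exceeds the squared norm of the chosen gradient. That is one simple way likelihood displacement can arise while the margin still grows.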