AI RESEARCH

YFPO: A Preliminary Study of Yoked Feature Preference Optimization with Neuron-Guided Rewards for Mathematical Reasoning

arXiv cs.CL

arXiv:2605.11906v1 Announce Type: new Preference optimization has become an important post-