AI RESEARCH
YFPO: A Preliminary Study of Yoked Feature Preference Optimization with Neuron-Guided Rewards for Mathematical Reasoning
arXiv cs.CL
arXiv:2605.11906v1 Announce Type: new Preference optimization has become an important post-