ClaHF: A Human Feedback-inspired Reinforcement Learning Framework for Improving Classification Tasks

arXiv:2605.17458v1 Announce Type: new Text classification models are typically trained via supervised fine-tuning (SFT). However, SFT essentially performs behavior cloning from instance-wise labels and thus fails to adequately capture relative preference relations among samples, which limits the model's ability to shape decision boundaries and calibrate predictive confidence.
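
To make the contrast between instance-wise supervision and relative preferences concrete, here is a minimal sketch. It is not the ClaHF objective, which this excerpt does not specify; a Bradley-Terry-style pairwise loss is swapped in purely as an illustration, and all function names and the example probabilities are hypothetical.

```python
import math


def cross_entropy(prob_true_class: float) -> float:
    """Instance-wise SFT loss for a single example: -log p(y | x)."""
    return -math.log(prob_true_class)


def logit(p: float) -> float:
    """Map a probability in (0, 1) to an unbounded score."""
    return math.log(p / (1.0 - p))


def pairwise_preference_loss(score_preferred: float, score_other: float) -> float:
    """Bradley-Terry-style loss: -log sigmoid(score_preferred - score_other).

    Illustrative stand-in only; not taken from the paper.
    """
    margin = score_preferred - score_other
    # Numerically stable form of log(1 + exp(-margin)).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Hypothetical model confidences in the true label for two samples:
# a clear-cut sample and an ambiguous one.
p_clear, p_ambiguous = 0.9, 0.6

# The summed SFT loss is identical if the two confidences are swapped,
# so it carries no signal about which sample should score higher.
sft_loss = cross_entropy(p_clear) + cross_entropy(p_ambiguous)
sft_loss_swapped = cross_entropy(p_ambiguous) + cross_entropy(p_clear)

# The pairwise loss does depend on the ordering of the two samples.
rank_loss_right = pairwise_preference_loss(logit(p_clear), logit(p_ambiguous))
rank_loss_wrong = pairwise_preference_loss(logit(p_ambiguous), logit(p_clear))
```

In this sketch the per-instance loss treats each sample in isolation, while the pairwise term is smaller when the clear-cut sample is scored above the ambiguous one. That ordering signal is the kind of relative preference relation the abstract says SFT fails to capture.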