AI RESEARCH

Random Is Hard to Beat: Active Selection in online DPO with Modern LLMs

arXiv cs.AI • April 06, 2026

arXiv:2604.02766v1 Announce Type: cross Modern LLMs inherit strong priors from web-scale pre