AI RESEARCH
Nexus: Same Pretraining Loss, Better Downstream Generalization via Common Minima
arXiv cs.LG
arXiv:2604.09258v1 Announce Type: new