AI RESEARCH
How does the optimizer implicitly bias the model merging loss landscape?
arXiv cs.AI
arXiv:2510.04686v2 Announce Type: replace-cross
Model merging combines independently trained solutions with different capabilities into a single model while keeping inference cost unchanged. Two popular approaches are linear interpolation, which averages the weights of multiple models, and task arithmetic, which combines task vectors, each defined as the difference between a finetuned model's weights and the base model's weights. Although merging is useful in practice, the properties that make it effective are poorly understood. This paper explores how optimization dynamics shape the geometry of the loss landscape, and how that geometry in turn affects merging success.
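The two merging approaches named in the abstract can be illustrated with a minimal sketch. This is not the paper's code: the function names `linear_interpolation` and `task_arithmetic`, the `scale` parameter, and the toy two-parameter "models" (dicts mapping a parameter name to a list of floats) are all assumptions made for illustration.

```python
# Toy sketch of two merging rules. A "model" here is a hypothetical
# dict mapping parameter names to flat lists of floats.

def linear_interpolation(models, coeffs=None):
    """Weighted average of model weights (uniform if coeffs is None)."""
    if coeffs is None:
        coeffs = [1.0 / len(models)] * len(models)
    return {
        name: [
            sum(c * m[name][i] for c, m in zip(coeffs, models))
            for i in range(len(models[0][name]))
        ]
        for name in models[0]
    }


def task_arithmetic(base, finetuned, scale=1.0):
    """Add the scaled sum of task vectors (finetuned - base) to the base."""
    merged = {}
    for name, base_w in base.items():
        merged[name] = [
            b + scale * sum(ft[name][i] - b for ft in finetuned)
            for i, b in enumerate(base_w)
        ]
    return merged


# Illustrative weights only; real models would hold tensors.
base = {"w": [1.0, 2.0]}
model_a = {"w": [2.0, 2.0]}  # finetuned on a hypothetical task A
model_b = {"w": [1.0, 4.0]}  # finetuned on a hypothetical task B

averaged = linear_interpolation([model_a, model_b])
combined = task_arithmetic(base, [model_a, model_b], scale=1.0)
```

In this toy case the two rules give different results: averaging lands midway between the finetuned models, while task arithmetic with `scale=1.0` adds both task-specific displacements to the base weights.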