How does the optimizer implicitly bias the model merging loss landscape?
arXiv:2510.04686v2
Abstract: Model merging combines independent solutions with different capabilities into a single one while maintaining the same inference cost. Two popular approaches are linear interpolation, which simply ave…
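The linear interpolation named in the abstract can be illustrated with a minimal sketch. It assumes each model is stored as a dict mapping parameter names to flat lists of floats and that both models share the same architecture. The names `Params`, `interpolate`, and `alpha` are hypothetical and are not taken from the paper.

```python
from typing import Dict, List

# Hypothetical container: parameter name -> flat list of weights.
Params = Dict[str, List[float]]


def interpolate(theta_a: Params, theta_b: Params, alpha: float = 0.5) -> Params:
    """Return (1 - alpha) * theta_a + alpha * theta_b, parameter by parameter."""
    if theta_a.keys() != theta_b.keys():
        raise ValueError("models must share the same parameter names")
    merged: Params = {}
    for name in theta_a:
        a, b = theta_a[name], theta_b[name]
        if len(a) != len(b):
            raise ValueError(f"shape mismatch for {name!r}")
        # Element-wise convex combination of the two weight vectors.
        merged[name] = [(1.0 - alpha) * x + alpha * y for x, y in zip(a, b)]
    return merged


# Toy models (illustrative values only, not results from the paper).
model_a = {"w": [1.0, 2.0], "b": [0.0]}
model_b = {"w": [3.0, 6.0], "b": [1.0]}
merged = interpolate(model_a, model_b)  # alpha = 0.5 gives the plain average
```

The merged dict has the same keys and sizes as either input, so inference cost is unchanged. With `alpha = 0.5` the result is the plain average, and with `alpha = 0.0` or `alpha = 1.0` it returns a copy of one endpoint.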