Advances in Model Merging and Fine-Tuning

The field of model merging and fine-tuning is advancing rapidly, with a focus on more efficient and effective ways to combine multiple models and adapt them to new tasks. Recent research has emphasized the theoretical foundations of model merging, in particular the connection between task vectors and gradients: a task vector, the difference between fine-tuned and pre-trained weights, summarizes the effect of fine-tuning and can be added or subtracted to steer model behavior. There is also growing interest in methods that preserve domain generalization when adapting to new tasks, while minimizing the need for extensive re-training. Noteworthy papers include On Task Vectors and Gradients, which provides a rigorous theoretical foundation for task arithmetic; Competition and Attraction Improve Model Fusion, which proposes a novel evolutionary algorithm for model merging; GEM, which introduces a parameter scale-aware and distribution-sensitive sparse fine-tuning framework; and Preserving Domain Generalization in Fine-Tuning via Joint Parameter Selection, which restricts updates to a small, sparse subset of parameters so that the generalization strength of the pre-trained model is retained.
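The task-arithmetic idea underlying this line of work can be illustrated with a minimal sketch. The toy parameter dictionaries, function names, and merging coefficients below are illustrative assumptions, not taken from any of the cited papers; real implementations operate on full model state dicts.

```python
# Minimal sketch of task arithmetic on toy parameter dictionaries.
# All parameter names, values, and coefficients are illustrative.

def task_vector(pretrained, finetuned):
    """Task vector: element-wise difference (fine-tuned - pre-trained)."""
    return {k: finetuned[k] - pretrained[k] for k in pretrained}

def merge(pretrained, task_vectors, coeffs):
    """Add scaled task vectors onto the pre-trained weights."""
    merged = dict(pretrained)
    for tv, lam in zip(task_vectors, coeffs):
        for k, v in tv.items():
            merged[k] += lam * v
    return merged

pre  = {"w": 1.0, "b": 0.0}   # shared pre-trained weights
ft_a = {"w": 1.5, "b": 0.2}   # fine-tuned on task A
ft_b = {"w": 0.8, "b": -0.1}  # fine-tuned on task B

tvs = [task_vector(pre, ft_a), task_vector(pre, ft_b)]
merged = merge(pre, tvs, coeffs=[0.5, 0.5])
```

Choosing the coefficients well is exactly where the merging literature diverges: uniform averaging is the simplest baseline, while evolutionary or swarm-based methods (as in the fusion and PSO-Merging papers listed below) search for better combinations.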
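The sparse fine-tuning idea — restricting updates to a small subset of parameters — can likewise be sketched with a fixed mask. The selection rule here (top-k gradient magnitude) and all names are illustrative assumptions; the cited papers use more elaborate, scale- and distribution-aware selection criteria.

```python
# Illustrative sketch of sparse fine-tuning: only a selected subset of
# parameters receives gradient updates. The top-k-by-gradient rule is a
# stand-in for the selection criteria in the cited papers.

def select_mask(params, grads, k):
    """Keep the k parameters with the largest gradient magnitude."""
    top = sorted(grads, key=lambda n: abs(grads[n]), reverse=True)[:k]
    return {n: (n in top) for n in params}

def sparse_step(params, grads, mask, lr=0.1):
    """SGD step applied only where the mask is True; others stay frozen."""
    return {n: params[n] - lr * grads[n] if mask[n] else params[n]
            for n in params}

params = {"w1": 1.0, "w2": 2.0, "w3": 3.0}
grads  = {"w1": 0.5, "w2": -2.0, "w3": 0.1}

mask = select_mask(params, grads, k=1)   # only "w2" is selected
updated = sparse_step(params, grads, mask)
```

Because the unselected parameters keep their pre-trained values exactly, the generalization behavior encoded in them is untouched — the intuition behind preserving domain generalization via joint parameter selection.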

Sources

On Task Vectors and Gradients

Competition and Attraction Improve Model Fusion

GEM: A Scale-Aware and Distribution-Sensitive Sparse Fine-Tuning Framework for Effective Downstream Adaptation

Preserving Domain Generalization in Fine-Tuning via Joint Parameter Selection

Not Just for Archiving: Provable Benefits of Reusing the Archive in Evolutionary Multi-objective Optimization

Efficient Multi-Source Knowledge Transfer by Model Merging

PSO-Merging: Merging Models Based on Particle Swarm Optimization

Self-Supervised Pre-Training with Equilibrium Constraints

Supervised Stochastic Gradient Algorithms for Multi-Trial Source Separation
