Advances in Efficient Fine-Tuning of Large Language Models

The field of large language models is shifting toward more efficient fine-tuning methods that reduce computational cost while maintaining or improving performance. Recent work proposes novel frameworks, including self-learning approaches and low-rank adaptation (LoRA) methods, that adapt language models to specific domains and tasks more effectively. These methods have shown promising results across applications such as natural language processing and time series forecasting. Researchers are also exploring ways to overcome the expressiveness bottleneck in multi-task forecasting and to fine-tune language models efficiently across multiple datasets. Other work investigates how data mixing affects knowledge acquisition and why resolving knowledge conflicts matters in domain-specific data selection. Overall, the field favors fine-tuning methods that are at once more efficient, effective, and scalable. Noteworthy papers include:

  • SLearnLLM, which proposes a self-learning framework for efficient domain-specific adaptation of large language models.
  • MoRE, which introduces a novel mixture of low-rank experts for adaptive multi-task learning.
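To make the mixture-of-low-rank-experts idea concrete, here is a minimal sketch of a gated LoRA layer: a frozen base weight plus several rank-r updates, combined per input by a softmax gate. All names and shapes are illustrative assumptions for exposition, not MoRE's actual architecture or implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, n_experts = 16, 2, 3  # hidden size, LoRA rank, number of experts (illustrative)

W = rng.normal(size=(d, d))                     # frozen base weight
A = rng.normal(size=(n_experts, r, d)) * 0.01   # low-rank "down" factors, trainable
B = np.zeros((n_experts, d, r))                 # "up" factors, zero-initialized (LoRA convention)
W_gate = rng.normal(size=(d, n_experts)) * 0.01 # gating weights, trainable

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def more_layer(x):
    """y = x W^T + sum_i g_i(x) * x (B_i A_i)^T : base output plus gated rank-r updates."""
    gates = softmax(x @ W_gate)                 # (batch, n_experts) mixing weights
    y = x @ W.T
    for i in range(n_experts):
        delta = (x @ A[i].T) @ B[i].T           # rank-r update from expert i
        y += gates[:, i:i + 1] * delta
    return y

x = rng.normal(size=(4, d))
y = more_layer(x)
# Because B is zero-initialized, the adapters start as a no-op: y equals x @ W.T
```

Zero-initializing the "up" factors keeps the adapted model identical to the base model at the start of fine-tuning; only the low-rank factors and the gate are trained, so the added parameter count scales with r rather than d.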

Sources

SLearnLLM: A Self-Learning Framework for Efficient Domain-Specific Adaptation of Large Language Models

Mixture of Low Rank Adaptation with Partial Parameter Sharing for Time Series Forecasting

Data Mixing Can Induce Phase Transitions in Knowledge Acquisition

Rethinking Data Mixture for Large Language Models: A Comprehensive Survey and New Perspectives

Efficient Ensemble for Fine-tuning Language Models on Multiple Datasets

Resolving Knowledge Conflicts in Domain-specific Data Selection: A Case Study on Medical Instruction-tuning

Stratified Selective Sampling for Instruction Tuning with Dedicated Scoring Strategy

MoRE: A Mixture of Low-Rank Experts for Adaptive Multi-Task Learning

Unraveling LoRA Interference: Orthogonal Subspaces for Robust Model Merging

Decom-Renorm-Merge: Model Merging on the Right Space Improves Multitasking

Two Is Better Than One: Rotations Scale LoRAs
