Efficient Fine-Tuning of Large Language Models

The field of large language models is moving toward more efficient fine-tuning methods, with a focus on reducing the computational cost and memory usage of traditional full fine-tuning. Recent work explores low-rank adaptation, parameter-efficient fine-tuning, and subspace-constrained methods to this end. These approaches deliver strong results at a fraction of the training cost and are likely to play a key role in the future development of large language models. Notable papers in this area include:

SVD-Free Low-Rank Adaptive Gradient Optimization for Large Language Models, which proposes a method for approximating SVD-based gradient projections without computing an explicit SVD.

Compressing Sine-Activated Low-Rank Adapters through Post-Training Quantization, which demonstrates the effectiveness of applying a sinusoidal transformation to low-rank adapters in the context of post-training quantization.

MAP: Revisiting Weight Decomposition for Low-Rank Adaptation, which introduces a framework that decouples weight adaptation into direction and magnitude components.

SC-LoRA: Balancing Efficient Fine-tuning and Knowledge Preservation via Subspace-Constrained LoRA, which proposes a subspace-constrained LoRA initialization framework for navigating the trade-off between efficient fine-tuning and knowledge preservation.
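
The common building block behind most of these methods is the LoRA-style low-rank update, in which a frozen pretrained weight is augmented with a trainable product of two small matrices of rank r. The sketch below is a minimal, generic PyTorch illustration of that idea; the class name, rank, scaling, and initialization are illustrative defaults rather than the recipe of any paper above (indeed, the non-zero-initialization and SC-LoRA papers specifically study alternatives to the standard initialization shown here).

```python
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update: y = base(x) + (alpha/r) * x A^T B^T."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pretrained weights stay frozen
        # A gets small random values, B starts at zero, so the adapter is a
        # no-op at initialization and only r * (d_in + d_out) parameters train.
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * (x @ self.A.T @ self.B.T)


# Example: wrap a toy projection and confirm that only the low-rank
# factors receive gradients while the frozen base weight does not.
layer = LoRALinear(nn.Linear(768, 768), r=8)
layer(torch.randn(4, 768)).sum().backward()
print(layer.base.weight.grad is None, layer.A.grad is not None)  # True True
```

Because only A and B are trained, gradients and optimizer state scale with r * (d_in + d_out) instead of d_in * d_out, which is where the memory and compute savings that motivate this line of work come from.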

Sources

SVD-Free Low-Rank Adaptive Gradient Optimization for Large Language Models

Compressing Sine-Activated Low-Rank Adapters through Post-Training Quantization

LaMM: Semi-Supervised Pre-Training of Large-Scale Materials Models

Update Your Transformer to the Latest Release: Re-Basin of Task Vectors

MAP: Revisiting Weight Decomposition for Low-Rank Adaptation

Weight Spectra Induced Efficient Model Adaptation

Beyond Zero Initialization: Investigating the Impact of Non-Zero Initialization on LoRA Fine-Tuning Dynamics

SC-LoRA: Balancing Efficient Fine-tuning and Knowledge Preservation via Subspace-Constrained LoRA
