Efficient Adaptation of Large Language Models

The field of large language models is moving toward more efficient adaptation techniques that reduce computational cost and memory usage while maintaining performance. Researchers are exploring methods such as orthogonal fine-tuning, dual sparsity, structured pruning, and progressive fine-tuning frameworks. These approaches aim to overcome the limitations of traditional fine-tuning, which is resource-intensive and prone to catastrophic forgetting. Notable papers in this area include Orthogonal Finetuning Made Scalable, which reduces the computational and memory costs of orthogonal fine-tuning; DuoGPT, which introduces a training-free framework unifying weight and activation sparsity through activation-aware pruning; and GPTailor, which prunes models by cutting and stitching layers. In addition, Progtuning and Pay Attention to Small Weights present fine-tuning frameworks that allocate update budgets progressively and restrict updates to a small subset of parameters, improving both performance and efficiency.
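Orthogonal fine-tuning adapts a model by rotating its frozen pretrained weights with a learned orthogonal matrix, which preserves the weights' spectral structure and helps resist catastrophic forgetting. The exact parametrization used in Orthogonal Finetuning Made Scalable is not described in this summary; the sketch below illustrates the general idea with a Cayley transform in numpy, and all names and dimensions are illustrative assumptions:

```python
import numpy as np

def cayley(a: np.ndarray) -> np.ndarray:
    """Map a square parameter matrix to an orthogonal matrix via the Cayley transform."""
    skew = a - a.T                      # skew-symmetric part: skew.T == -skew
    eye = np.eye(a.shape[0])
    # R = (I + S)^{-1} (I - S) is orthogonal whenever S is skew-symmetric
    return np.linalg.solve(eye + skew, eye - skew)

rng = np.random.default_rng(0)
d = 8
W = rng.standard_normal((d, d))          # frozen pretrained weight matrix
A = 0.01 * rng.standard_normal((d, d))   # small trainable parameter matrix
R = cayley(A)                            # orthogonal factor learned during fine-tuning
W_adapted = R @ W                        # rotated weight actually used by the model
```

Because `R` is orthogonal, `W_adapted` has exactly the same singular values as `W`, so the adaptation changes the directions the layer uses without rescaling them.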
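DuoGPT's training-free pruning is activation-aware: rather than dropping weights by magnitude alone, it accounts for how strongly each input feature actually fires on calibration data. The summary does not give DuoGPT's precise scoring rule, so the sketch below uses a generic activation-aware score, |w_ij| scaled by the calibration norm of input feature j, purely as an illustration; the function name and shapes are assumptions:

```python
import numpy as np

def activation_aware_prune(w: np.ndarray, x: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the lowest-scoring weights, scoring each entry by |w_ij| * ||x_j||."""
    col_norms = np.linalg.norm(x, axis=0)         # per-input-feature activation norm
    scores = np.abs(w) * col_norms[None, :]
    k = int(round(w.size * sparsity))             # number of weights to remove
    threshold = np.partition(scores.ravel(), k)[k]
    return np.where(scores >= threshold, w, 0.0)  # keep high-scoring weights, zero the rest

rng = np.random.default_rng(1)
w = rng.standard_normal((8, 16))    # weight matrix of one linear layer
x = rng.standard_normal((32, 16))   # calibration activations feeding that layer
w_sparse = activation_aware_prune(w, x, sparsity=0.5)
```

The point of the activation term is that a moderately large weight attached to a rarely active input can matter less than a small weight on a highly active one.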
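Restricting updates to a small subset of parameters, as the title Pay Attention to Small Weights suggests, can be sketched as masking the gradient so that only low-magnitude weights move during fine-tuning. This is a minimal illustration of that general idea, not the paper's actual method; the masking fraction, learning rate, and function name are assumptions:

```python
import numpy as np

def small_weight_mask(w: np.ndarray, fraction: float) -> np.ndarray:
    """Boolean mask over the `fraction` of entries with the smallest magnitude."""
    k = int(round(w.size * fraction))
    threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.abs(w) <= threshold

rng = np.random.default_rng(2)
w = rng.standard_normal((8, 16))      # pretrained weights
grad = rng.standard_normal((8, 16))   # gradient from one fine-tuning batch
mask = small_weight_mask(w, fraction=0.1)
w_new = w - 0.1 * grad * mask         # only the smallest 10% of weights are updated
```

Because roughly 90% of the entries never receive an update, the optimizer state and the delta that must be stored per task shrink accordingly.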

Sources

Orthogonal Finetuning Made Scalable

DuoGPT: Training-free Dual Sparsity through Activation-aware Pruning in LLMs

GPTailor: Large Language Model Pruning Through Layer Cutting and Stitching

Progtuning: Progressive Fine-tuning Framework for Transformer-based Language Models

Pay Attention to Small Weights
