Advances in Parameter-Efficient Fine-Tuning for Large Language Models

The field of natural language processing is seeing rapid progress in parameter-efficient fine-tuning (PEFT) methods for large language models, which adapt pretrained models to downstream tasks without updating all of their parameters. A key direction is low-rank adaptation, which reduces computational and memory costs while largely preserving task performance. Another notable trend is orthogonal fine-tuning, which offers improved generalization properties at the cost of higher time and memory overhead. Researchers are also applying matrix approximation techniques, such as hierarchically semi-separable (HSS) matrix approximation, to improve the efficiency of these methods, and there is growing interest in continual learning and multilingual model compression, both of which support deploying large language models in real-world applications.

Noteworthy papers in this area include Memory-Efficient Orthogonal Fine-Tuning with Principal Subspace Adaptation, which proposes a novel approach to reduce the memory footprint of orthogonal fine-tuning; LoRASuite, which introduces a modular approach for efficient LoRA adaptation across large language model upgrades; and Quasi-optimal hierarchically semi-separable matrix approximation, which contributes a randomized algorithm for producing a quasi-optimal HSS matrix approximation.
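To make the low-rank idea concrete, here is a minimal sketch (illustrative only, not any listed paper's implementation): rather than fine-tuning a full weight matrix W, one trains two thin factors B and A whose product has rank at most r, so the effective weight is W + (alpha / r) * B @ A. All names below are hypothetical.

```python
# Minimal sketch of a LoRA-style low-rank update (illustrative only).
# Instead of fine-tuning a full d_out x d_in matrix W, train two thin
# factors B (d_out x r) and A (r x d_in); the trainable update B @ A
# has rank <= r and adds only r * (d_in + d_out) parameters.

def matmul(X, Y):
    """Plain-Python matrix product of X (m x k) and Y (k x n)."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def lora_effective_weight(W, A, B, alpha):
    """Return W + (alpha / r) * B @ A, with the rank r inferred from A."""
    r = len(A)
    scale = alpha / r
    BA = matmul(B, A)
    return [[W[i][j] + scale * BA[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

# Example: d = 4, rank r = 1 -- only 2 * 4 = 8 extra parameters are
# trained instead of the 16 in the full matrix.
W = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]  # frozen base
A = [[1.0, 0.0, 0.0, 0.0]]          # r x d_in
B = [[0.0], [2.0], [0.0], [0.0]]    # d_out x r
W_eff = lora_effective_weight(W, A, B, alpha=1.0)
```

At inference time the product B @ A can be merged into W once, so LoRA-style adapters add no latency after training.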
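The orthogonal-update idea can be sketched similarly (a minimal illustration, not the exact parameterization of HOFT or any other listed paper): a Householder reflection H = I - 2vv^T / (v^T v) is orthogonal, so multiplying a frozen weight matrix by H rotates or reflects it without distorting norms, the property orthogonal fine-tuning methods exploit. The helper names below are hypothetical.

```python
# Minimal sketch of an orthogonal (Householder-style) update
# (illustrative only). H = I - 2 v v^T / (v^T v) is orthogonal, so
# H @ W preserves the norms of W's columns while learning only the
# n parameters of v rather than a full n x n matrix.

def householder(v):
    """Return the Householder matrix I - 2 v v^T / (v^T v)."""
    n = len(v)
    vv = sum(x * x for x in v)
    return [[(1.0 if i == j else 0.0) - 2.0 * v[i] * v[j] / vv
             for j in range(n)] for i in range(n)]

def apply_orthogonal(H, W):
    """Left-multiply a frozen weight matrix W by the orthogonal factor H."""
    return [[sum(H[i][t] * W[t][j] for t in range(len(W)))
             for j in range(len(W[0]))] for i in range(len(H))]

v = [1.0, 1.0, 0.0]
H = householder(v)
# A Householder reflection is its own inverse, so H @ H recovers the
# identity -- a quick check that H is indeed orthogonal.
HH = apply_orthogonal(H, H)
```

Methods in this family compose several such reflections to enrich the learned transformation while keeping it exactly orthogonal.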

Sources

Memory-Efficient Orthogonal Fine-Tuning with Principal Subspace Adaptation

LoRASuite: Efficient LoRA Adaptation Across Large Language Model Upgrades

Improved Methods for Model Pruning and Knowledge Distillation

ABBA: Highly Expressive Hadamard Product Adaptation for Large Language Models

Dual Decomposition of Weights and Singular Value Low Rank Adaptation

OSoRA: Output-Dimension and Singular-Value Initialized Low-Rank Adaptation

Enhancing Learned Knowledge in LoRA Adapters Through Efficient Contrastive Decoding on Ascend NPUs

Saten: Sparse Augmented Tensor Networks for Post-Training Compression of Large Language Models

Gated Integration of Low-Rank Adaptation for Continual Learning of Language Models

CoLA: Collaborative Low-Rank Adaptation

HOFT: Householder Orthogonal Fine-tuning

Quasi-optimal hierarchically semi-separable matrix approximation

On Multilingual Encoder Language Model Compression for Low-Resource Languages
