Efficient Fine-Tuning of Large Language Models

The field of Large Language Models (LLMs) is moving toward more efficient fine-tuning methods that reduce computational cost while preserving model performance. Recent developments center on Low-Rank Adaptation (LoRA), which freezes the pretrained weight matrix W and trains only a low-rank update ΔW = BA, adapting LLMs to specific downstream tasks while updating only a small fraction of the parameters. Noteworthy papers in this area include:

  • The paper proposing EffiLoRA, which shares a single A matrix across all transformer layers and selectively updates the per-layer B matrices at runtime, trading off system resource budget against model performance (see the first sketch after this list).
  • The paper proposing SmartFed, which reuses knowledge embedded in existing LoRA modules and introduces a Mixture of Rank-Wise Experts (MoRE) that selectively activates and combines experts based on input semantics and resource budgets (see the second sketch after this list).
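
To make the shared-A idea from the first bullet concrete, here is a minimal PyTorch sketch, assuming one A matrix shared by every layer, a per-layer B matrix, and a simple boolean flag standing in for the runtime B-update selection. The class name `SharedALoRALinear` and all sizes are hypothetical illustrations, not taken from the paper.

```python
import torch
import torch.nn as nn

class SharedALoRALinear(nn.Module):
    """Frozen linear layer plus a LoRA update whose A matrix is shared
    across layers; only the B matrix is layer-specific. (Illustrative
    sketch of the shared-A idea, not the paper's implementation.)"""

    def __init__(self, base: nn.Linear, shared_A: nn.Parameter,
                 rank: int, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False              # freeze pretrained weights
        self.A = shared_A                        # (rank, in_features), shared across layers
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # per-layer, zero-init
        self.scaling = alpha / rank
        self.update_B = True                     # hypothetical runtime flag for selective B updates

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Detaching B stops its gradient, mimicking skipping its update this step.
        B = self.B if self.update_B else self.B.detach()
        # y = base(x) + scaling * x A^T B^T
        return self.base(x) + self.scaling * (x @ self.A.t() @ B.t())

# Usage: one shared A for all layers, one B per layer.
rank, d = 8, 64
shared_A = nn.Parameter(torch.randn(rank, d) * 0.01)
layers = [SharedALoRALinear(nn.Linear(d, d), shared_A, rank) for _ in range(4)]
layers[2].update_B = False  # e.g. under a tight budget, skip this layer's B this step
```

Zero-initializing B keeps the adapted model identical to the base model at the start of training, which is standard LoRA practice.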
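The MoRE bullet can likewise be illustrated with a hedged sketch in which each rank-r slice of a LoRA pair acts as one expert and an input-dependent gate activates only the top-k slices. `RankWiseExpertLoRA`, the mean-pooled gating, and all dimensions below are assumptions for illustration, not the paper's design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RankWiseExpertLoRA(nn.Module):
    """Illustrative mixture over rank-wise LoRA experts: each expert is a
    rank-r slice of (A, B); a gate over the input keeps the top-k slices.
    (Sketch of the general idea only, not SmartFed's implementation.)"""

    def __init__(self, in_f: int, out_f: int,
                 num_experts: int = 4, r: int = 2, k: int = 2):
        super().__init__()
        self.num_experts, self.r, self.k = num_experts, r, k
        self.A = nn.Parameter(torch.randn(num_experts * r, in_f) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_f, num_experts * r))
        self.gate = nn.Linear(in_f, num_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Gate on the mean-pooled sequence; k acts as the resource-budget knob.
        scores = self.gate(x.mean(dim=1))              # (batch, num_experts)
        weights = torch.zeros_like(scores)
        topv, topi = scores.topk(self.k, dim=-1)
        weights.scatter_(-1, topi, F.softmax(topv, dim=-1))
        # Broadcast each expert's weight over its rank-r slice, then apply LoRA.
        w = weights.repeat_interleave(self.r, dim=-1)  # (batch, num_experts * r)
        h = x @ self.A.t()                             # (batch, seq, num_experts * r)
        return (h * w.unsqueeze(1)) @ self.B.t()       # (batch, seq, out_f)

x = torch.randn(2, 10, 64)                  # (batch, seq, dim)
delta = RankWiseExpertLoRA(64, 128)(x)      # low-rank update, added to the base layer's output
```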

Sources

  • Less is More: Resource-Efficient Low-Rank Adaptation
  • Elastic Mixture of Rank-Wise Experts for Knowledge Reuse in Federated Fine-Tuning
  • Dual LoRA: Enhancing LoRA with Magnitude and Direction Updates
  • Parameter-Efficient Augment Plugin for Class-Incremental Learning
  • How (Mis)calibrated is Your Federated CLIP and What To Do About It?
