Advancements in Domain Adaptation and Continual Learning for Large Language Models

The field of natural language processing is witnessing significant developments in domain adaptation and continual learning for large language models. Researchers are exploring approaches that adapt these models to specialized domains and tasks while preserving the semantic discrimination properties of their representations. One notable direction is the use of multi-stage frameworks that combine domain-specific masked supervision with contrastive objectives (a sketch of such a joint objective is given below). Another area of focus is the development of mechanisms for dynamic modulation of internal representations, enabling more effective and efficient adaptation to multiple tasks. There is also growing interest in collaborative frameworks in which small models assist large models, reducing computational resource consumption while maintaining comparable accuracy.

Noteworthy papers in this regard include MOSAIC, which introduces a multi-stage framework for domain adaptation of sentence embedding models, and Localist LLMs with Recruitment Learning, which presents a framework for training large language models with continuously adjustable internal representations. Contextual Attention Modulation and KORE make further contributions by proposing mechanisms for efficient multi-task adaptation and for knowledge injection in large multimodal models, respectively.
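
To make the first of these directions concrete, the following is a minimal sketch, not the MOSAIC implementation, of how a domain-specific masked-language-modeling loss can be combined with an in-batch contrastive loss over sentence embeddings. The base checkpoint (bert-base-uncased), the weighting term lambda_contrastive, and the temperature value are illustrative assumptions.

```python
# Minimal sketch of a joint masked-supervision + contrastive objective for
# adapting a sentence encoder to a new domain. Not the MOSAIC implementation;
# model name and hyperparameters are illustrative assumptions.
import torch
import torch.nn.functional as F
from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                          DataCollatorForLanguageModeling)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")   # assumed base encoder
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.train()  # keep dropout active so two forward passes give different views
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)

def joint_loss(sentences, lambda_contrastive=0.5, temperature=0.05):
    """Return MLM loss + weighted in-batch contrastive (InfoNCE-style) loss."""
    enc = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

    # 1) Domain-specific masked supervision: randomly mask tokens and predict them.
    masked = collator([{"input_ids": ids} for ids in enc["input_ids"]])
    mlm_out = model(input_ids=masked["input_ids"],
                    attention_mask=enc["attention_mask"],
                    labels=masked["labels"])
    mlm_loss = mlm_out.loss

    # 2) Contrastive objective: two dropout-noised views of each sentence act as
    #    positives; all other sentences in the batch serve as negatives.
    def embed(batch):
        out = model(**batch, output_hidden_states=True)
        hidden = out.hidden_states[-1]                       # (B, T, H)
        mask = batch["attention_mask"].unsqueeze(-1).float()
        return (hidden * mask).sum(1) / mask.sum(1)          # mean pooling

    z1 = embed(enc)
    z2 = embed(enc)
    sim = F.cosine_similarity(z1.unsqueeze(1), z2.unsqueeze(0), dim=-1) / temperature
    labels = torch.arange(sim.size(0))
    contrastive_loss = F.cross_entropy(sim, labels)

    return mlm_loss + lambda_contrastive * contrastive_loss
```

In this kind of setup, the masked term keeps the encoder grounded in domain vocabulary while the contrastive term preserves the geometry needed to discriminate between sentences; the relative weighting is the main knob to tune.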

Sources

MOSAIC: Masked Objective with Selective Adaptation for In-domain Contrastive Learning

Localist LLMs with Recruitment Learning

Contextual Attention Modulation: Towards Efficient Multi-Task Adaptation in Large Language Models

KORE: Enhancing Knowledge Injection for Large Multimodal Models via Knowledge-Oriented Augmentations and Constraints

DaMo: Data Mixing Optimizer in Fine-tuning Multimodal LLMs for Mobile Phone Agents

Adapting Multilingual Models to Code-Mixed Tasks via Model Merging

Are Greedy Task Orderings Better Than Random in Continual Linear Regression?

KCM: KAN-Based Collaboration Models Enhance Pretrained Large Models

IKnow: Instruction-Knowledge-Aware Continual Pretraining for Effective Domain Adaptation

RECALL: REpresentation-aligned Catastrophic-forgetting ALLeviation via Hierarchical Model Merging
