Continual Learning and Multi-Task Expert Models

The field of continual learning and multi-task learning is moving toward more efficient and effective expert-based models. Researchers are exploring novel architectures and techniques to mitigate knowledge interference, improve expert specialization, and enhance overall performance. One key direction is the integration of mixture-of-experts (MoE) frameworks with sparse and adaptive mechanisms, allowing more flexible and dynamic expert selection; a minimal routing sketch follows the paper summaries below. Another focus is building robust, generalizable models that can learn from limited data without catastrophic forgetting.

Noteworthy papers include:

One-Prompt Strikes Back proposes a sparse MoE architecture for prompt-based continual learning, achieving state-of-the-art performance while reducing parameter counts and computational costs.

LEAF is a robust expert-based framework for few-shot continual event detection that combines a specialized MoE architecture with a semantic-aware expert selection mechanism, also reaching state-of-the-art results.

Adaptive Shared Experts proposes a LoRA-based MoE framework for multi-task learning that eases the transition from single-task to multi-task learning and strengthens expert specialization and cooperation.

Dirichlet-Prior Shaping introduces a router regularization technique that guides expert specialization in upcycled MoEs, offering fine-grained control over expert balance and specialization.
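
To make the routing idea concrete, here is a minimal sketch, assuming a PyTorch setting, of a sparse top-k MoE layer with a Dirichlet-style prior penalty on the router's average expert usage. The names SparseMoE, dirichlet_prior_penalty, top_k, and alpha are illustrative assumptions, not the implementation of any paper above; the penalty only conveys the general idea of shaping the router distribution with a prior.

```python
# Illustrative sketch only: names and hyperparameters are assumptions,
# not taken from the papers summarized above.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SparseMoE(nn.Module):
    """Sparse mixture of experts: each token is routed to its top-k experts."""

    def __init__(self, dim: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(dim, num_experts)  # gating network
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor):
        # x: (batch, dim) -> per-token routing distribution over experts
        probs = F.softmax(self.router(x), dim=-1)           # (batch, num_experts)
        weights, indices = probs.topk(self.top_k, dim=-1)   # sparse selection
        weights = weights / weights.sum(dim=-1, keepdim=True)

        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[:, slot] == e                 # tokens sent to expert e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
        return out, probs


def dirichlet_prior_penalty(probs: torch.Tensor, alpha: float = 2.0) -> torch.Tensor:
    """Negative Dirichlet log-prior on the batch-averaged expert usage.

    A symmetric concentration alpha > 1 pulls usage toward balance;
    alpha < 1 pushes toward sparser, more specialized experts.
    """
    usage = probs.mean(dim=0).clamp_min(1e-8)  # average routing distribution
    return -((alpha - 1.0) * usage.log()).sum()


# Example: add the router prior to a (dummy) task loss during training.
moe = SparseMoE(dim=64)
x = torch.randn(16, 64)
y, probs = moe(x)
loss = y.pow(2).mean() + 0.01 * dirichlet_prior_penalty(probs)
loss.backward()
```

In a training loop, the penalty is weighted against the task loss; choosing alpha above 1 nudges the router toward balanced expert usage, while values below 1 encourage sparser, more specialized routing.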

Sources

One-Prompt Strikes Back: Sparse Mixture of Experts for Prompt-based Continual Learning

LEAF: A Robust Expert-Based Framework for Few-Shot Continual Event Detection

Adaptive Shared Experts with LoRA-Based Mixture of Experts for Multi-Task Learning

Dirichlet-Prior Shaping: Guiding Expert Specialization in Upcycled MoEs
