Advances in Multimodal and Continual Learning

The field of artificial intelligence is seeing notable progress in multimodal and continual learning, with researchers integrating modalities such as vision, language, and human motion to improve performance and generalization. One prominent direction is routing strategies that dynamically allocate experts or parameters based on the input prompt or task, which mitigates catastrophic forgetting and eases adaptation to new domains. Another is the design of position encoding frameworks that preserve the inherent structure of each modality, improving vision-language models. A third is knowledge-guided prompt learning, which draws on structured knowledge bases to enrich semantic representations and support reasoning in cross-domain recommendation.

Noteworthy papers include Soft Task-Aware Routing of Experts for Equivariant Representation Learning, which treats projection heads as experts and routes among them, and KGBridge, a knowledge-guided prompt learning framework for non-overlapping cross-domain recommendation. GNN-MoE and RoME likewise show that graph-based contextual patch routing and domain-robust mixture-of-experts designs can reach state-of-the-art results in domain generalization and MILP solution prediction, respectively.
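
To make the routing idea concrete, the sketch below shows a generic soft mixture over expert projection heads, gated by the input features and a learned task embedding. This is only an illustrative PyTorch example, not the method of any paper listed under Sources; the module names, dimensions, and gating design are assumptions.

```python
# Generic sketch of soft, task-aware routing over expert projection heads.
# Illustrative only: the architecture and hyperparameters are assumptions,
# not a reproduction of any cited paper's method.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SoftExpertRouter(nn.Module):
    """Mixes several expert projection heads with input- and task-conditioned weights."""

    def __init__(self, feat_dim: int, proj_dim: int, num_experts: int, num_tasks: int):
        super().__init__()
        # One small MLP projection head per expert (hypothetical design choice).
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(feat_dim, feat_dim), nn.ReLU(), nn.Linear(feat_dim, proj_dim))
            for _ in range(num_experts)
        ])
        # Learned task embeddings supply the "task-aware" part of the routing signal.
        self.task_embed = nn.Embedding(num_tasks, feat_dim)
        # The gate scores each expert from the concatenated features and task embedding.
        self.gate = nn.Linear(2 * feat_dim, num_experts)

    def forward(self, features: torch.Tensor, task_id: torch.Tensor) -> torch.Tensor:
        # features: (batch, feat_dim); task_id: (batch,) integer task indices
        routing_input = torch.cat([features, self.task_embed(task_id)], dim=-1)
        weights = F.softmax(self.gate(routing_input), dim=-1)                   # (batch, E)
        expert_outs = torch.stack([e(features) for e in self.experts], dim=1)   # (batch, E, proj_dim)
        # Soft mixture: weighted sum of expert outputs rather than a hard top-1 choice.
        return (weights.unsqueeze(-1) * expert_outs).sum(dim=1)


if __name__ == "__main__":
    router = SoftExpertRouter(feat_dim=128, proj_dim=64, num_experts=4, num_tasks=3)
    feats = torch.randn(8, 128)
    tasks = torch.randint(0, 3, (8,))
    print(router(feats, tasks).shape)  # torch.Size([8, 64])
```

Because the mixture weights are soft, gradients reach every expert, which is one common way such designs trade off specialization against forgetting when new tasks arrive.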

Sources

Soft Task-Aware Routing of Experts for Equivariant Representation Learning

Panprediction: Optimal Predictions for Any Downstream Task and Loss

GraphKeeper: Graph Domain-Incremental Learning via Knowledge Disentanglement and Preservation

OMEGA: Optimized Multimodal Position Encoding Index Derivation with Global Adaptive Scaling for Vision-Language Models

HMVLM: Human Motion-Vision-Language Model via MoE LoRA

A Soft-partitioned Semi-supervised Collaborative Transfer Learning Approach for Multi-Domain Recommendation

Bridging Lifelong and Multi-Task Representation Learning via Algorithm and Complexity Measure

Dynamic Routing Between Experts: A Data-Efficient Approach to Continual Learning in Vision-Language Models

Path-Coordinated Continual Learning with Neural Tangent Kernel-Justified Plasticity: A Theoretical Framework with Near State-of-the-Art Performance

KGBridge: Knowledge-Guided Prompt Learning for Non-overlapping Cross-Domain Recommendation

RoME: Domain-Robust Mixture-of-Experts for MILP Solution Prediction across Domains

GNN-MoE: Context-Aware Patch Routing using GNNs for Parameter-Efficient Domain Generalization
