Advances in Neural Combinatorial Optimization and Reinforcement Learning

Recent work in neural combinatorial optimization and reinforcement learning targets increasingly complex tasks and larger problem instances. Researchers are exploring frameworks and techniques that improve model scalability and generalization, such as mixture-of-expert decision transformers and test-time projection learning. There is also growing interest in more efficient and interpretable multi-objective reinforcement learning methods. Noteworthy papers include:

  • Mastering Massive Multi-Task Reinforcement Learning via Mixture-of-Expert Decision Transformer, which proposes a framework for scaling multi-task reinforcement learning to a massive number of tasks.
  • Improving Generalization of Neural Combinatorial Optimization for Vehicle Routing Problems via Test-Time Projection Learning, which introduces a learning framework driven by large language models to improve scalability to larger problem instances.
  • MTL-KD: Multi-Task Learning Via Knowledge Distillation for Generalizable Neural Vehicle Routing Solver, which enables the efficient training of heavy decoder models with strong generalization ability.
  • Interpretability by Design for Efficient Multi-Objective Reinforcement Learning, which provides an effective search within contiguous solution domains using a locally linear map between the parameter space and the performance space.
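To make the phrase "locally linear map between the parameter space and the performance space" concrete, here is a minimal sketch of the general idea, not the paper's method. It assumes a hypothetical two-parameter, two-objective function (toy_objectives) standing in for policy returns, estimates the local linear map as a finite-difference Jacobian, and uses it to predict how a small parameter step changes both objectives. All names below are illustrative.

```python
from typing import Callable, List

Vector = List[float]


def toy_objectives(theta: Vector) -> Vector:
    """Hypothetical two-objective performance function (assumption, not from the paper)."""
    x, y = theta
    return [-(x - 1.0) ** 2 - y ** 2, -(x ** 2) - (y - 1.0) ** 2]


def jacobian(f: Callable[[Vector], Vector], theta: Vector, eps: float = 1e-5) -> List[Vector]:
    """Central finite-difference Jacobian: rows are objectives, columns are parameters."""
    columns = []
    for i in range(len(theta)):
        up = list(theta)
        down = list(theta)
        up[i] += eps
        down[i] -= eps
        columns.append([(a - b) / (2 * eps) for a, b in zip(f(up), f(down))])
    n_objectives = len(columns[0])
    # Transpose so that J[k][i] = d(objective k) / d(parameter i).
    return [[columns[i][k] for i in range(len(theta))] for k in range(n_objectives)]


def linear_prediction(f: Callable[[Vector], Vector], theta: Vector, delta: Vector) -> Vector:
    """First-order estimate of f(theta + delta) using the local linear map."""
    J = jacobian(f, theta)
    base = f(theta)
    return [
        base[k] + sum(J[k][i] * delta[i] for i in range(len(delta)))
        for k in range(len(base))
    ]


theta0 = [0.5, 0.5]
delta = [0.01, -0.02]
predicted = linear_prediction(toy_objectives, theta0, delta)
actual = toy_objectives([t + d for t, d in zip(theta0, delta)])
print("predicted:", predicted)
print("actual:   ", actual)
```

For small steps the linear prediction stays close to the true objective values, which is what makes searching along a contiguous region of solutions inexpensive: nearby trade-offs can be estimated without re-evaluating the policy at every candidate.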

Sources

Mastering Massive Multi-Task Reinforcement Learning via Mixture-of-Expert Decision Transformer

Rethinking Neural Combinatorial Optimization for Vehicle Routing Problems with Different Constraint Tightness Degrees

Improving Generalization of Neural Combinatorial Optimization for Vehicle Routing Problems via Test-Time Projection Learning

MTL-KD: Multi-Task Learning Via Knowledge Distillation for Generalizable Neural Vehicle Routing Solver

Interpretability by Design for Efficient Multi-Objective Reinforcement Learning

Understanding the Impact of Sampling Quality in Direct Preference Optimization
