Advancements in Reinforcement Learning and Optimization

The field of reinforcement learning and optimization is moving towards more efficient and scalable methods. Researchers are exploring new techniques to improve sample efficiency, such as hindsight regularization and reparameterization policy gradients. Additionally, there is a growing interest in multi-objective reinforcement learning and generalized planning using graph neural networks.

Noteworthy papers include:

GCHR proposes hindsight goal-conditioned regularization for sample-efficient reinforcement learning.

Reparameterization Proximal Policy Optimization establishes a connection between reparameterization policy gradients and proximal policy optimization, enabling stable and sample-efficient training.

ParBalans introduces a parallel, multi-armed-bandit-based adaptive large neighborhood search for mixed-integer programming problems and reports performance competitive with state-of-the-art commercial solvers.

GDBA Revisited proposes a guided local search framework for distributed constraint optimization problems and reports outperforming state-of-the-art baselines.

Variance Reduced Policy Gradient Method for Multi-Objective Reinforcement Learning improves sample efficiency in multi-objective reinforcement learning by applying variance-reduction techniques.

Scaling Up without Fading Out proposes a sparse, goal-aware graph neural network representation for generalized planning that scales to larger grid sizes and improves policy generalization and success rates.
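To make the bandit idea mentioned for ParBalans more concrete, here is a minimal sketch of how a multi-armed bandit can pick among several neighborhood operators based on past reward. It is not the ParBalans implementation: the choice of UCB1, the class name `UCB1`, the helper `run_demo`, and the fixed reward values are all assumptions made for illustration.

```python
import math


class UCB1:
    """UCB1 bandit over a fixed set of arms (here: hypothetical neighborhood operators)."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms   # number of times each arm was played
        self.means = [0.0] * n_arms  # running mean reward of each arm

    def select(self):
        # Play every arm once before relying on the confidence bound.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        total = sum(self.counts)
        # Score = mean reward + exploration bonus that shrinks as an arm is played more.
        scores = [
            mean + math.sqrt(2.0 * math.log(total) / count)
            for mean, count in zip(self.means, self.counts)
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm, reward):
        # Incremental update of the running mean for the played arm.
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]


def run_demo(rounds=500):
    # Assumed fixed rewards standing in for "objective improvement" per operator.
    rewards = [0.1, 0.8, 0.3]
    bandit = UCB1(len(rewards))
    for _ in range(rounds):
        arm = bandit.select()
        bandit.update(arm, rewards[arm])
    return bandit.counts


if __name__ == "__main__":
    print(run_demo())  # play counts per operator; the second should dominate
```

In a real large neighborhood search, the reward would come from the improvement each operator achieves on the incumbent solution rather than from a fixed table, and the selection rule could be any bandit policy.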

Sources

GCHR: Goal-Conditioned Hindsight Regularization for Sample-Efficient Reinforcement Learning

Reparameterization Proximal Policy Optimization

ParBalans: Parallel Multi-Armed Bandits-based Adaptive Large Neighborhood Search

GDBA Revisited: Unleashing the Power of Guided Local Search for Distributed Constraint Optimization

Energy and Quality of Surrogate-Assisted Search Algorithms: a First Analysis

An Efficient Application of Goal Programming to Tackle Multiobjective Problems with Recurring Fitness Landscapes

A Fast GRASP Metaheuristic for the Trigger Arc TSP with MIP-Based Construction and Multi-Neighborhood Local Search

Variance Reduced Policy Gradient Method for Multi-Objective Reinforcement Learning

Scaling Up without Fading Out: Goal-Aware Sparse GNN for RL-based Generalized Planning

Spirals and Beyond: Competitive Plane Search with Multi-Speed Agents
