Advances in Robust Reinforcement Learning and Multi-Agent Systems

Research in reinforcement learning and multi-agent systems is moving toward methods that remain reliable under uncertainty and partial observability. One line of work builds benchmarks and evaluation metrics that assess algorithms under conditions closer to real-world deployment. Another develops methods that learn from interaction with unknown environments and adapt as conditions change. A recurring theme is distributional robustness, in both single-agent and multi-agent settings: a policy is optimized for its worst-case performance over an uncertainty set of plausible models rather than for a single nominal model.

Three of the papers listed below illustrate these directions. 'Benchmarking Partial Observability in Reinforcement Learning with a Suite of Memory-Improvable Domains' introduces a suite of memory-improvable domains to gauge progress in mitigating partial observability. 'Efficient Solution and Learning of Robust Factored MDPs' proposes methods for both solving and learning robust factored Markov decision processes (MDPs). 'Online Robust Multi-Agent Reinforcement Learning under Model Uncertainties' studies online learning in distributionally robust Markov games and provides the first provable guarantees for this setting.
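To make "worst-case performance over an uncertainty set" concrete, here is a minimal sketch of robust value iteration on a tiny tabular MDP. It is an illustration only and is not taken from any of the listed papers. It assumes a finite uncertainty set of candidate transition models and lets the adversary pick the worst model independently for each state-action pair (the usual rectangularity assumption). The function name, array layout, and toy numbers are all hypothetical.

```python
import numpy as np


def robust_value_iteration(kernels, rewards, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Worst-case value iteration over a finite set of transition models.

    kernels: array of shape (K, S, A, S), K candidate models P_k(s' | s, a).
    rewards: array of shape (S, A).
    Returns the robust value function (S,) and a greedy policy (S,).
    """
    kernels = np.asarray(kernels, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    values = np.zeros(rewards.shape[0])
    for _ in range(max_iter):
        # Expected next-state value under every candidate model: shape (K, S, A).
        next_values = kernels @ values
        # The adversary picks the worst model separately for each (state, action).
        q_values = rewards + gamma * next_values.min(axis=0)
        new_values = q_values.max(axis=1)
        converged = np.max(np.abs(new_values - values)) < tol
        values = new_values
        if converged:
            break
    q_values = rewards + gamma * (kernels @ values).min(axis=0)
    return values, q_values.argmax(axis=1)


# Toy problem (hypothetical numbers): state 1 is absorbing with zero reward.
# In state 0, action 0 pays 1.0 but model B sends it to state 1;
# action 1 pays 0.6 and stays in state 0 under both models.
model_a = [[[1, 0], [1, 0]], [[0, 1], [0, 1]]]
model_b = [[[0, 1], [1, 0]], [[0, 1], [0, 1]]]
kernels = np.array([model_a, model_b], dtype=float)
rewards = np.array([[1.0, 0.6], [0.0, 0.0]])

robust_values, robust_policy = robust_value_iteration(kernels, rewards, gamma=0.5)
nominal_values, nominal_policy = robust_value_iteration(kernels[:1], rewards, gamma=0.5)
print("robust :", robust_values, robust_policy)
print("nominal:", nominal_values, nominal_policy)
```

In this toy problem the nominal solution (model A only) prefers the higher-paying action 0 in state 0, while the robust solution switches to action 1 because action 0 can be sent to the zero-reward absorbing state under model B. Practical distributionally robust methods typically use richer uncertainty sets, such as divergence balls around an estimated model, rather than a short list of kernels.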

Sources

Benchmarking Partial Observability in Reinforcement Learning with a Suite of Memory-Improvable Domains

Efficient Solution and Learning of Robust Factored MDPs

Exponential convergence rate for Iterative Markovian Fitting

Uncertainty Sets for Distributionally Robust Bandits Using Structural Equation Models

Online Robust Multi-Agent Reinforcement Learning under Model Uncertainties

Distributionally Robust Markov Games with Average Reward

Approximate Proportionality in Online Fair Division

Provably Near-Optimal Distributionally Robust Reinforcement Learning in Online Settings

Mechanism Design for Facility Location using Predictions

Synthetic POMDPs to Challenge Memory-Augmented RL: Memory Demand Structure Modeling

Online EFX Allocations with Predictions

Probabilistic Alternating Simulations for Policy Synthesis in Uncertain Stochastic Dynamical Systems

Domain-driven Metrics for Reinforcement Learning: A Case Study on Epidemic Control using Agent-based Simulation

Tail-Risk-Safe Monte Carlo Tree Search under PAC-Level Guarantees
