Advances in Robust Reinforcement Learning and Markov Decision Processes

The field of reinforcement learning and Markov decision processes is moving toward more robust and efficient algorithms. Recent research has focused on improving the robustness of policies under uncertainty and adversarial perturbations, with particular emphasis on algorithms that can handle epistemic uncertainty in environment dynamics. Another key direction is more efficient solution methods for Markov decision processes, including homomorphic mappings and adaptive low-rank structures. There is also growing interest in algorithms that learn to guide planning in partially observable Markov decision processes. Noteworthy papers include ADARL, which proposes a bi-level optimization framework that improves robustness by aligning policy complexity with the intrinsic dimension of the task; Pruning Cannot Hurt Robustness, which develops the first theoretical framework for certified robustness under pruning in state-adversarial Markov decision processes; and GammaZero, which introduces an action-centric graph representation framework for learning to guide planning in partially observable Markov decision processes.
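To make the robustness theme concrete, below is a minimal sketch of a robust Bellman backup taken as a worst case over a finite set of candidate transition models, one standard way of encoding epistemic uncertainty in the dynamics. The function name, the toy MDP, and the finite uncertainty set are illustrative assumptions for this sketch and are not taken from any of the papers listed under Sources.

```python
# Minimal sketch: robust value iteration over a finite set of candidate
# transition models (epistemic uncertainty in dynamics). Hypothetical example,
# not the method of any specific paper cited below.
import numpy as np

def robust_value_iteration(P_set, R, gamma=0.95, iters=500, tol=1e-8):
    """P_set: list of candidate transition tensors, each of shape (S, A, S).
    R: reward matrix of shape (S, A). Returns a robust value function of shape (S,).
    Each backup minimizes over the candidate models, so the resulting greedy
    policy is optimized against the worst-case dynamics in the set."""
    S, A = R.shape
    V = np.zeros(S)
    for _ in range(iters):
        # Q[k, s, a] = one-step return under candidate model k
        Q = np.stack([R + gamma * P @ V for P in P_set])   # shape (K, S, A)
        V_new = Q.min(axis=0).max(axis=1)                   # worst model, best action
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
    return V

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    S, A, K = 4, 2, 3
    R = rng.uniform(0, 1, size=(S, A))
    # K perturbed transition kernels standing in for an uncertainty set
    P_set = []
    for _ in range(K):
        P = rng.uniform(size=(S, A, S))
        P /= P.sum(axis=-1, keepdims=True)
        P_set.append(P)
    print(robust_value_iteration(P_set, R))
```

Taking the minimum over models before the maximum over actions is what distinguishes this backup from a standard Bellman update; with a single candidate model the two coincide.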

Sources

Homomorphic Mappings for Value-Preserving State Aggregation in Markov Decision Processes

ADARL: Adaptive Low-Rank Structures for Robust Policy Learning under Uncertainty

Pruning Cannot Hurt Robustness: Certified Trade-offs in Reinforcement Learning

Achieving Logarithmic Regret in KL-Regularized Zero-Sum Markov Games

Towards Blackwell Optimality: Bellman Optimality Is All You Can Get

Asymptotically optimal reinforcement learning in Block Markov Decision Processes

GammaZero: Learning To Guide POMDP Belief Space Search With Graph Representations

Policy Regularized Distributionally Robust Markov Decision Processes with Linear Function Approximation

Active Measuring in Reinforcement Learning With Delayed Negative Effects
