Robustness and Safety in Reinforcement Learning and Control

Research in reinforcement learning and control is moving toward methods robust and safe enough for real-world deployment, with recent work addressing uncertainty, exploration, and safety in complex systems. One key direction is distributionally robust reinforcement learning, which handles uncertainty in the transition dynamics and provides performance guarantees against worst-case perturbations. Another is safe reinforcement learning, which keeps the system within safe bounds and avoids undesirable outcomes. Conformal prediction, probabilistic safety guarantees, and robust policy synthesis are among the approaches being explored; illustrative sketches of the first two ideas follow the list below.

Noteworthy papers in this area include:

Sample Complexity of Distributionally Robust Off-Dynamics Reinforcement Learning with Online Interaction, which proposes a novel algorithm for online learning in robust Markov decision processes and analyzes its sample complexity.

Provably Efficient Sample Complexity for Robust CMDP, which establishes a sample complexity guarantee for robust constrained Markov decision processes.

Statistically Assuring Safety of Control Systems using Ensembles of Safety Filters and Conformal Prediction, which introduces a conformal prediction based framework for providing probabilistic safety guarantees.
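To make the distributionally robust idea concrete, the sketch below runs value iteration with a worst-case Bellman backup over a total-variation ball around a nominal transition kernel. This is a generic textbook-style construction, not the algorithm of any paper cited above; the tabular MDP, the radius `delta`, and all names are illustrative assumptions.

```python
import numpy as np

def worst_case_expectation(p, v, delta):
    """Worst-case E_q[v] over the total-variation ball d_TV(p, q) <= delta.

    Greedy mass transport: shift up to `delta` probability mass from the
    highest-value states onto the lowest-value state.
    """
    q = p.copy()
    budget = delta
    lo = np.argmin(v)
    for s in np.argsort(v)[::-1]:      # states, highest value first
        if budget <= 0:
            break
        if s == lo:
            continue
        moved = min(q[s], budget)
        q[s] -= moved
        q[lo] += moved
        budget -= moved
    return q @ v

def robust_value_iteration(P, R, delta, gamma=0.95, iters=500, tol=1e-8):
    """Robust value iteration for a tabular MDP (illustrative sketch).

    P: (S, A, S) nominal transition kernel, R: (S, A) rewards,
    delta: TV radius of the uncertainty set around each row of P.
    """
    S, A, _ = P.shape
    V = np.zeros(S)
    for _ in range(iters):
        Q = np.array([[R[s, a] + gamma * worst_case_expectation(P[s, a], V, delta)
                       for a in range(A)] for s in range(S)])
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            V = V_new
            break
        V = V_new
    return V, Q.argmax(axis=1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    S, A = 4, 2
    P = rng.dirichlet(np.ones(S), size=(S, A))   # hypothetical nominal dynamics
    R = rng.uniform(size=(S, A))
    V, pi = robust_value_iteration(P, R, delta=0.1)
    print("robust values:", V, "robust policy:", pi)
```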
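The conformal prediction approach to safety can likewise be sketched with standard split conformal calibration: held-out data calibrates a margin on a learned safety predictor so that a one-sided bound on the true safety margin holds with probability at least 1 - alpha. The predictor `predict_margin`, the threshold `d_safe`, and the one-sided score are hypothetical stand-ins for illustration, not the ensemble-of-safety-filters construction from the cited paper.

```python
import numpy as np

def conformal_quantile(scores, alpha):
    """Finite-sample conformal quantile of calibration scores.

    Returns the ceil((n + 1) * (1 - alpha)) / n empirical quantile, which
    yields marginal coverage of at least 1 - alpha on an exchangeable
    test point.
    """
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        return np.inf                  # too few calibration samples for this alpha
    return np.sort(scores)[k - 1]

class ConformalSafetyFilter:
    """Wraps a learned safety predictor with a calibrated margin (sketch).

    `predict_margin(x)` is assumed to estimate the true safety margin y
    (e.g., minimum obstacle distance over a short horizon); the filter
    flags a switch to a fallback controller whenever the calibrated
    lower bound on y drops below `d_safe`.
    """
    def __init__(self, predict_margin, alpha=0.05, d_safe=0.0):
        self.predict_margin = predict_margin
        self.alpha = alpha
        self.d_safe = d_safe
        self.q = np.inf

    def calibrate(self, X_cal, y_cal):
        # One-sided nonconformity score: how much the predictor
        # over-estimates the true safety margin.
        scores = np.array([self.predict_margin(x) - y
                           for x, y in zip(X_cal, y_cal)])
        self.q = conformal_quantile(scores, self.alpha)

    def is_safe(self, x):
        # With probability >= 1 - alpha, the true margin is at least
        # predict_margin(x) - q.
        return self.predict_margin(x) - self.q >= self.d_safe
```

The one-sided score is a deliberate design choice in this sketch: for safety, only a lower confidence bound on the true margin is needed, so calibration penalizes over-estimates of safety rather than symmetric errors.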

Sources

Sample Complexity of Distributionally Robust Off-Dynamics Reinforcement Learning with Online Interaction

Provably Efficient Sample Complexity for Robust CMDP

Statistically Assuring Safety of Control Systems using Ensembles of Safety Filters and Conformal Prediction

Balance Equation-based Distributionally Robust Offline Imitation Learning

Constrained and Robust Policy Synthesis with Satisfiability-Modulo-Probabilistic-Model-Checking

SafeMIL: Learning Offline Safe Imitation Policy from Non-Preferred Trajectories

Computable Characterisations of Scaled Relative Graphs of Closed Operators

Probabilistic Safety Guarantee for Stochastic Control Systems Using Average Reward MDPs

Safe and Optimal Learning from Preferences via Weighted Temporal Logic with Applications in Robotics and Formula 1

ATOM-CBF: Adaptive Safe Perception-Based Control under Out-of-Distribution Measurements

Good-for-MDP State Reduction for Stochastic LTL Planning

Runtime Safety and Reach-avoid Prediction of Stochastic Systems via Observation-aware Barrier Functions

Prophet and Secretary at the Same Time
