The field of Markov decision processes (MDPs) and Bayesian optimization is witnessing significant developments, with a focus on problems involving uncertain parameters and constraints. Researchers are exploring novel approaches to mitigating epistemic uncertainty, such as Bayesian-risk MDPs and coherent risk measures. There is also growing interest in online and episodic settings, where algorithms must adapt to changing conditions and constraints. Noteworthy papers in this area include "Policy Gradient Optimization for Bayesian-Risk MDPs with General Convex Losses", which proposes a policy gradient optimization method for Bayesian-risk MDPs, and "Beyond Slater's Condition in Online CMDPs with Stochastic and Adversarial Constraints", which provides a novel algorithm for online episodic constrained MDPs with improved guarantees.
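To illustrate the flavor of policy gradient methods under epistemic uncertainty, here is a minimal sketch, not taken from either paper: a toy one-state decision problem where per-action costs have unknown Bernoulli parameters with Beta posteriors (a hypothetical setup), and a REINFORCE-style gradient descends the Bayesian risk, i.e. the posterior-averaged expected cost.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: two actions whose costs are Bernoulli(theta_a),
# with epistemic uncertainty over theta_a captured by Beta posteriors.
posterior = [(2.0, 5.0), (5.0, 2.0)]  # Beta(alpha, beta) per action


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


def bayes_risk_grad(logits, n_post=50, n_traj=20):
    """Score-function (REINFORCE) estimate of the gradient of the
    Bayesian risk E_posterior[ E_policy[ cost ] ] w.r.t. the logits."""
    pi = softmax(logits)
    grad = np.zeros_like(logits)
    risk = 0.0
    for _ in range(n_post):
        # Sample model parameters from the posterior (epistemic draw).
        theta = np.array([rng.beta(a, b) for a, b in posterior])
        for _ in range(n_traj):
            a = rng.choice(2, p=pi)
            cost = rng.binomial(1, theta[a])
            # grad log pi(a) for a softmax policy: e_a - pi
            score = -pi.copy()
            score[a] += 1.0
            grad += cost * score
            risk += cost
    n = n_post * n_traj
    return grad / n, risk / n


logits = np.zeros(2)
for _ in range(60):
    g, risk = bayes_risk_grad(logits)
    logits -= 1.0 * g  # gradient descent on the Bayesian risk

pi = softmax(logits)
# Action 0 has the lower posterior-mean cost (Beta(2,5) mean ~0.29 vs
# Beta(5,2) mean ~0.71), so the learned policy should favor action 0.
print(pi)
```

The outer loop over posterior samples is what distinguishes this from a standard risk-neutral policy gradient: each gradient estimate averages over model parameters drawn from the posterior, so the policy optimizes against epistemic uncertainty rather than a single point estimate. The cited paper's method handles general convex losses and risk measures beyond this plain posterior expectation.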