Advances in Diffusion Models and Reinforcement Learning

The intersection of reinforcement learning and diffusion models is evolving rapidly, with a focus on improving scalability, controllability, and efficiency. Recent research applies diffusion models to offline reinforcement learning, imitation learning, and multi-agent systems. Notable advances include algorithms that leverage evolutionary search, classifier guidance, and normalizing flows to improve the performance and flexibility of diffusion models, alongside growing interest in integrating classical search algorithms with diffusion models to enable inference-time scaling and control. Overall, the field is moving toward more efficient, expressive, and generalizable models that apply to a wide range of tasks and domains. Noteworthy papers include EvoSearch, which proposes a test-time evolutionary search method for scaling image and video generation, and Normalizing Flows are Capable Models for RL, which demonstrates the potential of normalizing flows in reinforcement learning.

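Although the surveyed papers differ in detail, the test-time evolutionary search idea behind EvoSearch follows a common pattern: maintain a population of candidate initial noises, decode each with a pretrained diffusion sampler, score the results with a reward model, and refill the population by mutating the elites with fresh noise. The sketch below illustrates that loop under stated assumptions; the `denoise` and `reward` stand-ins and all hyperparameters are hypothetical placeholders, not the paper's actual implementation.

```python
# Minimal sketch of test-time evolutionary search over diffusion noise seeds.
# `denoise` and `reward` are toy stand-ins for a pretrained diffusion sampler
# and a reward/preference model; hyperparameters are illustrative only.
import numpy as np

rng = np.random.default_rng(0)

def denoise(latent):
    """Placeholder for the reverse diffusion process mapping noise to a sample."""
    return np.tanh(latent)

def reward(sample):
    """Placeholder scorer (e.g., an aesthetic or task reward model)."""
    return -float(np.mean((sample - 0.5) ** 2))

def evo_search(pop_size=16, elite_frac=0.25, noise_scale=0.3,
               generations=10, dim=64):
    population = rng.standard_normal((pop_size, dim))  # initial noise candidates
    n_elite = max(1, int(pop_size * elite_frac))
    for _ in range(generations):
        scores = np.array([reward(denoise(z)) for z in population])
        elites = population[np.argsort(scores)[-n_elite:]]  # keep top scorers
        # Refill the population by mutating randomly chosen elites.
        parents = elites[rng.integers(n_elite, size=pop_size - n_elite)]
        children = parents + noise_scale * rng.standard_normal(parents.shape)
        population = np.concatenate([elites, children])
    best = max(population, key=lambda z: reward(denoise(z)))
    return denoise(best)

sample = evo_search()
```

The key design choice is that no model weights are updated: extra compute is spent purely at sampling time, which is what makes this a test-time scaling method rather than fine-tuning.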
Sources

Scaling Image and Video Generation via Test-Time Evolutionary Search

What Do You Need for Diverse Trajectory Stitching in Diffusion Planning?

Efficient Controllable Diffusion via Optimal Classifier Guidance

Hierarchical Reinforcement Learning with Uncertainty-Guided Diffusional Subgoals

Streaming Flow Policy: Simplifying diffusion/flow-matching policies by treating action trajectories as flow trajectories

Reward-Independent Messaging for Decentralized Multi-Agent Reinforcement Learning

ReinFlow: Fine-tuning Flow Matching Policy with Online Reinforcement Learning

Oryx: a Performant and Scalable Algorithm for Many-Agent Coordination in Offline MARL

Scaling Offline RL via Efficient and Expressive Shortcut Models

Equivalence of stochastic and deterministic policy gradients

MEF-Explore: Communication-Constrained Multi-Robot Entropy-Field-Based Exploration

Enhanced DACER Algorithm with High Diffusion Efficiency

Diffusion Guidance Is a Controllable Policy Improvement Operator

Normalizing Flows are Capable Models for RL

Inference-time Scaling of Diffusion Models through Classical Search

Built with on top of