Advances in Diffusion Models for Language and Vision

The field of diffusion models is evolving rapidly, with a focus on improving reasoning ability and accelerating sampling. Recent work introduces frameworks and techniques that strengthen diffusion language models and diffusion transformers, leading to significant improvements on tasks such as logical reasoning, mathematical reasoning, and visual generation. In particular, researchers are exploring new policy gradient algorithms, reinforcement learning methods, and decoding strategies for optimizing diffusion models.

Noteworthy papers include d2, which introduces a new policy gradient algorithm for masked diffusion language models and reports state-of-the-art performance on logical and math reasoning tasks; RAPID^3, which proposes a tri-level reinforced acceleration policy for diffusion transformers, achieving nearly 3x faster sampling with competitive generation quality; and Advantage Weighted Matching, which contributes a new theoretical analysis and a policy-gradient method for diffusion models, with substantial benefits in speedup and convergence.
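
At a high level, policy-gradient training of a masked diffusion language model treats the iterative unmasking process as a stochastic policy and reweights sampled trajectories by a reward signal. The sketch below is a minimal, self-contained PyTorch illustration of that REINFORCE-style loop for a toy unmasking policy; the tiny network, reveal schedule, MASK_ID convention, and reward_fn are hypothetical stand-ins and do not reproduce the algorithm of d2, Advantage Weighted Matching, or any other paper listed here.

```python
# Toy sketch only: REINFORCE-style fine-tuning of an iterative "unmasking" policy.
# Everything here (network, reward, schedule) is an illustrative assumption.
import torch
import torch.nn as nn

VOCAB, SEQ_LEN, STEPS = 16, 8, 4   # toy vocabulary size, sequence length, unmasking steps
MASK_ID = 0                        # placeholder id for still-masked positions

# Tiny stand-in policy: maps the current partially unmasked sequence to per-position logits.
policy = nn.Sequential(nn.Linear(SEQ_LEN, 64), nn.ReLU(), nn.Linear(64, SEQ_LEN * VOCAB))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

def reward_fn(tokens):
    # Hypothetical reward: fraction of even token ids (stand-in for a real verifier/reward model).
    return (tokens % 2 == 0).float().mean(dim=1)

def train_step(batch_size=32):
    seq = torch.full((batch_size, SEQ_LEN), MASK_ID, dtype=torch.long)  # start fully masked
    masked = torch.ones(batch_size, SEQ_LEN, dtype=torch.bool)
    total_logprob = torch.zeros(batch_size)
    for step in range(STEPS):
        logits = policy(seq.float()).view(batch_size, SEQ_LEN, VOCAB)   # crude featurization
        dist = torch.distributions.Categorical(logits=logits)
        sample = dist.sample()                                          # candidate tokens
        # Unmask a random subset of still-masked positions at this step.
        reveal = (torch.rand(batch_size, SEQ_LEN) < 1.0 / (STEPS - step)) & masked
        seq = torch.where(reveal, sample, seq)
        masked &= ~reveal
        # Accumulate log-probabilities of the tokens actually revealed.
        total_logprob = total_logprob + (dist.log_prob(sample) * reveal.float()).sum(dim=1)
    rewards = reward_fn(seq)
    advantages = rewards - rewards.mean()           # simple baseline-subtracted advantage
    loss = -(advantages * total_logprob).mean()     # REINFORCE-style objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item(), rewards.mean().item()

for _ in range(100):
    loss, avg_reward = train_step()
print(f"final loss {loss:.3f}, avg reward {avg_reward:.3f}")
```

The sketch only conveys the basic recipe (sample an unmasking trajectory, score the result, push probability toward high-reward trajectories); the surveyed papers differ in how they estimate trajectory likelihoods, shape advantages, and control the number of decoding steps.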

Sources

d2: Improved Techniques for Training Reasoning Diffusion Language Models

RAPID^3: Tri-Level Reinforced Acceleration Policies for Diffusion Transformer

Taming Masked Diffusion Language Models via Consistency Trajectory Reinforcement Learning with Fewer Decoding Step

Advantage Weighted Matching: Aligning RL with Pretraining in Diffusion Models

OPPO: Accelerating PPO-based RLHF via Pipeline Overlap

DiFFPO: Training Diffusion LLMs to Reason Fast and Furious via Reinforcement Learning
