Research on diffusion language models is increasingly focused on improving reasoning performance and enabling parallel token sampling. A central challenge is capturing dependencies among tokens that are generated simultaneously, and recent work addresses it through new architectures and training techniques. Notable developments include multi-reward optimization, conditional independence testing, and hybrid discrete-continuous diffusion models. These advances have produced substantial gains in generation quality, inference speed, and controllability.
Some noteworthy papers in this area include:

- MRO: Enhancing Reasoning in Diffusion Language Models via Multi-Reward Optimization proposes a multi-reward optimization approach to improve reasoning performance.
- Parallel Sampling from Masked Diffusion Models via Conditional Independence Testing presents a model-agnostic sampler that reconciles the trade-off between conditional independence and confidence-based selection criteria.
- CANDI: Hybrid Discrete-Continuous Diffusion Models introduces a hybrid framework that decouples discrete and continuous corruption, enabling the model to learn conditional structure and continuous geometry simultaneously.

Hedged sketches illustrating the general ideas behind these three techniques follow below.
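The multi-reward idea can be sketched in a few lines: several task-specific reward signals are combined into one scalar and used in a REINFORCE-style surrogate loss. The individual rewards, their weights, and the update rule below are illustrative assumptions, not MRO's actual objective.

```python
import torch

def combined_reward(sample_text: str, reference: str) -> float:
    """Aggregate several hypothetical reward signals into one scalar.

    The rewards and weights here are illustrative stand-ins for whatever
    task-specific signals (answer correctness, step validity, fluency, ...)
    a multi-reward setup would actually use.
    """
    correctness = 1.0 if sample_text.strip() == reference.strip() else 0.0
    brevity = -0.01 * len(sample_text.split())        # mild length penalty
    formatting = 0.5 if sample_text.endswith(".") else 0.0
    weights = {"correctness": 1.0, "brevity": 0.2, "formatting": 0.1}
    return (weights["correctness"] * correctness
            + weights["brevity"] * brevity
            + weights["formatting"] * formatting)

def reinforce_loss(log_probs: torch.Tensor, rewards: torch.Tensor) -> torch.Tensor:
    """REINFORCE-style surrogate: raise the likelihood of high-reward samples.

    `log_probs` holds the summed token log-probabilities of each sampled
    completion; subtracting the batch-mean reward reduces variance.
    """
    advantages = rewards - rewards.mean()
    return -(advantages * log_probs).mean()

# Toy usage with fabricated completions and log-probabilities.
samples = ["The answer is 42.", "42", "I do not know"]
reference = "The answer is 42."
rewards = torch.tensor([combined_reward(s, reference) for s in samples])
log_probs = torch.tensor([-3.1, -2.4, -5.0], requires_grad=True)
loss = reinforce_loss(log_probs, rewards)
loss.backward()
```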
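For parallel sampling, the trade-off can be made concrete with a minimal PyTorch sketch of confidence-thresholded parallel unmasking: positions whose top-1 probability clears a threshold are committed in the same step, implicitly treating them as conditionally independent given the current context. The `model` stub, `confidence_threshold`, and greedy fallback are assumptions for illustration, not the paper's conditional independence test.

```python
import torch

def parallel_unmask_sample(model, tokens, mask_id, confidence_threshold=0.9, max_steps=64):
    """Iteratively fill MASK positions, committing several per step.

    Positions whose top-1 probability exceeds the threshold are unmasked
    together; the threshold is a crude proxy for deciding which positions
    can be sampled in parallel without hurting joint coherence.
    """
    tokens = tokens.clone()
    for _ in range(max_steps):
        masked = tokens == mask_id
        if not masked.any():
            break
        logits = model(tokens)                      # (seq_len, vocab)
        probs = torch.softmax(logits, dim=-1)
        top_probs, top_ids = probs.max(dim=-1)      # per-position confidence
        # Only still-masked positions are candidates for commitment.
        candidate_conf = torch.where(masked, top_probs, torch.zeros_like(top_probs))
        commit = masked & (candidate_conf >= confidence_threshold)
        if not commit.any():
            # Guarantee progress: commit the single most confident masked position.
            commit = torch.zeros_like(masked)
            commit[candidate_conf.argmax()] = True
        tokens[commit] = top_ids[commit]
    return tokens

# Usage with a stand-in "model" that returns random logits.
vocab_size, seq_len, mask_id = 100, 16, 0
dummy_model = lambda t: torch.randn(t.shape[0], vocab_size)
out = parallel_unmask_sample(dummy_model, torch.full((seq_len,), mask_id), mask_id)
print(out)
```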
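Finally, a rough sketch of decoupled hybrid corruption: a discrete masking channel and a continuous Gaussian-noise channel are applied with separate knobs. The `p_mask` and `sigma` parameters and the embedding-level noise are assumptions made for illustration, not CANDI's actual noise schedules.

```python
import torch

def hybrid_corrupt(token_ids, embeddings, mask_id, p_mask=0.3, sigma=0.5):
    """Apply discrete and continuous corruption through decoupled channels.

    Discrete channel: each position is replaced by MASK with probability p_mask.
    Continuous channel: Gaussian noise of scale sigma is added to the embeddings.
    Both knobs are illustrative; a real hybrid model would tie them to its own
    discrete and continuous noise schedules.
    """
    mask = torch.rand(token_ids.shape) < p_mask
    corrupted_ids = torch.where(mask, torch.full_like(token_ids, mask_id), token_ids)
    noisy_embeddings = embeddings + sigma * torch.randn_like(embeddings)
    return corrupted_ids, noisy_embeddings, mask

# Toy usage: 8 tokens with 16-dimensional embeddings.
ids = torch.randint(1, 100, (8,))
embs = torch.randn(8, 16)
corrupted_ids, noisy_embs, mask = hybrid_corrupt(ids, embs, mask_id=0)
```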