Advances in Diffusion Large Language Models

The field of natural language processing is shifting toward diffusion large language models (dLLMs), which offer advantages such as accelerated parallel decoding and bidirectional context modeling. Recent work has focused on improving the efficiency and quality of dLLMs, with particular emphasis on speculative decoding, diffusion-based drafting, and parallel decoding strategies. These innovations have delivered substantial speedups and quality improvements, making dLLMs a competitive alternative to autoregressive models. Notable papers in this area include SelfJudge, which proposes self-supervised judge verification for speculative decoding, and DiffuSpec, which introduces a diffusion-based drafting framework for parallel decoding. CoDA presents a lightweight diffusion coder that achieves results competitive with larger models, while Rainbow Padding addresses early termination in instruction-tuned dLLMs. Other noteworthy papers include Self Speculative Decoding for Diffusion Large Language Models, ParallelBench, Finish First, Perfect Later, and Accelerating Diffusion LLM Inference via Local Determinism Propagation.
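Most of the speculative-decoding work above shares a common draft-and-verify pattern: a cheap drafter proposes a block of tokens in parallel, and a target model accepts the longest prefix it agrees with before correcting the first mismatch. The following is a minimal sketch of that generic pattern only, with hypothetical toy callables standing in for the drafter and verifier; it does not reproduce the specific acceptance rules of SelfJudge, DiffuSpec, or any other paper listed here, and a real implementation would verify the whole draft block in a single forward pass rather than token by token.

```python
# Minimal draft-and-verify speculative decoding sketch (assumptions:
# draft_step and verify_step are hypothetical stand-ins, not real model APIs).

from typing import Callable, List

Token = int

def speculative_decode(
    draft_step: Callable[[List[Token]], List[Token]],  # proposes a block of draft tokens
    verify_step: Callable[[List[Token]], Token],       # target model's next-token choice
    prompt: List[Token],
    max_new_tokens: int,
) -> List[Token]:
    out = list(prompt)
    while len(out) - len(prompt) < max_new_tokens:
        draft = draft_step(out)            # cheap parallel draft of several tokens
        accepted = 0
        for tok in draft:
            target = verify_step(out)      # target model's prediction at this position
            if tok == target:              # accept the matching draft token
                out.append(tok)
                accepted += 1
            else:
                out.append(target)         # replace the first mismatch and stop
                break
        if accepted == len(draft):         # whole draft accepted: take one bonus token
            out.append(verify_step(out))
    return out[: len(prompt) + max_new_tokens]

# Toy usage with deterministic stand-ins over a vocabulary of 7 token ids.
if __name__ == "__main__":
    drafter = lambda ctx: [(ctx[-1] + i) % 7 for i in range(1, 4)]
    verifier = lambda ctx: (ctx[-1] + 1) % 7
    print(speculative_decode(drafter, verifier, [0], 8))
```

The speedup comes from the verifier checking a whole drafted block at once instead of generating one token per forward pass; the papers above differ mainly in how the draft is produced (e.g. by a diffusion model) and how acceptance is judged.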

Sources

SelfJudge: Faster Speculative Decoding via Self-Supervised Judge Verification

DiffuSpec: Unlocking Diffusion Language Models for Speculative Decoding

CoDA: Coding LM via Diffusion Adaptation

Rainbow Padding: Mitigating Early Termination in Instruction-Tuned Diffusion LLMs

Self Speculative Decoding for Diffusion Large Language Models

ParallelBench: Understanding the Trade-offs of Parallel Decoding in Diffusion LLMs

Finish First, Perfect Later: Test-Time Token-Level Cross-Validation for Diffusion Large Language Models

Accelerating Diffusion LLM Inference via Local Determinism Propagation
