Advances in Efficient Sampling and Generation

Research in machine learning and computational modeling is moving toward more efficient and effective methods for sampling and generation. Recent work has focused on improving the speed and accuracy of algorithms such as diffusion-based large language models and Markov chain Monte Carlo methods. Notable advances include the proposal of Early Diffusion Inference Termination, which reduces the number of diffusion steps by up to 68.3% while preserving accuracy, and the development of parallelizable samplers for Boltzmann machines, which enable more efficient learning and improved performance. In addition, research on symplectic methods for stochastic Hamiltonian systems has shown promising results for long-time simulations. Other notable papers include Unlocking the Power of Boltzmann Machines by Parallelizable Sampler and Efficient Temperature Estimation, which proposes a new sampler and temperature-estimation method for Boltzmann machines, and Decoding Large Language Diffusion Models with Foreseeing Movement, which introduces a novel decoding method for large language diffusion models. Fast-Decoding Diffusion Language Models via Progress-Aware Confidence Schedules is also noteworthy: it presents a training-free, model-agnostic early-exit algorithm that achieves large, stable accelerations while retaining high performance.
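To make the early-exit idea concrete, the sketch below shows one generic way a progress-aware confidence schedule can terminate masked-diffusion decoding early: at each step, positions whose predicted confidence clears a threshold that relaxes with decoding progress are committed, and decoding stops as soon as every position is filled. The model stub `predict_logits`, the linear threshold schedule, and all parameter values are illustrative assumptions and do not reproduce the algorithms of the cited papers.

```python
import numpy as np

def softmax(logits, axis=-1):
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def predict_logits(tokens, masked, vocab_size, rng):
    """Placeholder for a masked diffusion LM forward pass (assumption).
    Returns per-position logits over the vocabulary."""
    return rng.normal(size=(tokens.shape[0], vocab_size))

def early_exit_decode(seq_len=32, vocab_size=100, max_steps=64,
                      tau_start=0.9, tau_end=0.5, seed=0):
    """Confidence-scheduled decoding with early termination (illustrative)."""
    rng = np.random.default_rng(seed)
    MASK = -1
    tokens = np.full(seq_len, MASK)            # start fully masked
    for step in range(max_steps):
        masked = tokens == MASK
        if not masked.any():                   # early exit: all positions committed
            return tokens, step
        # Threshold relaxes linearly as decoding progresses.
        progress = step / max(max_steps - 1, 1)
        tau = tau_start + (tau_end - tau_start) * progress
        probs = softmax(predict_logits(tokens, masked, vocab_size, rng))
        conf = probs.max(axis=-1)
        choice = probs.argmax(axis=-1)
        # Commit masked positions whose confidence clears the current threshold.
        commit = masked & (conf >= tau)
        if not commit.any():                   # force at least one commitment per step
            idx = np.where(masked)[0][conf[masked].argmax()]
            commit[idx] = True
        tokens[commit] = choice[commit]
    return tokens, max_steps

if __name__ == "__main__":
    tokens, steps_used = early_exit_decode()
    print(f"decoded in {steps_used} of 64 steps")
```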
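In the same spirit, a minimal sketch of a parallelizable sampler is block Gibbs sampling on a restricted Boltzmann machine, whose bipartite structure lets an entire layer be sampled at once given the other layer. This is a standard textbook construction shown only for illustration; it is not the sampler or the temperature-estimation method proposed in the cited paper, and the temperature parameter `T` is an assumption.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def block_gibbs_rbm(W, b_vis, b_hid, n_steps=1000, T=1.0, seed=0):
    """Block Gibbs sampling for a restricted Boltzmann machine.

    Because the RBM graph is bipartite, every hidden unit is conditionally
    independent given the visible layer (and vice versa), so each half-step
    samples an entire layer in parallel with one matrix product.
    """
    rng = np.random.default_rng(seed)
    n_vis, n_hid = W.shape
    v = rng.integers(0, 2, size=n_vis).astype(float)
    beta = 1.0 / T                                   # inverse temperature (illustrative)
    for _ in range(n_steps):
        p_h = sigmoid(beta * (v @ W + b_hid))        # all hidden units at once
        h = (rng.random(n_hid) < p_h).astype(float)
        p_v = sigmoid(beta * (h @ W.T + b_vis))      # all visible units at once
        v = (rng.random(n_vis) < p_v).astype(float)
    return v, h

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    W = 0.1 * rng.normal(size=(16, 8))               # toy coupling matrix
    v, h = block_gibbs_rbm(W, np.zeros(16), np.zeros(8))
    print("visible sample:", v.astype(int))
```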
Sources
Making the RANMAR pseudorandom number generator in LAMMPS up to four times faster, with an implementation of jump-ahead
Ergodicity and invariant measure approximation of the stochastic Cahn-Hilliard equation via an explicit fully discrete scheme
Deconstructing Generative Diversity: An Information Bottleneck Analysis of Discrete Latent Generative Models
Unlocking the Power of Boltzmann Machines by Parallelizable Sampler and Efficient Temperature Estimation