The field of diffusion models is rapidly evolving, with a focus on improving convergence, accelerating inference, and enhancing generative capabilities. Recent developments have led to the creation of more efficient and effective models, enabling faster and more accurate image generation, compression, and translation. Notable advancements include the use of structured latent spaces, similarity-aware feature reuse, and rectified flow for improved inversion processes. These innovations have significant implications for various applications, including image synthesis, steganography, and topology optimization.
Some noteworthy papers in this area include:

- DC-AE 1.5: a new family of deep compression autoencoders for high-resolution diffusion models, achieving faster convergence and better diffusion scaling results.
- Sortblock: a training-free inference acceleration framework that dynamically caches block-wise features based on their similarity across adjacent timesteps, yielding over 2x inference speedup with minimal degradation in output quality.
- RF-Stego: a generative image steganography method that enables accurate latent inversion, outperforming state-of-the-art methods in extraction accuracy, image quality, robustness, security, and generation efficiency.
- HierarchicalPrune: a compression framework that combines hierarchical position pruning, positional weight preservation, and sensitivity-guided distillation to reduce the memory footprint and latency of large-scale diffusion models while preserving output quality.
- SODEC: a single-step diffusion image compression model that leverages a pre-trained VAE-based model and a fidelity guidance module to achieve superior rate-distortion-perception performance while decoding more than 20x faster.
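The similarity-aware feature reuse behind Sortblock can be illustrated with a minimal sketch: if a block's input at the current timestep is nearly identical to the input it saw at the last computed timestep, the cached output is returned instead of re-running the block. This is a simplified illustration under assumed design choices (cosine similarity, a fixed reuse threshold); the `BlockCache` class and its names are hypothetical and are not Sortblock's actual API.

```python
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two flattened feature tensors."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))


class BlockCache:
    """Hypothetical sketch of similarity-aware block-wise feature reuse.

    If the block's current input is sufficiently similar to the input at
    the last recomputed timestep, reuse the cached output; otherwise
    recompute and refresh the cache.
    """

    def __init__(self, threshold: float = 0.95):
        self.threshold = threshold
        self.cached_input = None
        self.cached_output = None
        self.recomputes = 0  # for inspecting how often we actually run the block

    def __call__(self, block_fn, x: np.ndarray) -> np.ndarray:
        if (self.cached_input is not None
                and cosine_similarity(x, self.cached_input) >= self.threshold):
            # Inputs barely changed across adjacent timesteps: skip the block.
            return self.cached_output
        self.cached_input = x
        self.cached_output = block_fn(x)
        self.recomputes += 1
        return self.cached_output


# Usage: a toy "block" whose input changes little between most timesteps.
cache = BlockCache(threshold=0.95)
block = lambda x: 2.0 * x

y1 = cache(block, np.ones(4))                      # recomputed
y2 = cache(block, np.ones(4))                      # reused (identical input)
y3 = cache(block, np.array([1.0, -1.0, 1.0, -1.0]))  # recomputed (dissimilar)
```

In the real framework, the reuse decision is made per transformer block and the threshold trades speed against output fidelity; this sketch only captures the caching logic itself.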