Diffusion Model Advancements

Recent work on diffusion models concentrates on three goals: faster training convergence, faster inference, and stronger generative capability. The resulting models are more efficient and support faster, more accurate image generation, compression, and translation. Notable advances include structured latent spaces, similarity-aware feature reuse, and rectified flow for more accurate latent inversion. These techniques carry over to applications such as image synthesis, steganography, and topology optimization.

Some noteworthy papers in this area include:

DC-AE 1.5 introduces a new family of deep compression autoencoders for high-resolution diffusion models, achieving faster convergence and better diffusion scaling results.

Sortblock is a training-free inference acceleration framework that dynamically caches block-wise features based on their similarity across adjacent timesteps, yielding over 2x inference speedup with minimal degradation in output quality.

RF-Stego is a generative image steganography method that uses rectified flow for accurate latent inversion. It is reported to outperform state-of-the-art methods in extraction accuracy, image quality, robustness, security, and generation efficiency.

HierarchicalPrune is a compression framework that combines hierarchical position pruning, positional weight preservation, and sensitivity-guided distillation to reduce the memory footprint and latency of large-scale diffusion models while preserving output quality.

SODEC is a single-step diffusion image compression model that uses a pre-trained VAE-based model and a fidelity guidance module to achieve superior rate-distortion-perception performance and improve decoding speed by more than 20x.
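The idea behind similarity-aware feature reuse, as summarized for Sortblock above, can be illustrated with a small sketch. This is not the Sortblock algorithm itself: the summary only says that block-wise features are cached based on their similarity across adjacent timesteps, so the cosine-similarity test, the fixed threshold, and every name below (CachedBlock, make_block, run_sampler) are assumptions made for illustration. The toy "blocks" are simple element-wise functions standing in for the expensive layers of a real denoiser.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class CachedBlock:
    """Wraps one block and reuses its last output when the input barely changed.

    Hypothetical helper: the reuse rule (cosine similarity >= threshold) is an
    assumption, not the criterion used by any specific paper.
    """

    def __init__(self, fn, threshold):
        self.fn = fn
        self.threshold = threshold
        self.cached_input = None
        self.cached_output = None
        self.computed = 0  # number of real forward passes
        self.reused = 0    # number of cache hits

    def __call__(self, x):
        if (
            self.cached_input is not None
            and cosine_similarity(x, self.cached_input) >= self.threshold
        ):
            self.reused += 1
            return self.cached_output
        out = self.fn(x)
        self.cached_input = list(x)
        self.cached_output = out
        self.computed += 1
        return out


def make_block(scale):
    """Toy stand-in for an expensive network block."""
    return lambda x: [math.tanh(v * scale) for v in x]


def run_sampler(blocks, x, num_steps, step_size=0.05):
    """Toy iterative sampler: each step runs all blocks, then nudges x."""
    for _ in range(num_steps):
        h = x
        for block in blocks:
            h = block(h)
        x = [xi - step_size * hi for xi, hi in zip(x, h)]
    return x


if __name__ == "__main__":
    blocks = [CachedBlock(make_block(s), threshold=0.999) for s in (0.5, 1.0)]
    run_sampler(blocks, [1.0, 0.5, -0.25, 2.0], num_steps=20)
    for i, block in enumerate(blocks):
        print(f"block {i}: computed={block.computed}, reused={block.reused}")
```

The threshold controls the trade-off the summary describes: a higher value recomputes more often and stays closer to the uncached output, while a lower value skips more work at the cost of larger deviation. Because no retraining is involved, a wrapper of this kind can in principle be applied to an already trained model.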

Sources

DC-AE 1.5: Accelerating Diffusion Model Convergence with Structured Latent Space

Sortblock: Similarity-Aware Feature Reuse for Diffusion Model

Accurate Latent Inversion for Generative Image Steganography via Rectified Flow

Learning Latent Representations for Image Translation using Frequency Distributed CycleGAN

HierarchicalPrune: Position-Aware Compression for Large-Scale Diffusion Models

Steering One-Step Diffusion Model with Fidelity-Rich Decoder for Fast Image Compression

Latent Space Diffusion for Topology Optimization
