Ultra-High-Resolution Image Generation Advances

Image generation research is moving toward ultra-high-resolution synthesis with improved fidelity and detail. Recent work addresses the challenges of scaling diffusion models to higher resolutions, notably by reducing attention complexity and incorporating hierarchical local attention. Noteworthy papers include QSilk, which introduces a lightweight stabilization layer for latent diffusion, and Scale-DiT, which presents a diffusion framework built on hierarchical local attention. The Positional Encoding Field and DyPE papers take new approaches to modeling geometry and to extrapolating positional encodings, respectively, for ultra-high-resolution image generation. The UltraHR-100K dataset provides a resource for training and evaluating ultra-high-resolution text-to-image models.
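
To give a rough sense of why attention complexity matters at these resolutions, the sketch below counts query-key pairs for global self-attention and for attention restricted to non-overlapping local windows. It is an illustrative back-of-the-envelope calculation only, not the mechanism of Scale-DiT or any other paper listed here; the function names, the 256 x 256 token grid, and the window size of 16 are all assumptions chosen for the example.

```python
def global_attention_pairs(height: int, width: int) -> int:
    """Query-key pairs when every token attends to every other token."""
    tokens = height * width
    return tokens * tokens


def local_attention_pairs(height: int, width: int, window: int) -> int:
    """Query-key pairs when tokens attend only within their own
    non-overlapping window x window tile (hypothetical tiling scheme)."""
    if height % window or width % window:
        raise ValueError("grid must be divisible by the window size")
    num_windows = (height // window) * (width // window)
    tokens_per_window = window * window
    return num_windows * tokens_per_window * tokens_per_window


if __name__ == "__main__":
    # Assumed example: a 256 x 256 grid of latent tokens, 16 x 16 windows.
    g = global_attention_pairs(256, 256)
    l = local_attention_pairs(256, 256, 16)
    print(f"global: {g:,} pairs")
    print(f"local:  {l:,} pairs")
    print(f"reduction factor: {g // l}x")
```

Under these assumptions the global cost grows with the square of the token count, while the windowed cost grows linearly in the number of windows, which is the general motivation for local attention schemes at very high resolutions.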

Sources

QSilk: Micrograin Stabilization and Adaptive Quantile Clipping for Detail-Friendly Latent Diffusion

Scale-DiT: Ultra-High-Resolution Image Generation with Hierarchical Local Attention

Positional Encoding Field

UltraHR-100K: Enhancing UHR Image Synthesis with A Large-Scale High-Quality Dataset

DyPE: Dynamic Position Extrapolation for Ultra High Resolution Diffusion
