Accelerating Autoregressive Models for Image Generation

The field of autoregressive image generation is moving toward faster and more efficient architectures. Recent work focuses on reducing the computational complexity of these models while preserving their high-quality synthesis. One key direction is conditional score distillation, which enables one-step sampling for image autoregressive models (sketched after the list below). Another is nested autoregressive architectures, which reduce overall model complexity and increase image diversity. There is also growing interest in leveraging spatial context to accelerate autoregressive text-to-image generation. Notable papers in this area include:

- Distilled Decoding 2, which achieves one-step sampling for image autoregressive models with minimal performance degradation.
- Nested AutoRegressive Models, which proposes a nested autoregressive architecture to reduce computational complexity and increase image diversity.
- FARMER, which unifies normalizing flows and autoregressive models for tractable likelihood estimation and high-quality image synthesis.
- Hawk, which leverages spatial context to accelerate autoregressive text-to-image generation.
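To make the speed gap concrete, here is a minimal, hypothetical PyTorch sketch contrasting standard token-by-token autoregressive decoding with a distilled one-step generator, the setting Distilled Decoding 2 targets. The model classes, vocabulary size, and shapes are illustrative stand-ins, not the papers' architectures, and the conditional score distillation objective used to train the student is omitted.

```python
# Illustrative sketch only: standard AR decoding needs one forward pass per
# token, while a distilled student emits every token in a single pass.
# All classes and sizes here are hypothetical, not the papers' models.
import torch
import torch.nn as nn

VOCAB, SEQ_LEN, DIM = 1024, 256, 512  # e.g. a 16x16 grid of image tokens

class TinyARModel(nn.Module):
    """Stand-in teacher: predicts the next token from the prefix."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DIM)
        self.body = nn.GRU(DIM, DIM, batch_first=True)
        self.head = nn.Linear(DIM, VOCAB)

    def forward(self, tokens):
        h, _ = self.body(self.embed(tokens))
        return self.head(h[:, -1])  # logits for the next token only

@torch.no_grad()
def ar_decode(model, batch=1):
    """Standard AR sampling: SEQ_LEN sequential model calls."""
    tokens = torch.zeros(batch, 1, dtype=torch.long)  # BOS placeholder
    for _ in range(SEQ_LEN):
        logits = model(tokens)
        nxt = torch.multinomial(logits.softmax(-1), 1)
        tokens = torch.cat([tokens, nxt], dim=1)
    return tokens[:, 1:]

class OneStepGenerator(nn.Module):
    """Stand-in distilled student: maps noise to all tokens at once."""
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(DIM, SEQ_LEN * VOCAB)

    def forward(self, z):
        return self.net(z).view(-1, SEQ_LEN, VOCAB)

@torch.no_grad()
def one_step_decode(gen, batch=1):
    logits = gen(torch.randn(batch, DIM))
    return logits.argmax(-1)  # the whole image in one forward pass

if __name__ == "__main__":
    teacher, student = TinyARModel(), OneStepGenerator()
    print(ar_decode(teacher).shape)        # 256 sequential calls
    print(one_step_decode(student).shape)  # a single call
```

The point of the distillation is that the student's single forward pass is trained to match samples from the teacher's full sequential distribution, which is where the latency reduction comes from; the training loss itself is what the paper contributes and is not reproduced here.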

Sources

Distilled Decoding 2: One-step Sampling of Image Auto-regressive Models with Conditional Score Distillation

Nested AutoRegressive Models

Autoregressive Styled Text Image Generation, but Make it Reliable

FARMER: Flow AutoRegressive Transformer over Pixels

Hawk: Leveraging Spatial Context for Faster Autoregressive Text-to-Image Generation
