Advances in Text-to-Image Generation

Text-to-image generation is evolving rapidly, with current work focused on improving the quality and controllability of generated images. Recent developments center on personalizing text-to-image diffusion models so that they produce more diverse and accurate images. There has also been a push toward more effective methods for detecting and preventing the generation of Not Safe For Work (NSFW) content.

Noteworthy papers in this area include the following. LAMIC introduces a layout-aware multi-image composition framework that achieves state-of-the-art performance in controllable image synthesis. Wukong is a transformer-based NSFW detection framework that uses intermediate outputs from early denoising steps to flag unsafe content without waiting for the full image to be generated. YOLO-Count is a differentiable open-vocabulary object counting model that enables precise quantity control in text-to-image generation. UNCAGE is a training-free method that improves compositional fidelity by using attention maps to prioritize the unmasking of tokens that clearly represent individual objects.
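To make the early-detection idea attributed to Wukong more concrete, here is a minimal sketch of a denoising loop that inspects intermediate outputs during the first few steps and stops as soon as one is flagged. It is not Wukong's actual architecture: the function `generate_with_early_check`, its parameters, and the toy `toy_denoise` and `toy_score` stand-ins are all hypothetical, chosen only to illustrate the control flow.

```python
from typing import Callable, List, Optional, Tuple

Latent = List[float]  # toy stand-in for a real latent tensor (assumption)


def generate_with_early_check(
    latent: Latent,
    denoise_step: Callable[[Latent, int], Latent],
    unsafe_score: Callable[[Latent, int], float],
    num_steps: int = 50,
    check_steps: int = 5,
    threshold: float = 0.5,
) -> Tuple[Optional[Latent], int]:
    """Run a denoising loop and abort if an early step is flagged.

    Returns (final_latent, steps_run). final_latent is None when the
    hypothetical detector flagged an intermediate output.
    """
    for step in range(num_steps):
        latent = denoise_step(latent, step)
        # Only the first `check_steps` intermediate outputs are inspected,
        # so a flagged request stops long before the full schedule runs.
        if step < check_steps and unsafe_score(latent, step) >= threshold:
            return None, step + 1
    return latent, num_steps


# Toy stand-ins so the sketch runs end to end (not real models).
def toy_denoise(latent: Latent, step: int) -> Latent:
    return [0.9 * x for x in latent]


def toy_score(latent: Latent, step: int) -> float:
    return 1.0 if max(latent) > 5.0 else 0.0


if __name__ == "__main__":
    print(generate_with_early_check([10.0], toy_denoise, toy_score)[1])
    print(generate_with_early_check([1.0], toy_denoise, toy_score)[1])
```

In a real system the detector would be a trained model operating on the diffusion model's intermediate representations, and the number of inspected steps and the threshold would need tuning. The point of the sketch is only that a flagged request costs a handful of denoising steps rather than the full schedule.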

Sources

Steering Guidance for Personalized Text-to-Image Diffusion Models

LAMIC: Layout-Aware Multi-Image Composition via Scalability of Multimodal Diffusion Transformer

Wukong Framework for Not Safe For Work Detection in Text-to-Image systems

YOLO-Count: Differentiable Object Counting for Text-to-Image Generation

Diffusion Models with Adaptive Negative Sampling Without External Resources

Seeing It Before It Happens: In-Generation NSFW Detection for Diffusion-Based Text-to-Image Models

Draw Your Mind: Personalized Generation via Condition-Level Modeling in Text-to-Image Diffusion Models

PLA: Prompt Learning Attack against Text-to-Image Generative Models

Zero-Residual Concept Erasure via Progressive Alignment in Text-to-Image Model

UNCAGE: Contrastive Attention Guidance for Masked Generative Transformers in Text-to-Image Generation
