Advances in Text-to-Image Models and Fairness

Text-to-image research is advancing quickly, with growing attention to fairness and bias in generated images. Recent work introduces frameworks for both debiasing and evaluating text-to-image models, including post-hoc debiasing methods and multimodal reward modeling, with the aim of improving the fairness and accuracy of generation. Noteworthy papers include FairImagen, which introduces a post-hoc (post-processing) debiasing framework, and FairJudge, which uses a multimodal LLM as a judge to evaluate social attributes and prompt-image alignment. Semantic Surgery and SceneDecorator contribute to zero-shot concept erasure in diffusion models and scene-oriented story generation, respectively. Overall, the field is moving toward more equitable and accurate text-to-image generation, with growing emphasis on fairness, transparency, and accountability.

Sources

FairImagen: Post-Processing for Bias Mitigation in Text-to-Image Models

Prompt fidelity of ChatGPT4o / Dall-E3 text-to-image visualisations

AesCrop: Aesthetic-driven Cropping Guided by Composition

FairJudge: MLLM Judging for Social Attributes and Prompt Image Alignment

Semantic Surgery: Zero-Shot Concept Erasure in Diffusion Models

SceneDecorator: Towards Scene-Oriented Story Generation with Scene Planning and Scene Consistency

M³T2IBench: A Large-Scale Multi-Category, Multi-Instance, Multi-Relation Text-to-Image Benchmark

Beyond Direct Generation: A Decomposed Approach to Well-Crafted Screenwriting with LLMs

Omni-Reward: Towards Generalist Omni-Modal Reward Modeling with Free-Form Preferences

More Than Generation: Unifying Generation and Depth Estimation via Text-to-Image Diffusion Models

MIRO: MultI-Reward cOnditioned pretraining improves T2I quality and efficiency