Mitigating Social Biases in AI Models

The field of artificial intelligence is moving towards fairer, less biased models. Recent research has focused on identifying and mitigating social biases in large language models and vision-language models. Metamorphic relations, difference-aware fairness, and model merging algorithms have shown promising results in reducing bias while maintaining model performance, and fully unsupervised self-debiasing methods have been proposed for text-to-image diffusion models; two of these ideas are sketched in the code below. Noteworthy papers include:

Bias Testing and Mitigation in Black Box LLMs using Metamorphic Relations, which introduces a unified framework for bias evaluation and mitigation.

BioPro: On Difference-Aware Gender Fairness for Vision-Language Models, which proposes a training-free framework for selective debiasing.

Fully Unsupervised Self-debiasing of Text-to-Image Diffusion Models, which introduces a test-time debiasing method applicable to any diffusion model.
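To make the metamorphic-relations idea concrete, here is a minimal sketch of a metamorphic bias test for a black-box LLM: a prompt is perturbed only in a demographic term, and the relation requires the two outputs to remain consistent. The `query_model` stub and the string-similarity check are illustrative assumptions, not the framework's actual API or consistency metric.

```python
# Minimal sketch of metamorphic bias testing for a black-box LLM.
# Metamorphic relation: swapping a demographic term in an otherwise
# identical prompt should not materially change the model's output.
from difflib import SequenceMatcher

def query_model(prompt: str) -> str:
    # Hypothetical placeholder; replace with a real black-box LLM API call.
    return f"Response to: {prompt}"

def metamorphic_bias_test(template: str, group_a: str, group_b: str,
                          threshold: float = 0.8) -> bool:
    """Return True if outputs for the two demographic variants are
    similar enough to satisfy the metamorphic relation."""
    out_a = query_model(template.format(group=group_a))
    out_b = query_model(template.format(group=group_b))
    # Crude lexical-similarity proxy; a real harness would use a
    # task-appropriate consistency measure.
    similarity = SequenceMatcher(None, out_a, out_b).ratio()
    return similarity >= threshold

template = "Describe the career prospects of a {group} software engineer."
print("Metamorphic relation holds:",
      metamorphic_bias_test(template, "male", "female"))
```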
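Likewise, a minimal sketch of model merging via linear weight interpolation, one of the simplest algorithms a survey of merging methods would cover; the toy models and the mixing coefficient `alpha` are assumptions for illustration, not a method from the cited paper.

```python
# Minimal sketch of model merging by linear weight interpolation
# between a base checkpoint and a debiased fine-tune.
import torch.nn as nn

base_model = nn.Linear(16, 4)      # stand-in for the original model
debiased_model = nn.Linear(16, 4)  # stand-in for a debiased fine-tune

alpha = 0.5  # interpolation weight between the two checkpoints
debiased_state = debiased_model.state_dict()
merged_state = {
    name: (1 - alpha) * param + alpha * debiased_state[name]
    for name, param in base_model.state_dict().items()
}

merged_model = nn.Linear(16, 4)
merged_model.load_state_dict(merged_state)
```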

Sources

Bias Testing and Mitigation in Black Box LLMs using Metamorphic Relations

BioPro: On Difference-Aware Gender Fairness for Vision-Language Models

An Empirical Survey of Model Merging Algorithms for Social Bias Mitigation

Fully Unsupervised Self-debiasing of Text-to-Image Diffusion Models

Geschlechts\"ubergreifende Maskulina im Sprachgebrauch Eine korpusbasierte Untersuchung zu lexemspezifischen Unterschieden

Aligned but Stereotypical? The Hidden Influence of System Prompts on Social Bias in LVLM-Based Text-to-Image Models
