Mitigating Social Biases in AI Models

The field of artificial intelligence is moving towards fairer, less biased models. Recent research has focused on identifying and mitigating social biases in large language models and vision-language models. Metamorphic relations, difference-aware fairness, and model merging algorithms have shown promising results in reducing bias while maintaining model performance, and fully unsupervised self-debiasing methods have been proposed for text-to-image diffusion models; two of these ideas are sketched in the code below. Noteworthy papers include:

Bias Testing and Mitigation in Black Box LLMs using Metamorphic Relations, which introduces a unified framework for bias evaluation and mitigation.

BioPro: On Difference-Aware Gender Fairness for Vision-Language Models, which proposes a training-free framework for selective debiasing.

Fully Unsupervised Self-debiasing of Text-to-Image Diffusion Models, which introduces a test-time debiasing method applicable to any diffusion model.
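To make the metamorphic-relations idea concrete, here is a minimal sketch of a metamorphic bias test for a black-box LLM: a prompt is perturbed only in a demographic term, and the relation requires the two outputs to remain consistent. The `query_model` stub and the string-similarity check are illustrative assumptions, not the framework's actual API or consistency metric.

```python
# Minimal sketch of metamorphic bias testing for a black-box LLM.
# Metamorphic relation: swapping a demographic term in an otherwise
# identical prompt should not materially change the model's output.
from difflib import SequenceMatcher

def query_model(prompt: str) -> str:
    # Hypothetical placeholder; replace with a real black-box LLM API call.
    return f"Response to: {prompt}"

def metamorphic_bias_test(template: str, group_a: str, group_b: str,
                          threshold: float = 0.8) -> bool:
    """Return True if outputs for the two demographic variants are
    similar enough to satisfy the metamorphic relation."""
    out_a = query_model(template.format(group=group_a))
    out_b = query_model(template.format(group=group_b))
    # Crude lexical-similarity proxy; a real harness would use a
    # task-appropriate consistency measure.
    similarity = SequenceMatcher(None, out_a, out_b).ratio()
    return similarity >= threshold

template = "Describe the career prospects of a {group} software engineer."
print("Metamorphic relation holds:",
      metamorphic_bias_test(template, "male", "female"))
```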
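Likewise, a minimal sketch of model merging via linear weight interpolation, one of the simplest algorithms a survey of merging methods would cover; the toy models and the mixing coefficient `alpha` are assumptions for illustration, not a method from the cited paper.

```python
# Minimal sketch of model merging by linear weight interpolation
# between a base checkpoint and a debiased fine-tune.
import torch.nn as nn

base_model = nn.Linear(16, 4)      # stand-in for the original model
debiased_model = nn.Linear(16, 4)  # stand-in for a debiased fine-tune

alpha = 0.5  # interpolation weight between the two checkpoints
debiased_state = debiased_model.state_dict()
merged_state = {
    name: (1 - alpha) * param + alpha * debiased_state[name]
    for name, param in base_model.state_dict().items()
}

merged_model = nn.Linear(16, 4)
merged_model.load_state_dict(merged_state)
```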

Sources

Bias Testing and Mitigation in Black Box LLMs using Metamorphic Relations

BioPro: On Difference-Aware Gender Fairness for Vision-Language Models

An Empirical Survey of Model Merging Algorithms for Social Bias Mitigation

Fully Unsupervised Self-debiasing of Text-to-Image Diffusion Models

Geschlechts\"ubergreifende Maskulina im Sprachgebrauch Eine korpusbasierte Untersuchung zu lexemspezifischen Unterschieden

Aligned but Stereotypical? The Hidden Influence of System Prompts on Social Bias in LVLM-Based Text-to-Image Models
