Bias and Fairness in Language Models

The field of natural language processing is shifting toward addressing bias and fairness in language models. Researchers are developing more comprehensive benchmarks and evaluation tasks to assess fairness across diverse cultures and identities, including datasets that represent diverse populations and evaluations of how well models mitigate bias and stereotypes. Multi-task learning is also being explored as a way to strengthen bias detection, typically by training a shared model jointly on related tasks such as bias and stereotype detection (see the sketch after the list below). Noteworthy papers in this area include:

  • FairI Tales, which introduces a comprehensive India-centric benchmark to evaluate the fairness of language models across 85 identity groups.
  • NEU-ESC, which presents a new Vietnamese dataset for educational sentiment analysis and topic classification.
  • Stereotype Detection as a Catalyst for Enhanced Bias Detection, which explores the use of joint training on bias and stereotype detection to improve model performance.
  • McBE, which presents a multi-task Chinese bias evaluation benchmark for large language models.
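
As a rough illustration of the multi-task idea mentioned above, the sketch below pairs a shared encoder with two classification heads, one for bias detection and one for stereotype detection, and sums the two losses during joint training. The encoder name, label sets, loss weighting, and example inputs are illustrative assumptions, not the configuration used in the cited papers.

```python
# Minimal sketch of joint bias + stereotype detection via multi-task learning.
# Model name, label counts, and example data are assumptions for illustration.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class MultiTaskBiasModel(nn.Module):
    def __init__(self, encoder_name="bert-base-multilingual-cased",
                 num_bias_labels=2, num_stereotype_labels=2):
        super().__init__()
        # Shared encoder: both tasks read the same contextual representations.
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # Task-specific classification heads.
        self.bias_head = nn.Linear(hidden, num_bias_labels)
        self.stereotype_head = nn.Linear(hidden, num_stereotype_labels)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]  # [CLS] pooling
        return self.bias_head(cls), self.stereotype_head(cls)

# Joint training step: summing the per-task losses lets gradients from
# stereotype detection also shape the shared encoder used for bias detection.
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = MultiTaskBiasModel()
batch = tokenizer(["Women are bad drivers."], return_tensors="pt",
                  padding=True, truncation=True)
bias_logits, stereo_logits = model(batch["input_ids"], batch["attention_mask"])
bias_labels = torch.tensor([1])    # 1 = biased (illustrative label)
stereo_labels = torch.tensor([1])  # 1 = stereotype (illustrative label)
loss = (nn.functional.cross_entropy(bias_logits, bias_labels)
        + nn.functional.cross_entropy(stereo_logits, stereo_labels))
loss.backward()
```

Because the encoder is shared, supervision from the auxiliary stereotype task can act as a regularizing signal for bias detection, which is the kind of transfer the joint-training work listed above investigates.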

Sources

FairI Tales: Evaluation of Fairness in Indian Contexts with a Focus on Bias and Stereotypes

NEU-ESC: A Comprehensive Vietnamese dataset for Educational Sentiment analysis and topic Classification toward multitask learning

Stereotype Detection as a Catalyst for Enhanced Bias Detection: A Multi-Task Learning Approach

McBE: A Multi-task Chinese Bias Evaluation Benchmark for Large Language Models
