Advances in Bias Detection and Mitigation in NLP

The field of Natural Language Processing (NLP) is placing growing emphasis on detecting and mitigating biases in language models. Recent studies highlight the importance of evaluating language models for demographic-targeted social biases and of developing scalable bias-detection methods. Large language models have also been shown to perpetuate harmful stereotypes and biases, particularly in low-resource languages and culturally diverse contexts. To address these issues, researchers are proposing new evaluation frameworks, datasets, and methods for bias detection and mitigation, such as fine-tuning models toward desired distributions and using contrastive learning to capture fine-grained bias (both ideas are sketched below). Notable papers in this area include KurdSTS, which presents a Kurdish semantic textual similarity dataset, and IndiCASA, which introduces a dataset and bias evaluation framework for large language models in the Indian context. Also noteworthy are Evaluating LLMs for Demographic-Targeted Social Bias Detection, a comprehensive framework for assessing how well large language models detect demographic-targeted social biases, and LLM Bias Detection and Mitigation through the Lens of Desired Distributions, which proposes a weighted adaptive loss-based fine-tuning method for aligning language models with desired distributions.
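To make the contrastive-similarity idea concrete, here is a minimal sketch of probing an encoder's demographic sensitivity with counterfactual sentence pairs. It is not the IndiCASA implementation; the encoder name, the example pairs, and the scoring rule are illustrative assumptions.

```python
# Minimal sketch: probe an encoder with counterfactual sentence pairs and
# cosine similarity. Not the IndiCASA implementation; the encoder, pairs,
# and scoring rule are assumptions for illustration only.
from sentence_transformers import SentenceTransformer, util

# Each pair differs only in the demographic attribute (here, gender).
pairs = [
    ("The surgeon said she would operate tomorrow.",
     "The surgeon said he would operate tomorrow."),
    ("The speech therapist greeted her patients.",
     "The speech therapist greeted his patients."),
]

model = SentenceTransformer("all-MiniLM-L6-v2")  # any sentence encoder works here

for a, b in pairs:
    emb_a, emb_b = model.encode([a, b], convert_to_tensor=True)
    sim = util.cos_sim(emb_a, emb_b).item()
    # An attribute-neutral encoder should map counterfactual pairs to nearly
    # identical embeddings; a large similarity drop is a candidate bias signal.
    print(f"cos_sim = {sim:.3f} | {a} <-> {b}")
```

The desired-distribution idea can likewise be sketched as an auxiliary loss during fine-tuning. The snippet below nudges a masked language model's predictions over a small set of demographic candidate tokens toward a target distribution with a KL penalty; the model, template, candidate set, and weighting coefficient are all hypothetical, and this is not the paper's exact weighted adaptive loss.

```python
# Minimal sketch: align a masked LM's predictions over demographic candidates
# with a desired distribution via an auxiliary KL term. Illustrative only.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForMaskedLM

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

template = "The surgeon said that [MASK] would operate tomorrow."
group_tokens = ["he", "she"]            # demographic candidates (assumed)
desired = torch.tensor([0.5, 0.5])      # desired distribution, e.g. uniform
bias_weight = 0.1                       # hypothetical weighting coefficient

enc = tok(template, return_tensors="pt")
mask_pos = (enc["input_ids"] == tok.mask_token_id).nonzero(as_tuple=True)[1]
group_ids = tok.convert_tokens_to_ids(group_tokens)

logits = model(**enc).logits[0, mask_pos]              # (1, vocab)
group_logp = torch.log_softmax(logits[:, group_ids], dim=-1)

# KL(desired || predicted) over the candidate tokens only; during fine-tuning
# this would be added to the usual task loss so the head drifts toward the
# target mix without retraining the model from scratch.
bias_loss = bias_weight * F.kl_div(group_logp, desired.unsqueeze(0),
                                   reduction="batchmean")
bias_loss.backward()  # in practice, combined with the task loss in a training loop
print(float(bias_loss))
```

In a real fine-tuning run this term would be weighted against the standard language-modeling loss, so the model retains fluency while its predicted demographic distribution moves toward the desired one.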
Sources
IndiCASA: A Dataset and Bias Evaluation Framework in LLMs Using Contrastive Embedding Similarity in the Indian Context
What is a protest anyway? Codebook conceptualization is still a first-order concern in LLM-era classification
Surgeons Are Indian Males and Speech Therapists Are White Females: Auditing Biases in Vision-Language Models for Healthcare Professionals