Mitigating Biases in Large Language Models

The field of natural language processing is moving toward addressing the biases present in large language models (LLMs). Recent research targets prejudice tied to gender and sexual identity, as well as biases based on race, age, and political views. Parameter-efficient fine-tuning techniques and community-informed corpora are emerging as key tools for mitigating these biases. At the same time, evaluations of LLMs in high-stakes applications show that more thorough assessments are needed to prevent identity-dependent disparities in medical advice, recommended wages, and the factual claims models make about politics. Notable papers in this area include PRIDE, which demonstrates the effectiveness of Low-Rank Adaptation for reducing identity-based discrimination in LLMs; The Levers of Political Persuasion with Conversational AI, which shows that post-training and prompting methods can increase the persuasiveness of LLMs while decreasing their factual accuracy; Language Models Change Facts Based on the Way You Talk, which highlights the sensitivity of LLM outputs to markers of identity in user queries; Unequal Voices: How LLMs Construct Constrained Queer Narratives, which reveals the narrow portrayals of queer personas in LLM generations; and The Prompt Makes the Person(a), which provides guidance for designing sociodemographic persona prompts in LLM-based simulation studies.
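
To make the parameter-efficient fine-tuning idea concrete, the sketch below shows a generic LoRA (Low-Rank Adaptation) debiasing pass using the Hugging Face peft and transformers libraries. It is an illustrative sketch only, not the PRIDE authors' setup: the base model ("gpt2"), the target attention module ("c_attn"), the hyperparameters, and the community-informed corpus file ("debias_corpus.txt") are all placeholder assumptions.

```python
# Illustrative sketch: LoRA-style parameter-efficient fine-tuning for debiasing.
# Base model, target modules, hyperparameters, and dataset are placeholders,
# not the configuration reported in the PRIDE paper.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

base_model = "gpt2"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_model)

# Low-Rank Adaptation: train small rank-decomposition matrices injected into
# attention projections instead of updating all model weights.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["c_attn"],  # attention projection in GPT-2; model-specific
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights train

# Hypothetical community-informed, counter-stereotypical text corpus.
dataset = load_dataset("text", data_files={"train": "debias_corpus.txt"})["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=256),
    batched=True,
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="lora-debias",
                           num_train_epochs=1,
                           per_device_train_batch_size=8),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("lora-debias-adapter")  # saves adapter weights only
```

The appeal of this family of methods is that only the small adapter matrices are trained and stored, so a debiasing intervention can be applied, audited, or removed without modifying the full pretrained model.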

Sources

PRIDE -- Parameter-Efficient Reduction of Identity Discrimination for Equality in LLMs

The Levers of Political Persuasion with Conversational AI

Language Models Change Facts Based on the Way You Talk

Unequal Voices: How LLMs Construct Constrained Queer Narratives

The Prompt Makes the Person(a): A Systematic Evaluation of Sociodemographic Persona Prompting for Large Language Models
