Uncovering Biases in Language Models

The field of language models is moving towards a deeper understanding of the biases present in these models and their effects on output. Recent studies have examined whether models that produce biased answers also reason in biased ways, probing the intermediate steps elicited by chain-of-thought prompting, and have used structural causal models to estimate the causal effect of social bias on faithfulness hallucinations. These techniques have shed light on the complex interactions between bias and model performance. Research has also explored large language models as annotators in financial communication, highlighting the need for bias-aware annotation methodologies. Noteworthy papers include:

  • A study on the effect of chain-of-thought prompting on fairness, which found that models producing biased outputs do not always produce biased intermediate reasoning (a minimal probing sketch follows this list).
  • An investigation into the causal effect of social bias on faithfulness hallucinations, which found that social biases are a significant cause of such hallucinations.
  • A paper on argument quality annotation and gender bias detection in financial communication, which demonstrated that large language models can annotate argument quality and detect gender bias.
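
The chain-of-thought study above asks whether a model's intermediate reasoning carries the same bias as its final answer. The sketch below illustrates that general style of probing, not the paper's actual protocol: `generate` is a hypothetical stand-in for any text-generation call, and the prompt wording and counterfactual name swap are purely illustrative assumptions.

```python
# Illustrative sketch: compare a model's chain-of-thought and final answer
# across a counterfactual demographic swap. `generate` is a hypothetical
# stand-in for any text-generation call; swap in your own model client.
from typing import Callable

COT_TEMPLATE = (
    "Question: {question}\n"
    "Let's think step by step, then give a final answer on the last line."
)

def probe_bias(generate: Callable[[str], str], question: str, swap: tuple) -> dict:
    """Run the same question twice, swapping one demographic term,
    and return both reasoning traces plus whether the final answers match."""
    original = generate(COT_TEMPLATE.format(question=question))
    counterfactual = generate(
        COT_TEMPLATE.format(question=question.replace(swap[0], swap[1]))
    )
    return {
        "original_thoughts": original,
        "counterfactual_thoughts": counterfactual,
        # Divergence here despite an otherwise identical question is one
        # signal that bias may be entering through the reasoning chain.
        "answers_match": original.splitlines()[-1] == counterfactual.splitlines()[-1],
    }

if __name__ == "__main__":
    # Dummy model for demonstration: returns a fixed reasoning trace and answer.
    dummy = lambda prompt: "Step 1: consider the candidate's experience.\nAnswer: hire"
    print(probe_bias(dummy, "Should we hire John for the role?", ("John", "Maria")))
```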

Sources

Do Biased Models Have Biased Thoughts?

Exploring Causal Effect of Social Bias on Faithfulness Hallucinations in Large Language Models

Argument Quality Annotation and Gender Bias Detection in Financial Communication through Large Language Models

Evaluating Contrast Localizer for Identifying Causal Units in Social & Mathematical Tasks in Language Models
