The field of natural language processing is witnessing a significant shift toward leveraging large language models (LLMs) for qualitative coding and annotation tasks. Researchers are exploring the potential of LLMs to perform deductive classification, assist human annotators, and judge the quality of generated code and summaries. Early results indicate that targeted interventions can improve the reliability of LLMs on these tasks, and that LLM-assigned labels can reach substantial agreement with human-coded schemes. At the same time, the use of LLMs raises important questions about the impact of LLM-assisted annotation on subjective tasks and on the creation of gold data for training and testing.

Noteworthy papers in this area include: Assessing the Reliability of Large Language Models for Deductive Qualitative Coding, which demonstrates that LLMs can achieve reliability levels suitable for integration into rigorous qualitative coding workflows; Just Put a Human in the Loop? Investigating LLM-Assisted Annotation for Subjective Tasks, which investigates how LLM-assisted annotation affects subjective tasks; On the Effectiveness of LLM-as-a-judge for Code Generation and Summarization, which evaluates how well LLMs judge the quality of generated code and code summaries; and Can External Validation Tools Improve Annotation Quality for LLM-as-a-Judge?, which proposes a tool-using agentic system to provide higher-quality feedback on challenging response domains.
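
The agreement claims above are typically quantified with chance-corrected statistics such as Cohen's kappa. The following minimal sketch, using a hypothetical coding scheme and made-up labels (not taken from any of the cited papers), illustrates how an LLM's deductive codes might be compared against a human-coded reference:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two coders over the same items."""
    assert labels_a and len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items both coders labeled identically.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement by chance, from each coder's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum((freq_a[c] / n) * (freq_b[c] / n) for c in freq_a.keys() | freq_b.keys())
    return (p_o - p_e) / (1 - p_e) if p_e < 1 else 1.0

# Hypothetical example: codes assigned by human coders vs. an LLM.
human = ["barrier", "facilitator", "barrier", "neutral", "barrier", "facilitator"]
llm   = ["barrier", "facilitator", "neutral", "neutral", "barrier", "facilitator"]

print(f"Cohen's kappa: {cohens_kappa(human, llm):.2f}")
# Values around 0.61-0.80 are conventionally read as "substantial" agreement.
```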