The field of large language models (LLMs) is placing increasing emphasis on explainability and verification. Recent research develops methods for analyzing how LLMs make decisions, with particular attention to the interaction between knowledge supplied in the external context and parametric knowledge stored in the model weights. This has produced frameworks for systematically studying multi-step knowledge interactions in LLMs. There is also growing interest in explanation-enhanced fine-tuning as a way to improve the accuracy and reliability of LLM classification, and in combining chain-of-thought reasoning with majority voting to verify the validity of individual reasoning steps in text-based decision making.

Noteworthy papers in this area include:

- Regularization Through Reasoning: Systematic Improvements in Language Model Classification via Explanation-Enhanced Fine-Tuning, which demonstrates that explanation-augmented fine-tuning improves classification accuracy and reliability.
- VeriCoT: Neuro-symbolic Chain-of-Thought Validation via Logical Consistency Checks, which introduces a neuro-symbolic method for extracting formal logical arguments from chain-of-thought reasoning and verifying their logical consistency.
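
As a rough illustration of the explanation-enhanced fine-tuning idea mentioned above, the sketch below assembles a chat-style training record whose target output contains a short explanation followed by the final label, so the model is trained to justify its classification rather than emit the label alone. The field names, prompt wording, and JSONL layout are assumptions chosen for illustration, not the data format used in the paper listed above.

```python
import json

def build_explained_example(text: str, label: str, explanation: str) -> dict:
    """Assemble one chat-style fine-tuning record whose assistant turn contains
    a brief explanation before the final label (explanation-augmented target)."""
    return {
        "messages": [
            {
                "role": "user",
                "content": f"Classify the following text and explain your reasoning.\n\n{text}",
            },
            {
                "role": "assistant",
                "content": f"Explanation: {explanation}\nLabel: {label}",
            },
        ]
    }

# Example usage: write records to a JSONL file, a common fine-tuning input format.
records = [
    build_explained_example(
        text="The battery died after two days of light use.",
        label="negative",
        explanation="The review reports a product failure and expresses dissatisfaction.",
    )
]
with open("explained_train.jsonl", "w") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")
```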
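
For the chain-of-thought plus majority-voting approach to step verification, a minimal self-consistency-style sketch follows: several independent chain-of-thought verdicts are sampled for a single reasoning step, and the step is accepted only if the majority label is "valid". The `sample_verdict` callable and its prompt are hypothetical placeholders for whatever LLM sampling call a given system uses; this is not a specific paper's implementation.

```python
from collections import Counter
from typing import Callable

def majority_vote_step_validity(
    step: str,
    context: str,
    sample_verdict: Callable[[str], str],  # hypothetical LLM call returning free-form text
    n_samples: int = 5,
) -> bool:
    """Judge one reasoning step by sampling several chain-of-thought verdicts
    and taking the majority label (self-consistency-style voting)."""
    prompt = (
        f"Context:\n{context}\n\n"
        f"Reasoning step:\n{step}\n\n"
        "Think step by step about whether this step follows from the context, "
        "then answer with exactly one word: valid or invalid."
    )
    verdicts = [sample_verdict(prompt).strip().lower() for _ in range(n_samples)]
    counts = Counter(v for v in verdicts if v in {"valid", "invalid"})
    if not counts:
        return False  # no parseable verdicts: treat the step as unverified
    label, _ = counts.most_common(1)[0]
    return label == "valid"
```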
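
The logical-consistency checking behind neuro-symbolic validation can be illustrated, under heavy simplification, by testing whether formalized premises entail a step's conclusion with an off-the-shelf solver. The sketch below uses the Z3 SMT solver (the `z3-solver` package) on propositional toy formulas; it is a conceptual illustration of entailment checking, not VeriCoT's actual argument-extraction or verification pipeline.

```python
from z3 import Bools, Implies, Not, Solver, unsat

def premises_entail(premises, conclusion) -> bool:
    """Check whether the conjunction of the premises logically entails the
    conclusion: entailment holds iff premises plus the negated conclusion
    are unsatisfiable."""
    solver = Solver()
    for premise in premises:
        solver.add(premise)
    solver.add(Not(conclusion))
    return solver.check() == unsat

# Toy example: "it rained" and "rain implies wet ground" entail "wet ground".
rained, wet_ground = Bools("rained wet_ground")
print(premises_entail([rained, Implies(rained, wet_ground)], wet_ground))  # True
```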