Advances in Large Language Model Reasoning and Trustworthiness

The field of large language models (LLMs) is rapidly advancing, with a focus on improving reasoning capabilities and trustworthiness. Recent developments have explored the use of reinforcement learning, self-supervised learning, and multi-step reasoning to enhance the accuracy and reliability of LLMs. Notably, researchers have proposed novel methods to mitigate hallucinations, improve confidence estimation, and align LLMs with human reasoning. These advancements have significant implications for real-world applications, such as education and decision-making.

Some noteworthy papers in this area include:

Honesty over Accuracy: Trustworthy Language Models through Reinforced Hesitation modifies Reinforcement Learning from Verifiable Rewards (RLVR) to use ternary rewards, treating abstention as a distinct outcome between correct and incorrect answers, and introduces two inference strategies that exploit the trained abstention behaviour as a coordination signal (a hedged sketch of such a reward follows below).

Reason-KE++: Aligning the Process, Not Just the Outcome, for Faithful LLM Knowledge Editing proposes an SFT+RL framework that instills process-level faithfulness by providing dense supervision for intermediate reasoning steps rather than rewarding only the final answer (see the second sketch below).

Spark-Prover-X1: Formal Theorem Proving Through Diverse Data Training introduces a 7B-parameter prover trained via a three-stage framework on diverse formal-proof data, designed to unlock the reasoning potential of more accessible, moderately sized LLMs.
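
To make the ternary-reward idea concrete, here is a minimal sketch of a verifiable-reward function with an abstention option. The sentinel ABSTAIN token, the function name ternary_reward, and the specific reward values are illustrative assumptions, not the paper's published design.

```python
# Sketch of a ternary reward for RL from Verifiable Rewards (RLVR) with a
# trained abstention option, in the spirit of "Reinforced Hesitation".
# ABSTAIN and the reward constants are assumptions for illustration only.

ABSTAIN = "<abstain>"  # hypothetical sentinel the model emits to hesitate

def ternary_reward(prediction: str, gold: str,
                   wrong_penalty: float = 1.0) -> float:
    """+1 for a verified-correct answer, 0 for abstention,
    and a negative reward for a confident wrong answer."""
    if prediction.strip() == ABSTAIN:
        return 0.0                 # hesitation is neither rewarded nor punished
    if prediction.strip() == gold.strip():
        return 1.0                 # verifiably correct answer
    return -wrong_penalty          # confident error is penalized
```

Under this scheme, guessing has positive expected reward only when the model's probability of being correct exceeds wrong_penalty / (1 + wrong_penalty); below that threshold, abstaining is the rational policy, which is what makes trained hesitation a usable signal.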
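
Similarly, process-level supervision can be sketched as a reward that scores each intermediate reasoning step and blends it with the outcome reward. The step_verifier interface and the weighting below are assumptions for illustration; Reason-KE++'s actual reward design may differ.

```python
# Sketch of dense, process-level supervision: each intermediate reasoning
# step earns its own score, combined with the final outcome reward.
# step_verifier and step_weight are illustrative assumptions.

from typing import Callable, Sequence

def process_reward(steps: Sequence[str],
                   step_verifier: Callable[[str], bool],
                   outcome_correct: bool,
                   step_weight: float = 0.5) -> float:
    """Blend a per-step faithfulness score with the outcome reward."""
    step_score = (sum(step_verifier(s) for s in steps) / len(steps)
                  if steps else 0.0)
    outcome_score = 1.0 if outcome_correct else 0.0
    return step_weight * step_score + (1 - step_weight) * outcome_score
```

Compared with outcome-only rewards, scoring every step shrinks the credit-assignment gap: a chain of reasoning that reaches the right answer through unfaithful steps no longer receives full reward.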

Sources

Where does an LLM begin computing an instruction?

Honesty over Accuracy: Trustworthy Language Models through Reinforced Hesitation

Better LLM Reasoning via Dual-Play

Incremental Maintenance of DatalogMTL Materialisations

Reason-KE++: Aligning the Process, Not Just the Outcome, for Faithful LLM Knowledge Editing

Scaling Generative Verifiers For Natural Language Mathematical Proof Verification And Selection

Spark-Prover-X1: Formal Theorem Proving Through Diverse Data Training

ALEX: A Light Editing-knowledge Extractor

Don't Miss the Forest for the Trees: In-Depth Confidence Estimation for LLMs via Reasoning over the Answer Space

SMRC: Aligning Large Language Models with Student Reasoning for Mathematical Error Correction

Temporal Predictors of Outcome in Reasoning Language Models

From Solving to Verifying: A Unified Objective for Robust Reasoning in LLMs

Thinking, Faithful and Stable: Mitigating Hallucinations in LLMs

Incorporating Self-Rewriting into Large Language Model Reasoning Reinforcement