Efficient Reasoning in Large Language Models

The field of Large Language Models (LLMs) is moving toward more efficient and controllable reasoning. Researchers are exploring methods that mitigate overthinking, reduce token consumption, and preserve or improve accuracy. One notable direction is budget-aware reasoning, which enables precise control over the length of the model's thought process. Another is adaptive allocation of reasoning resources, so that models produce concise answers for simple questions while retaining sufficient reasoning depth for harder ones. Noteworthy papers include BudgetThinker, which introduces a framework that empowers LLMs with budget-aware reasoning via control tokens; DRQA, which proposes dynamic reasoning quota allocation to control overthinking in reasoning LLMs; and ThinkDial, which presents an open-recipe, end-to-end framework for controllable reasoning through discrete operational modes.
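
To make the budget-aware idea concrete, here is a minimal, self-contained sketch of a decoding loop that enforces a reasoning-token budget and periodically injects a remaining-budget control token. The control-token format (`<remaining:N>`), the `generate_step` callback, and the forced `</think>` cutoff are illustrative assumptions for this example, not the actual BudgetThinker or ThinkDial implementation.

```python
"""Illustrative sketch of budget-aware reasoning control.

This approximates the idea described above: periodically remind the model
how much of its thinking budget remains, and force a transition out of the
thinking phase once the budget is exhausted. Token names and the step-wise
generation callback are assumptions made for this example.
"""

from typing import Callable, List


def budgeted_reasoning(
    generate_step: Callable[[List[str]], str],
    budget: int,
    remind_every: int = 32,
) -> List[str]:
    """Generate reasoning tokens under a hard budget.

    generate_step: returns the next token given the tokens so far
                   (stands in for one LLM decoding step).
    budget:        maximum number of reasoning tokens allowed.
    remind_every:  how often to insert a remaining-budget control token.
    """
    tokens: List[str] = ["<think>"]
    for step in range(budget):
        # Periodically insert a control token so the model can pace itself.
        if step > 0 and step % remind_every == 0:
            tokens.append(f"<remaining:{budget - step}>")
        nxt = generate_step(tokens)
        tokens.append(nxt)
        if nxt == "</think>":  # model chose to stop reasoning early
            return tokens
    # Budget exhausted: force the model out of the thinking phase.
    tokens.append("</think>")
    return tokens


if __name__ == "__main__":
    # Toy stand-in for an LLM: emits filler tokens and never stops on its own.
    toy_model = lambda ctx: "step"
    out = budgeted_reasoning(toy_model, budget=64, remind_every=16)
    print(len(out), "tokens;",
          "control tokens:", [t for t in out if t.startswith("<remaining")])
```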

Sources

BudgetThinker: Empowering Budget-aware LLM Reasoning with Control Tokens

DRQA: Dynamic Reasoning Quota Allocation for Controlling Overthinking in Reasoning Large Language Models

Stop Spinning Wheels: Mitigating LLM Overthinking via Mining Patterns for Early Reasoning Exit

CAC-CoT: Connector-Aware Compact Chain-of-Thought for Efficient Reasoning Data Synthesis Across Dual-System Cognitive Tasks

ThinkDial: An Open Recipe for Controlling Reasoning Effort in Large Language Models

SLIM: Subtrajectory-Level Elimination for More Effective Reasoning
