Efficient Reasoning in Large Models
Research on large language models is increasingly focused on making reasoning more efficient. Recent work equips models to decide adaptively when to engage in explicit, long-form reasoning and when to answer succinctly, using techniques such as multi-stage reinforcement learning, adaptive thinking-mode switching, and internal self-recovery mechanisms. These advances aim to reduce the computational overhead of large reasoning models while improving their accuracy. Noteworthy papers include Learning When to Think: Shaping Adaptive Reasoning in R1-Style Models via Multi-Stage RL, which proposes a framework for equipping large reasoning models with adaptive thinking capabilities, and ThinkSwitcher: When to Think Hard, When to Think Fast, which dynamically switches between short and long chain-of-thought modes based on task complexity. Together, these approaches point toward reasoning models that spend inference-time compute only where it is needed; a minimal sketch of the switching idea appears below.
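To make the mode-switching idea concrete, the following is a minimal, self-contained sketch of an adaptive thinking-mode gate in the spirit of ThinkSwitcher: a lightweight scorer estimates query complexity and routes the query to either a direct-answer prompt or a long chain-of-thought prompt. Every name here (score_complexity, SHORT_TEMPLATE, THINK_TEMPLATE, the 0.25 threshold) is an illustrative assumption, not the paper's actual interface; a real switcher would likely use a small learned classifier or probe over model representations rather than hand-written cues.

```python
# Toy adaptive thinking-mode switcher (illustrative only, not the
# published ThinkSwitcher implementation): score a query's complexity
# and pick a short or long chain-of-thought prompting mode.

from dataclasses import dataclass

# Hypothetical prompt templates for the two modes.
SHORT_TEMPLATE = "Answer concisely: {query}"
THINK_TEMPLATE = "Think step by step, then answer: {query}"


@dataclass
class SwitchDecision:
    use_long_cot: bool   # whether the long chain-of-thought mode was chosen
    score: float         # estimated complexity in [0, 1]
    prompt: str          # the prompt actually sent to the model


def score_complexity(query: str) -> float:
    """Toy complexity score in [0, 1]; a real switcher would replace this
    heuristic with a learned gate (e.g. a probe over hidden states)."""
    cues = ("prove", "derive", "why", "how many", "step", "integral")
    length_signal = min(len(query.split()) / 60.0, 1.0)
    cue_signal = sum(cue in query.lower() for cue in cues) / len(cues)
    return 0.5 * length_signal + 0.5 * cue_signal


def switch(query: str, threshold: float = 0.25) -> SwitchDecision:
    """Route to the long chain-of-thought mode only when the estimated
    complexity exceeds the threshold; otherwise answer directly."""
    s = score_complexity(query)
    use_long = s >= threshold
    template = THINK_TEMPLATE if use_long else SHORT_TEMPLATE
    return SwitchDecision(use_long_cot=use_long, score=s,
                          prompt=template.format(query=query))


if __name__ == "__main__":
    for q in ("What is the capital of France?",
              "Prove that the sum of two odd integers is even, step by step."):
        d = switch(q)
        mode = "long CoT" if d.use_long_cot else "direct"
        print(f"[{mode}, score={d.score:.2f}] {d.prompt}")
```

The threshold is the key design knob: raising it saves tokens by answering more queries directly, at the risk of under-thinking hard ones, which is the accuracy-versus-overhead trade-off the papers above study.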
Sources
Two Experts Are All You Need for Steering Thinking: Reinforcing Cognitive Effort in MoE Reasoning Models Without Additional Training
When Can Large Reasoning Models Save Thinking? Mechanistic Analysis of Behavioral Divergence in Reasoning