Efficient Reasoning in Large Language Models

Research on large language models is increasingly focused on making reasoning both efficient and effective. A recurring target is overthinking: models generating excessively long, redundant reasoning chains. Proposed remedies include dynamic chain-of-thought compression, conditional token selection, and manifold steering, all aimed at cutting computational overhead while preserving or improving accuracy. Several papers introduce frameworks that let models adaptively control their reasoning depth and produce more concise reasoning paths, such as Auto Long-Short Reasoning and State Machine Reasoning. Noteworthy contributions include Amplify Adjacent Token Differences, which mitigates Cyclical Reasoning; TrimR, a verifier-based framework for dynamic CoT compression; and ConciseRL, whose conciseness-guided reinforcement learning steers models toward correct yet concise reasoning traces.
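The conciseness-guided reinforcement learning idea can be illustrated with a toy reward function: reward correctness first, then add a brevity bonus so shorter correct traces score higher. This is a minimal sketch under assumed parameters; the function name, scaling, and penalty shape below are illustrative, not the actual formulation from any of the papers listed.

```python
def conciseness_reward(is_correct: bool, num_tokens: int,
                       max_tokens: int = 1024, alpha: float = 0.5) -> float:
    """Toy conciseness-guided reward (illustrative, not from the papers).

    Incorrect answers get zero reward, so the model is never pushed to be
    brief at the cost of accuracy. Correct answers get a base reward of 1.0
    plus a brevity bonus that grows as the reasoning trace shrinks.
    """
    if not is_correct:
        return 0.0
    # Brevity term: 1.0 for an empty trace, 0.0 once the trace hits the cap.
    brevity = max(0.0, 1.0 - num_tokens / max_tokens)
    return 1.0 + alpha * brevity
```

Under a reward like this, a correct 200-token trace scores higher than a correct 800-token one, while any incorrect trace scores zero regardless of length, which is the ordering such methods rely on during policy optimization.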

Sources

Amplify Adjacent Token Differences: Enhancing Long Chain-of-Thought Reasoning with Shift-FFN

TrimR: Verifier-based Training-Free Thinking Compression for Efficient Test-Time Scaling

ConciseRL: Conciseness-Guided Reinforcement Learning for Efficient Reasoning Models

Select2Reason: Efficient Instruction-Tuning Data Selection for Long-CoT Reasoning

Reasoning Meets Personalization: Unleashing the Potential of Large Reasoning Model for Personalized Generation

Activation Control for Efficiently Eliciting Long Chain-of-thought Ability of Language Models

Fast Quiet-STaR: Thinking Without Thought Tokens

Don't Overthink it. Preferring Shorter Thinking Chains for Improved LLM Reasoning

Not All Tokens Are What You Need In Thinking

VeriThinker: Learning to Verify Makes Reasoning Model Efficient

Extended Inductive Reasoning for Personalized Preference Inference from Behavioral Signals

Don't Think Longer, Think Wisely: Optimizing Thinking Dynamics for Large Reasoning Models

CoThink: Token-Efficient Reasoning via Instruct Models Guiding Reasoning Models

THINK-Bench: Evaluating Thinking Efficiency and Chain-of-Thought Quality of Large Reasoning Models

Chain-of-Thought for Large Language Model-empowered Wireless Communications

Mitigating Overthinking in Large Reasoning Models via Manifold Steering

Scaling Reasoning without Attention

Emotion-o1: Adaptive Long Reasoning for Emotion Understanding in LLMs

AutoL2S: Auto Long-Short Reasoning for Efficient Large Language Models

Scalable Complexity Control Facilitates Reasoning Ability of LLMs

From Token to Action: State Machine Reasoning to Mitigate Overthinking in Information Retrieval

Revisiting Overthinking in Long Chain-of-Thought from the Perspective of Self-Doubt
