Advancements in Efficient and Adaptive Reasoning for Large Language Models

The field of large language models (LLMs) is moving toward more efficient and adaptive reasoning. Recent work aims to improve the accuracy and interpretability of LLMs while reducing their computational cost and latency. One prominent direction has LLMs reason in a more human-like way by generating explicit step-by-step rationales for their decisions. Explicit rationales are computationally expensive, however, and are often unnecessary for simpler tasks.

To address this, researchers have proposed adaptive reasoning techniques that let LLMs adjust their reasoning depth and complexity to the task at hand, including difficulty-adaptive reasoning, latent reasoning, and compressed knowledge distillation. Noteworthy papers in this area include Dual-Head Reasoning Distillation, which improves classifier accuracy using reasoning only at training time, and MARCOS, which models reasoning as a hidden Markov chain of continuous thoughts.

Overall, the field is converging on reasoning methods that are efficient, adaptive, and interpretable, with potential applications across natural language processing, computer vision, and decision-making.
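The core idea behind difficulty-adaptive reasoning can be sketched as a simple router: spend tokens on an explicit rationale only when an estimate of task difficulty warrants it. The sketch below is illustrative only; `estimate_difficulty`, `answer_directly`, and `answer_with_rationale` are hypothetical stand-ins, not the API of any paper listed here.

```python
def estimate_difficulty(question: str) -> float:
    """Toy difficulty proxy: treat longer, multi-clause questions as harder."""
    clauses = question.count(",") + question.count(" and ") + 1
    return min(1.0, 0.2 * clauses + 0.01 * len(question.split()))

def answer_directly(question: str) -> str:
    """Cheap path: answer without an explicit rationale."""
    return f"[direct] answer to: {question}"

def answer_with_rationale(question: str) -> str:
    """Expensive path: generate a step-by-step rationale before answering."""
    steps = ["restate the problem", "decompose into sub-goals", "solve and verify"]
    return f"[rationale: {' -> '.join(steps)}] answer to: {question}"

def adaptive_answer(question: str, threshold: float = 0.5) -> str:
    """Route to the cheap or expensive path based on estimated difficulty."""
    if estimate_difficulty(question) < threshold:
        return answer_directly(question)
    return answer_with_rationale(question)
```

In practice the difficulty estimate would come from the model itself (for example, an uncertainty score, as in AdaThink-Med), but the routing structure is the same.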

Sources

Dual-Head Reasoning Distillation: Improving Classifier Accuracy with Train-Time-Only Reasoning

MMPlanner: Zero-Shot Multimodal Procedural Planning with Chain-of-Thought Object State Reasoning

Retrieval-of-Thought: Efficient Reasoning via Reusing Thoughts

Why Chain of Thought Fails in Clinical Text Understanding

Think Smart, Not Hard: Difficulty Adaptive Reasoning for Large Audio Language Models

R-Capsule: Compressing High-Level Plans for Efficient Large Language Model Reasoning

From Long to Lean: Performance-aware and Adaptive Chain-of-Thought Compression via Multi-round Refinement

UML-CoT: Structured Reasoning and Planning with Unified Modeling Language for Robotic Room Cleaning

Explore-Execute Chain: Towards an Efficient Structured Reasoning Paradigm

AdaThink-Med: Medical Adaptive Thinking with Uncertainty-Guided Length Calibration

KnowGuard: Knowledge-Driven Abstention for Multi-Round Clinical Reasoning

Expanding Computation Spaces of LLMs at Inference Time

MARCOS: Deep Thinking by Markov Chain of Continuous Thoughts

A Formal Comparison Between Chain-of-Thought and Latent Thought

ICL Optimized Fragility

CoT Vectors: Transferring and Probing the Reasoning Mechanisms of LLMs

Meaningless Tokens, Meaningful Gains: How Activation Shifts Enhance LLM Reasoning

Typed Chain-of-Thought: A Curry-Howard Framework for Verifying LLM Reasoning

Silent Tokens, Loud Effects: Padding in LLMs

Think Right: Learning to Mitigate Under-Over Thinking via Adaptive, Attentive Compression

FOR-Prompting: From Objection to Revision via an Asymmetric Prompting Protocol

Plan Then Action: High-Level Planning Guidance Reinforcement Learning for LLM Reasoning

KaVa: Latent Reasoning via Compressed KV-Cache Distillation
