Efficient Reasoning in Large Language Models

Research on large language models is increasingly focused on efficient reasoning: cutting computational cost and latency while preserving accuracy. Approaches under active exploration include chain-of-thought pruning, sparse attention, curriculum learning, and variants of policy optimization. Pruning the Unsurprising proposes a coarse-to-fine framework for Chain-of-Thought compression guided by first-token surprisal, while Less Is More introduces a training-free sparse attention mechanism for efficient reasoning. ReasonRank and TFRank bring strong reasoning ability to passage ranking, and Klear-Reasoner advances reasoning capability via gradient-preserving clipping policy optimization. Train Long, Think Short and Sample More to Think Less demonstrate, respectively, the effectiveness of curriculum learning and of group filtered policy optimization for concise reasoning. Together, these results mark steady progress toward reasoning models that are both more efficient and more effective.
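To make the pruning idea concrete, here is a minimal sketch of scoring Chain-of-Thought steps by first-token surprisal and keeping only the most informative ones. This is an illustrative assumption about the coarse mechanism, not the paper's actual pipeline: the step strings, probability values, and `keep_ratio` parameter are all hypothetical, and a real system would obtain each probability from the language model itself.

```python
import math

def prune_steps_by_first_token_surprisal(steps, first_token_probs, keep_ratio=0.5):
    """Keep the CoT steps whose first token is least predictable.

    steps: list of reasoning-step strings
    first_token_probs: model probability of each step's first token given the
        preceding context (hypothetical values here; a real run would query an LM)
    """
    # Surprisal = -log p: a low-probability first token suggests the step adds
    # new information, while a highly predictable step is a pruning candidate.
    scored = [(-math.log(p), i) for i, p in enumerate(first_token_probs)]
    scored.sort(reverse=True)  # most surprising steps first
    n_keep = max(1, round(keep_ratio * len(steps)))
    kept = sorted(i for _, i in scored[:n_keep])  # restore original step order
    return [steps[i] for i in kept]

steps = ["Restate the problem.", "Key insight: ...", "Routine algebra.", "Final answer."]
probs = [0.90, 0.05, 0.80, 0.30]  # predictable steps have high first-token probability
print(prune_steps_by_first_token_surprisal(steps, probs, keep_ratio=0.5))
# → ['Key insight: ...', 'Final answer.']
```

The sketch keeps the two least predictable steps and drops the boilerplate restatement and routine algebra, which is the intuition behind compressing long reasoning traces without discarding their load-bearing content.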

Sources

Pruning the Unsurprising: Efficient Code Reasoning via First-Token Surprisal

Less Is More: Training-Free Sparse Attention with Global Locality for Efficient Reasoning

ReasonRank: Empowering Passage Ranking with Strong Reasoning Ability

Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization

Train Long, Think Short: Curriculum Learning for Efficient Reasoning

TFRank: Think-Free Reasoning Enables Practical Pointwise LLM Ranking

Sample More to Think Less: Group Filtered Policy Optimization for Concise Reasoning

SABER: Switchable and Balanced Training for Efficient LLM Reasoning

Pruning Long Chain-of-Thought of Large Reasoning Models via Small-Scale Preference Optimization
