Advancements in Large Language Models for Complex Reasoning and Information Seeking

The field of large language models (LLMs) is moving toward more advanced and complex reasoning capabilities, with a focus on improving information seeking and multi-turn conversation systems. Researchers are exploring new frameworks and techniques to enhance LLM performance on tasks such as route planning, open-web question answering, and mathematical reasoning. Notable directions include unifying knowledge distillation with reinforcement learning, applying neurosymbolic reasoning to multilingual tasks, and empowering LLMs to autonomously control the search process. Noteworthy papers include Pangu DeepDiver, which introduces adaptive search intensity scaling via open-web reinforcement learning, and LLM-First Search, which lets the model itself guide exploration of the solution space. Additionally, Soft Reasoning and Harnessing Negative Signals demonstrate innovative approaches to complex reasoning, the former through controlled embedding exploration and the latter through reinforcement distillation from teacher data.

Sources

Proactive Guidance of Multi-Turn Conversation in Industrial Search

GridRoute: A Benchmark for LLM-Based Route Planning with Cardinal Movement in Grid Environments

Pangu DeepDiver: Adaptive Search Intensity Scaling via Open-Web Reinforcement Learning

Soft Reasoning: Navigating Solution Spaces in Large Language Models through Controlled Embedding Exploration

Harnessing Negative Signals: Reinforcement Distillation from Teacher Data for LLM Reasoning

KDRL: Post-Training Reasoning LLMs via Unified Knowledge Distillation and Reinforcement Learning

Enhancing Large Language Models with Neurosymbolic Reasoning for Multilingual Tasks

Cell-o1: Training LLMs to Solve Single-Cell Reasoning Puzzles with Reinforcement Learning

R-Search: Empowering LLM Reasoning with Search via Multi-Reward Reinforcement Learning

LLM-First Search: Self-Guided Exploration of the Solution Space
