Advancements in Large Language Models for Mathematical Reasoning and Problem Solving

The field of large language models (LLMs) is advancing rapidly, with a strong focus on improving mathematical reasoning and problem-solving capabilities. Recent developments include large-scale datasets, such as the Open Proof Corpus, that enable the evaluation and improvement of LLMs in mathematical proof generation. In addition, hybrid approaches that combine rule-based systems with LLMs have shown promise in the automatic generation of mathematical conjectures. LLMs are also being applied across domains including network optimization, UAV control, and theorem proving, with notable successes in generating human-like action sequences and solving stochastic modeling problems. However, challenges remain, such as addressing cultural gaps in how mathematical problems are presented and improving the reliability of LLM-driven systems.

Noteworthy papers include LeanConjecturer, a pipeline for the automatic generation of mathematical conjectures, and Bourbaki, a modular theorem-proving system that achieves state-of-the-art results on university-level problems. Frameworks such as RALLY and NL2FLOW further demonstrate the potential of LLMs in complex problem-solving tasks, such as role-adaptive navigation and parametric problem generation.

Sources

The Open Proof Corpus: A Large-Scale Study of LLM-Generated Mathematical Proofs

LeanConjecturer: Automatic Generation of Mathematical Conjectures for Theorem Proving

LMPVC and Policy Bank: Adaptive voice control for industrial robots with code generating LLMs and reusable Pythonic policies

Concept-Level AI for Telecom: Moving Beyond Large Language Models

Bootstrapping Human-Like Planning via LLMs

Performance of LLMs on Stochastic Modeling Operations Research Problems: From Theory to Practice

Large Language Models Don't Make Sense of Word Problems. A Scoping Review from a Mathematics Education Perspective

Mathematics Isn't Culture-Free: Probing Cultural Gaps via Entity and Scenario Perturbations

RALLY: Role-Adaptive LLM-Driven Yoked Navigation for Agentic UAV Swarms

Frontiers of Generative AI for Network Optimization: Theories, Limits, and Visions

Large Language Model-Driven Closed-Loop UAV Operation with Semantic Observations

Scaling LLM Planning: NL2FLOW for Parametric Problem Generation and Rigorous Evaluation

On the Convergence of Large Language Model Optimizer for Black-Box Network Management

Bourbaki: Self-Generated and Goal-Conditioned MDPs for Theorem Proving
