The field of mathematical reasoning for large language models (LLMs) is advancing rapidly, with recent work focused on logical reasoning, numerical reasoning, and multilingual support. Recent developments highlight the importance of adaptive selection of symbolic languages, joint logical-numerical reasoning, and robust test-time ensemble methods. Researchers are also introducing new benchmarks and datasets, such as MATH-Beyond and MathMist, to probe LLMs' mathematical reasoning capabilities and expose the limitations of existing models.
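Test-time ensembling is commonly realized as self-consistency-style majority voting over independently sampled solutions. The sketch below illustrates only that generic idea, not the specific ensemble method of any paper discussed here; `sample_solution` is a hypothetical stand-in for an LLM call that returns one candidate final answer.

```python
from collections import Counter

def majority_vote(answers):
    """Pick the most frequent final answer among sampled candidates.

    A minimal self-consistency-style ensemble: sample several reasoning
    paths, keep only their final answers, and return the mode.
    """
    counts = Counter(a.strip() for a in answers if a is not None)
    if not counts:
        return None
    answer, _ = counts.most_common(1)[0]
    return answer

# Hypothetical usage (sample_solution is an assumed LLM wrapper):
# candidates = [sample_solution(problem, temperature=0.8) for _ in range(8)]
# final_answer = majority_vote(candidates)
```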
Noteworthy papers include: Adaptive Selection of Symbolic Languages for Improving LLM Logical Reasoning, which improves logical reasoning performance by adaptively choosing the symbolic language best suited to each problem; LogiNumSynth: Synthesizing Joint Logical-Numerical Reasoning Problems for Language Models, which introduces a flexible natural-language problem synthesizer that generates tasks requiring joint logical and numerical reasoning; MATH-Beyond, a benchmark built from problems that common open-source models fail to solve, intended to drive methods that reason beyond base model capabilities; Program of Thoughts for Financial Reasoning: Leveraging Dynamic In-Context Examples and Generative Retrieval, which achieves state-of-the-art performance on financial numerical reasoning datasets with a two-step framework combining dynamic in-context example selection and generative retrieval; and MathMist, a parallel multilingual benchmark for mathematical problem solving and reasoning that reveals persistent deficiencies in LLMs' ability to reason consistently and interpretably across languages.
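Program-of-Thoughts prompting, which the financial reasoning paper builds on, has the model emit executable code rather than free-text arithmetic, and the answer is obtained by running that code. The sketch below is a minimal illustration of this general pattern; the prompt template, the `generate_code` callable, and the `answer` variable convention are illustrative assumptions, not the paper's exact two-step framework with dynamic in-context examples and generative retrieval.

```python
import contextlib
import io

# Assumed prompt template: instruct the model to write code that
# leaves its result in a variable named `answer`.
POT_PROMPT = """You are a financial reasoning assistant.
Write Python code that computes the answer to the question and
stores it in a variable named `answer`.

Question: {question}
Context: {context}

# Python code:
"""

def solve_with_program_of_thoughts(question, context, generate_code):
    """Program-of-Thoughts style solving: ask the model for a program,
    execute it, and read the numeric result from `answer`.

    `generate_code` is a hypothetical callable wrapping an LLM that
    returns generated Python source as a string.
    """
    code = generate_code(POT_PROMPT.format(question=question, context=context))
    namespace = {}
    # Run the generated program in an isolated namespace; a real system
    # would sandbox this execution step.
    with contextlib.redirect_stdout(io.StringIO()):
        exec(code, namespace)
    return namespace.get("answer")
```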