Advancements in Tabular Data Processing with Large Language Models

The field of tabular data processing is seeing significant advances through the integration of large language models (LLMs). Recent research focuses on improving LLM performance on tabular data through new frameworks and techniques. One notable direction is process-based preference learning, which lets LLMs improve on table question answering tasks without requiring extensive manually annotated data. Another is the design of architectures that combine the strengths of LLMs with traditional decision tree-based approaches, yielding more accurate and efficient models. There is also growing interest in using reinforcement learning to strengthen the reasoning capabilities of LLMs on tabular data, producing predictions that are both more accurate and more explainable. Together, these advances stand to benefit applications that depend on tabular data processing, such as financial analysis and healthcare.

Noteworthy papers include PPT, which proposes a process-based preference learning framework for self-improving table question answering models; DeLTa, which integrates LLMs into tabular data through logical decision tree rules; and Fortune, which proposes a reinforcement learning framework that trains LLMs to generate executable spreadsheet formulas for question answering over general tabular data.
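To make the formula-driven direction concrete, here is a minimal sketch of the core idea: a model emits an executable spreadsheet-style formula, which is then evaluated against the table to produce the answer. The formula grammar, function names, and table layout below are illustrative assumptions of mine, not the actual interface of any cited paper.

```python
# Illustrative sketch (assumed interface, not from the cited papers):
# evaluate a tiny spreadsheet-style formula against a table given as
# a list of row dicts, in the spirit of formula-driven table QA.

def run_formula(formula, table):
    """Evaluate a formula like 'SUM(revenue)' or 'AVG(units)'."""
    func, _, col = formula.partition("(")
    col = col.rstrip(")")
    values = [row[col] for row in table]
    if func == "SUM":
        return sum(values)
    if func == "AVG":
        return sum(values) / len(values)
    if func == "MAX":
        return max(values)
    raise ValueError(f"unsupported formula: {formula}")

table = [
    {"region": "north", "revenue": 120, "units": 3},
    {"region": "south", "revenue": 80, "units": 5},
]

# For "What is the total revenue?", a model might emit 'SUM(revenue)'.
print(run_formula("SUM(revenue)", table))  # 200
```

Executing the formula rather than asking the model to state the answer directly makes the prediction verifiable, which is what enables reinforcement learning from execution feedback in this line of work.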

Sources

PPT: A Process-based Preference Learning Framework for Self Improving Table Question Answering Models

LLM Meeting Decision Trees on Tabular Data

TabSTAR: A Foundation Tabular Model With Semantically Target-Aware Representations

TabReason: A Reinforcement Learning-Enhanced Reasoning LLM for Explainable Tabular Data Prediction

MRT at SemEval-2025 Task 8: Maximizing Recovery from Tables with Multiple Steps

Learning Interpretable Differentiable Logic Networks for Tabular Regression

Table-R1: Inference-Time Scaling for Table Reasoning

Fortune: Formula-Driven Reinforcement Learning for Symbolic Table Reasoning in Language Models
