Advances in Text-to-SQL Research

Text-to-SQL research is advancing rapidly, with a focus on improving the accuracy and robustness of models that translate natural language questions into executable SQL queries. Recent work has introduced new frameworks, benchmarks, and training techniques. One key trend is the use of reinforcement learning and test-time scaling to improve model performance, with some models achieving state-of-the-art results on challenging benchmarks such as BIRD. Another is the development of multilingual Text-to-SQL models that remain robust across different input languages and databases. New benchmarks and evaluation frameworks, such as GeoSQL-Eval and DeepJSONEval, are also enabling more comprehensive assessments of model performance. Notable papers include Agentar-Scale-SQL, which achieved state-of-the-art performance on the BIRD benchmark, and SING-SQL, which introduced a framework for generating high-quality synthetic Text-to-SQL data. Additionally, Thinkquel presented a fine-tuned model for producing robust and portable database queries, and Exploring Database Normalization Effects on SQL Generation highlighted the importance of considering schema design when developing NL2SQL interfaces.

Sources

A State-of-the-Art SQL Reasoning Model using RLVR

QueryGym: Step-by-Step Interaction with Relational Databases

ScenarioBench: Trace-Grounded Compliance Evaluation for Text-to-SQL and RAG

Agentar-Scale-SQL: Advancing Text-to-SQL through Orchestrated Test-Time Scaling

Multilingual Text-to-SQL: Benchmarking the Limits of Language Models with Collaborative Language Agents

From NL2SQL to NL2GeoSQL: GeoSQL-Eval for automated evaluation of LLMs on PostGIS queries

SING-SQL: A Synthetic Data Generation Framework for In-Domain Text-to-SQL Translation

DeepJSONEval: Benchmarking Complex Nested JSON Data Mining for Large Language Models

Thinkquel: A Model Dedicated to Text-to-dbt Using Synthetic Data and a Span-Aware Objective

Exploring Database Normalization Effects on SQL Generation
