Advancements in Text-to-Structure and SQL Generation

Natural language processing research is seeing rapid progress in text-to-structure and SQL generation, with a growing focus on the accuracy and reliability of large language models. Recent work emphasizes evaluating and improving the robustness of these models, particularly in safety-critical settings such as electronic health records (EHRs), where an incorrectly generated query can have real consequences. New benchmarks and evaluation metrics enable more comprehensive assessments of model performance, surfacing key challenges and directions for future research. Noteworthy papers include SCARE, a benchmark for SQL correction and question-answerability classification that targets post-hoc verification in EHR question answering, and OmniStruct, a comprehensive benchmark for assessing LLMs on diverse text-to-structure tasks. Other notable works propose dynamic data augmentation for text-to-query generation (Skeletons Matter) and paraphrase-aware alignment (RoParQ), both reporting meaningful gains in performance and robustness.
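To make the idea of post-hoc verification concrete, the following is a minimal, illustrative sketch (not SCARE's actual mechanism) of one such check: before surfacing a model-generated SQL query, dry-run it against the target schema so that syntactically invalid queries, or queries referencing tables and columns that do not exist, can be routed to correction or flagged as potentially unanswerable. The schema and queries here are hypothetical.

```python
import sqlite3

# Hypothetical EHR-style schema used only for illustration.
SCHEMA = """
CREATE TABLE patients (patient_id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
CREATE TABLE labs (lab_id INTEGER PRIMARY KEY, patient_id INTEGER, test TEXT, value REAL);
"""

def sql_compiles(query: str, schema: str = SCHEMA) -> bool:
    """Return True if the query is syntactically valid and references only
    objects present in the schema, using EXPLAIN as a dry run."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema)
        # Planning the query fails on malformed SQL or unknown tables/columns,
        # without actually executing it against patient data.
        conn.execute(f"EXPLAIN {query}")
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()

if __name__ == "__main__":
    good = "SELECT name FROM patients WHERE age > 65"
    bad = "SELECT dosage FROM medications"  # table not in the schema
    print(sql_compiles(good))  # True
    print(sql_compiles(bad))   # False -> candidate for correction or rejection
```

Executability is of course only a weak proxy for correctness; benchmarks like SCARE additionally ask whether the underlying question is answerable at all from the available records.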

Sources

SCARE: A Benchmark for SQL Correction and Question Answerability Classification for Reliable EHR Question Answering

OmniStruct: Universal Text-to-Structure Generation across Diverse Schemas

Skeletons Matter: Dynamic Data Augmentation for Text-to-Query

Benchmarking Corruption Robustness of LVLMs: A Discriminative Benchmark and Robustness Alignment Metric

Prompt Engineering Techniques for Context-dependent Text-to-SQL in Arabic

Generating Querying Code from Text for Multi-Modal Electronic Health Record

TrackList: Tracing Back Query Linguistic Diversity for Head and Tail Knowledge in Open Large Language Models

Text-to-SQL as Dual-State Reasoning: Integrating Adaptive Context and Progressive Generation

RoParQ: Paraphrase-Aware Alignment of Large Language Models Towards Robustness to Paraphrased Questions
