Advancements in Large Language Models for Autonomous Systems and Data Management

The field of large language models (LLMs) is advancing rapidly, with particular focus on how models interact with complex systems and manage large datasets. Recent work highlights the potential of LLMs in autonomous driving, software engineering, and data management, and explores new approaches to improving their reliability and accuracy, such as iterative error correction and spatially-aware prompting. There is also growing interest in evaluating LLMs in multi-library debugging scenarios and tool-calling error situations. Notable papers in this area include:

- Technical Report for Argoverse2 Scenario Mining Challenges: introduces a fault-tolerant iterative code-generation mechanism and specialized prompt engineering to improve scenario mining (a minimal sketch of such a loop follows this list).
- CRITICTOOL: presents a comprehensive critique-evaluation benchmark for tool learning and error handling in LLMs (see the second sketch below).
- MLDebugging: proposes a benchmark for debugging code across multi-library scenarios and highlights the limitations of current LLMs in this setting.
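
To make the first item concrete, here is a minimal sketch of a fault-tolerant iterative code-generation loop of the kind the Argoverse2 report describes: generate code, execute it, and feed any failure back into the next generation attempt. The function names, prompt format, and retry budget are illustrative assumptions, not the report's actual interface; the generator and executor are passed in as stand-ins for an LLM call and a sandboxed runner.

```python
from typing import Callable, Optional

def iterative_error_correction(
    task: str,
    generate_code: Callable[[str], str],       # stand-in for an LLM call
    run_candidate: Callable[[str], tuple[bool, str]],  # (ok, error_log)
    max_rounds: int = 3,
) -> Optional[str]:
    """Regenerate code until it executes cleanly or the retry budget runs out."""
    prompt = task
    for _ in range(max_rounds):
        candidate = generate_code(prompt)
        ok, error_log = run_candidate(candidate)
        if ok:
            return candidate
        # Feed the failure back so the next attempt can correct it.
        prompt = (
            f"{task}\n\nPrevious attempt:\n{candidate}\n"
            f"It failed with:\n{error_log}\nReturn a corrected version."
        )
    return None  # All rounds failed; the caller decides how to degrade.
```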
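The second sketch illustrates the kind of tool-calling error scenario CRITICTOOL evaluates: the model must notice that a call failed, critique its own call, and either repair the arguments or give up. Nothing here is CRITICTOOL's actual API; `call_llm`, the prompt wording, and the JSON repair protocol are all hypothetical.

```python
import json
from typing import Any, Callable

def call_with_self_critique(
    call_llm: Callable[[str], str],            # stand-in for an LLM call
    tools: dict[str, Callable[..., Any]],      # registry of callable tools
    name: str,
    args: dict[str, Any],
) -> Any:
    """Run a tool call; on failure, ask the model to critique and repair it."""
    try:
        return tools[name](**args)
    except (KeyError, TypeError) as exc:  # unknown tool or bad arguments
        failure = f"{type(exc).__name__}: {exc}"
    repair = json.loads(call_llm(
        "Your tool call failed.\n"
        f"Call: {name}({json.dumps(args)})\n"
        f"Error: {failure}\n"
        'Reply with JSON: {"tool": "...", "args": {...}} or {"give_up": true}.'
    ))
    if repair.get("give_up"):
        return None  # The model judged the call unrecoverable.
    return tools[repair["tool"]](**repair["args"])  # One repair attempt only.
```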

Sources

Technical Report for Argoverse2 Scenario Mining Challenges on Iterative Error Correction and Spatially-Aware Prompting

Invocable APIs derived from NL2SQL datasets for LLM Tool-Calling Evaluation

An Empirical study on LLM-based Log Retrieval for Software Engineering Metadata Management

LLM-Driven Data Generation and a Novel Soft Metric for Evaluating Text-to-SQL in Aviation MRO

MLDebugging: Towards Benchmarking Code Debugging Across Multi-Library Scenarios

CRITICTOOL: Evaluating Self-Critique Capabilities of Large Language Models in Tool-Calling Error Scenarios
