The field of software engineering is undergoing a significant transformation through the integration of Large Language Models (LLMs). Recent work has shifted the focus from traditional rule-based systems toward agentic systems capable of autonomous problem-solving. Researchers are applying LLMs across many aspects of software engineering, including code generation, testing, and quality engineering. A notable trend is the use of multi-agent systems, which have shown strong performance in automated code generation and testing. However, the robustness of these systems remains a pressing concern, which researchers are addressing through fuzzing-based testing approaches and repair methods. Another area of focus is the development of benchmarks and evaluation frameworks to assess the efficiency and effectiveness of LLM-generated code.

Noteworthy papers in this area include:

- A Comprehensive Survey on Benchmarks and Solutions in Software Engineering of LLM-Empowered Agentic System, which provides a holistic analysis of LLM-empowered software engineering and identifies critical research gaps.
- Agentic Property-Based Testing: Finding Bugs Across the Python Ecosystem, which demonstrates an LLM-based agent that autonomously tests software and identifies valid bugs (a minimal illustration of the property-based testing style appears after this list).
- Testing and Enhancing Multi-Agent Systems for Robust Code Generation, which presents a comprehensive study of the robustness of multi-agent systems for code generation and proposes effective mitigation strategies.
- Agentic RAG for Software Testing with Hybrid Vector-Graph and Multi-Agent Orchestration, which reports substantial accuracy improvements in software testing automation.
- LLM×MapReduce-V3: Enabling Interactive In-Depth Survey Generation through a MCP-Driven Hierarchically Modular Agent System, which introduces a hierarchically modular agent system for long-form survey generation.
- ArtNet: Hierarchical Clustering-Based Artificial Netlist Generator for ML and DTCO Application, which proposes a novel artificial netlist generator for machine learning and design-technology co-optimization applications.
- Pluto: A Benchmark for Evaluating Efficiency of LLM-generated Hardware Code, which presents a benchmark and evaluation framework to assess the efficiency of LLM-generated Verilog designs.
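
To make the property-based testing idea concrete, below is a minimal sketch of the kind of test an agentic tester might generate, written with Python's Hypothesis library. The run-length `encode`/`decode` functions and the round-trip property are illustrative assumptions, not code or properties taken from the cited paper.

```python
# Illustrative property-based test in the style an agentic tester might
# produce. The encode/decode functions are hypothetical stand-ins for
# real library code under test.
from hypothesis import given, strategies as st


def encode(data: list[int]) -> list[tuple[int, int]]:
    """Run-length encode a list of integers (toy target code)."""
    out: list[tuple[int, int]] = []
    for x in data:
        if out and out[-1][0] == x:
            out[-1] = (x, out[-1][1] + 1)
        else:
            out.append((x, 1))
    return out


def decode(pairs: list[tuple[int, int]]) -> list[int]:
    """Invert encode by expanding each (value, count) pair."""
    return [x for x, n in pairs for _ in range(n)]


@given(st.lists(st.integers()))
def test_round_trip(data):
    # Property: decoding an encoding must reproduce the original input.
    assert decode(encode(data)) == data
```

Run under pytest, Hypothesis generates many random inputs and shrinks any failing case to a minimal counterexample; autonomously proposing such properties and triaging the resulting failures across real projects is the kind of workflow the paper describes automating.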