Large Language Models in Software Engineering: Improved Code Generation and Analysis

The integration of Large Language Models (LLMs) is driving notable progress across software engineering. Recent work concentrates on improving code generation and code quality, and on more effective bug detection and repair. The field is broadly moving toward using LLMs to automate tasks such as code generation, code review, and testing, and researchers are increasingly pairing LLMs with complementary techniques, such as static analysis and fuzzing, to improve the accuracy and efficiency of these tasks. There is also growing interest in frameworks and benchmarks that evaluate LLM performance on software engineering tasks, to establish their reliability and effectiveness in real-world settings. Noteworthy papers include 'Zero-Shot Detection of LLM-Generated Code via Approximated Task Conditioning', which proposes an approach for detecting LLM-generated code, and 'EXPEREPAIR: Dual-Memory Enhanced LLM-based Repository-Level Program Repair', which presents a dual-memory approach to repository-level program repair with LLMs.
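
As a rough illustration of the LLM-plus-static-analysis pairing mentioned above, the sketch below shows one common pattern: a static analyzer produces candidate warnings, and an LLM is used as a second-stage filter that judges whether each flagged path is actually feasible. This is a minimal, hedged example; `run_static_analyzer`, `query_llm`, and the `Warning` fields are hypothetical placeholders, not APIs from any of the papers listed below.

```python
# Sketch: filter static-analysis warnings with an LLM feasibility check.
# All helper names here are placeholders standing in for a real analyzer's
# output parser and a real chat-completion API.
from dataclasses import dataclass


@dataclass
class Warning:
    file: str
    line: int
    message: str
    code_snippet: str


def run_static_analyzer(path: str) -> list[Warning]:
    """Placeholder: parse the warnings emitted by your analyzer of choice."""
    return [
        Warning("app.py", 42, "possible null dereference",
                "user = lookup(uid)\nprint(user.name)"),
    ]


def query_llm(prompt: str) -> str:
    """Placeholder: send `prompt` to an LLM and return its text reply."""
    return "FEASIBLE"  # stub answer so the sketch runs end to end


def triage(warnings: list[Warning]) -> list[Warning]:
    confirmed = []
    for w in warnings:
        prompt = (
            "You are reviewing a static-analysis warning.\n"
            f"Warning: {w.message} at {w.file}:{w.line}\n"
            f"Code:\n{w.code_snippet}\n"
            "Answer FEASIBLE if the flagged path can actually execute, "
            "otherwise INFEASIBLE."
        )
        # Keep only warnings the LLM judges reachable, dropping likely
        # false positives before they reach a human reviewer.
        if query_llm(prompt).strip().upper().startswith("FEASIBLE"):
            confirmed.append(w)
    return confirmed


if __name__ == "__main__":
    for w in triage(run_static_analyzer("app.py")):
        print(f"{w.file}:{w.line}: {w.message}")
```

The design point is simply that the analyzer supplies precise, cheap candidates while the LLM contributes contextual judgment; actual systems in the sources differ in how the two stages exchange information.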

Sources

Which Prompting Technique Should I Use? An Empirical Investigation of Prompting Techniques for Software Engineering Tasks

Deployability-Centric Infrastructure-as-Code Generation: An LLM-based Iterative Framework

SafeGenBench: A Benchmark Framework for Security Vulnerability Detection in LLM-Generated Code

Zero-Shot Detection of LLM-Generated Code via Approximated Task Conditioning

Repeton: Structured Bug Repair with ReAct-Guided Patch-and-Test Cycles

Understanding Software Engineering Agents Through the Lens of Traceability: An Empirical Study

How Good LLM-Generated Password Policies Are?

Detecting State Manipulation Vulnerabilities in Smart Contracts Using LLM and Static Analysis

Evaluating the Performance and Efficiency of Sentence-BERT for Code Comment Classification

ZTaint-Havoc: From Havoc Mode to Zero-Execution Fuzzing-Driven Taint Inference

LLM-as-a-qualitative-judge: automating error analysis in natural language generation

ASTAGEN: Empirical Evaluation of Automated SATD Taxonomy Generation with LLMs

Expert-in-the-Loop Systems with Cross-Domain and In-Domain Few-Shot Learning for Software Vulnerability Detection

D-LiFT: Improving LLM-based Decompiler Backend via Code Quality-driven Fine-tuning

Prompt Variability Effects On LLM Code Generation

Minimizing False Positives in Static Bug Detection via LLM-Enhanced Path Feasibility Analysis

ELFuzz: Efficient Input Generation via LLM-driven Synthesis Over Fuzzer Space

Augmenting Large Language Models with Static Code Analysis for Automated Code Quality Improvements

AutoGEEval++: A Multi-Level and Multi-Geospatial-Modality Automated Evaluation Framework for Large Language Models in Geospatial Code Generation on Google Earth Engine

Bug Classification in Quantum Software: A Rule-Based Framework and Its Evaluation

EXPEREPAIR: Dual-Memory Enhanced LLM-based Repository-Level Program Repair

BugGen: A Self-Correcting Multi-Agent LLM Pipeline for Realistic RTL Bug Synthesis

Evaluating Large Language Models on Non-Code Software Engineering Tasks
