Advances in Code Analysis and Generation

The field of code analysis and generation is moving toward improving the accuracy and reliability of large language models (LLMs) in detecting vulnerabilities and generating high-quality code. Researchers are developing new frameworks and benchmarks to evaluate LLM performance in scenarios such as regression testing and type inference. Noteworthy papers in this area include ReCatcher, which presents a regression testing framework for LLM-based Python code generation, and TypyBench, which introduces a benchmark for evaluating LLM type inference on untyped Python repositories. Additionally, Out of Distribution, Out of Luck examines how well LLMs trained on existing vulnerability datasets detect the top 25 CWE weaknesses, highlights the limitations of those datasets, and proposes a three-part solution to address them.
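
To make the regression-testing idea concrete, below is a minimal sketch of comparing two code-generation models on the same tasks and flagging cases the baseline model solved but the updated model fails. This is not ReCatcher's actual interface; all function and parameter names here are hypothetical, and the models are represented abstractly as prompt-to-code callables.

```python
# Conceptual sketch only: NOT ReCatcher's API; names are hypothetical.
from typing import Callable, Dict, List


def passes_tests(code: str, test: Callable[[str], bool]) -> bool:
    """Run a candidate solution against its test predicate, treating errors as failure."""
    try:
        return test(code)
    except Exception:
        return False


def find_regressions(
    prompts: Dict[str, str],                 # task id -> natural-language prompt
    tests: Dict[str, Callable[[str], bool]], # task id -> test predicate over generated code
    old_model: Callable[[str], str],         # baseline code generator
    new_model: Callable[[str], str],         # updated code generator
) -> List[str]:
    """Return task ids the baseline model solved but the updated model does not."""
    regressions = []
    for task_id, prompt in prompts.items():
        old_ok = passes_tests(old_model(prompt), tests[task_id])
        new_ok = passes_tests(new_model(prompt), tests[task_id])
        if old_ok and not new_ok:
            regressions.append(task_id)
    return regressions
```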

Sources

An Enumerative Embedding of the Python Type System in ACL2s

ReCatcher: Towards LLMs Regression Testing for Code Generation

Out of Distribution, Out of Luck: How Well Can LLMs Trained on Vulnerability Datasets Detect Top 25 CWE Weaknesses?

TypyBench: Evaluating LLM Type Inference for Untyped Python Repositories
