Advances in Secure Code Generation with Large Language Models

Research on large language models (LLMs) is increasingly focused on the security and reliability of generated code. Recent work develops methods to detect secrets in source code, generate secure smart contracts, and evaluate whether LLM-generated code complies with security requirements. Fine-tuned LLMs have shown promising results in secret detection, reducing false positives while improving recall, and in generating code that aligns with industry security practices. Offline simulation frameworks have also been proposed to automate software scripting tasks by using LLMs to generate verified scripts. At the same time, LLM-generated code still carries security risks, underscoring the need for robust security assessment frameworks and benchmarks. Noteworthy papers include Secret Breach Detection in Source Code with Large Language Models, which fine-tunes LLMs to achieve strong secret-detection performance, and CodeBC: A More Secure Large Language Model for Smart Contract Code Generation in Blockchain, which introduces a three-stage fine-tuning approach for generating secure smart contracts.
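
To make the secret-detection theme concrete, the sketch below illustrates one common two-stage idea: a cheap regex pre-filter for recall, followed by an LLM pass to suppress false positives. This is a minimal illustration under stated assumptions, not the pipeline from the cited paper; the regex patterns, the `llm_is_real_secret` stub, and the sample input are all invented for the example, with the stub standing in for a call to a fine-tuned model.

```python
import re

# Illustrative candidate patterns for common secret formats (not exhaustive).
CANDIDATE_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # AWS access key ID shape
    re.compile(r"(?i)(api[_-]?key|secret|token)\s*[:=]\s*['\"][^'\"]{16,}['\"]"),
]

def find_candidates(source: str):
    """Regex pre-filter: collect lines that look like they may contain secrets."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if any(p.search(line) for p in CANDIDATE_PATTERNS):
            hits.append((lineno, line.strip()))
    return hits

def llm_is_real_secret(snippet: str) -> bool:
    """Placeholder for a fine-tuned LLM classifier (assumption).

    A real pipeline would send the snippet plus surrounding context to a model
    fine-tuned to distinguish live credentials from test fixtures, placeholders,
    and documentation examples. A trivial heuristic keeps this sketch runnable.
    """
    lowered = snippet.lower()
    return "example" not in lowered and "xxxx" not in lowered

def detect_secrets(source: str):
    """Two-stage detection: regex for recall, LLM-style filter for precision."""
    return [(n, s) for n, s in find_candidates(source) if llm_is_real_secret(s)]

if __name__ == "__main__":
    sample = (
        'aws_key = "AKIAIOSFODNN7EXAMPLE"\n'   # placeholder key, filtered out
        'greeting = "hello world"\n'           # never a candidate
        'api_key = "sk_live_4f9a8b7c6d5e4f3a"\n'  # flagged as a likely secret
    )
    for lineno, snippet in detect_secrets(sample):
        print(f"line {lineno}: possible secret -> {snippet}")
```

Running the sketch flags only the third line: the regex stage surfaces both key-like strings, and the filtering stage drops the obvious placeholder, mirroring the reported effect of fine-tuned models on false positives.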

Sources

Evaluating Machine Expertise: How Graduate Students Develop Frameworks for Assessing GenAI Content

Secret Breach Detection in Source Code with Large Language Models

Skill Discovery for Software Scripting Automation via Offline Simulations with LLMs

The Hidden Risks of LLM-Generated Web Application Code: A Security-Centric Evaluation of Code Generation Capabilities in Large Language Models

CodeBC: A More Secure Large Language Model for Smart Contract Code Generation in Blockchain

SecRepoBench: Benchmarking LLMs for Secure Code Generation in Real-World Repositories

Assessing LLM code generation quality through path planning tasks
