Advancements in Code Analysis and Generation

The field of code analysis and generation is advancing rapidly, with a focus on improving the accuracy and efficiency of large language models (LLMs) on coding tasks. Recent research highlights the promise of LLMs for code generation, code completion, and code review, while also exposing their limitations in code quality and security. To address these limitations, researchers are exploring new approaches to code analysis and generation, including intermediate representations, chain-of-thought prompting, and multimodal specification extraction. These advances could substantially improve the quality and reliability of software development and enable new applications such as automated code review and repair. Notably, RTNinja introduces a generalized machine learning framework for analyzing random telegraph noise signals in nanoelectronic devices, while White-Basilisk proposes a hybrid model for code vulnerability detection. In addition, LLMCup and AssertCoder demonstrate the effectiveness of LLMs for comment updating and assertion generation, respectively.
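As a concrete illustration of one of the approaches named above, the sketch below shows how chain-of-thought prompting can be applied to a code generation task: the model is asked to reason about the problem before emitting the final code. The prompt wording and the `call_llm` stub are hypothetical placeholders introduced for illustration, not taken from any of the papers listed under Sources.

```python
# Minimal sketch of chain-of-thought prompting for code generation.
# `call_llm` is a hypothetical stub; wire it to whatever LLM client
# your project already uses (e.g., an HTTP call to a hosted model).

def call_llm(prompt: str) -> str:
    """Placeholder for an actual LLM call; should return the model's raw text."""
    raise NotImplementedError("Connect this to your LLM provider of choice.")

def generate_code_with_cot(task_description: str) -> str:
    # Ask the model to reason step by step first, so the final answer
    # is conditioned on an explicit plan rather than produced directly.
    prompt = (
        "You are a careful programmer.\n"
        f"Task: {task_description}\n\n"
        "First, think step by step about inputs, outputs, and edge cases.\n"
        "Then, after the line 'FINAL CODE:', write only the Python function.\n"
    )
    response = call_llm(prompt)
    # Keep only the code after the marker; fall back to the whole response
    # if the model did not follow the requested format.
    marker = "FINAL CODE:"
    if marker in response:
        return response.split(marker, 1)[1].strip()
    return response.strip()

# Example usage (requires a real `call_llm` implementation):
# print(generate_code_with_cot("Return the k most frequent words in a text."))
```

Separating the reasoning from the emitted code also makes it easier to log or review the model's plan independently of the generated function, which is one motivation for this prompting style.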

Sources

RTNinja: a generalized machine learning framework for analyzing random telegraph noise signals in nanoelectronic devices

White-Basilisk: A Hybrid Model for Code Vulnerability Detection

NL in the Middle: Code Translation with LLMs and Intermediate Representations

LLMCup: Ranking-Enhanced Comment Updating with LLMs

OpenCodeReasoning-II: A Simple Test Time Scaling Approach via Self-Critique

Position Paper: Programming Language Techniques for Bridging LLM Code Generation Semantic Gaps

When Developer Aid Becomes Security Debt: A Systematic Analysis of Insecure Behaviors in LLM Coding Agents

A Mixture of Linear Corrections Generates Secure Code

It Only Gets Worse: Revisiting DL-Based Vulnerability Detectors from a Practical Perspective

SimStep: Chain-of-Abstractions for Incremental Specification and Debugging of AI-Generated Interactive Simulations

Is Quantization a Deal-breaker? Empirical Insights from Large Code Models

Efficient FRW Transitions via Stochastic Finite Differences for Handling Non-Stratified Dielectrics

Iceberg: Enhancing HLS Modeling with Synthetic Data

Explicit Vulnerability Generation with LLMs: An Investigation Beyond Adversarial Attacks

ReDemon UI: Reactive Synthesis by Demonstration for Web UI

Breaking the Myth: Can Small Models Infer Postconditions Too?

AssertCoder: LLM-Based Assertion Generation via Multimodal Specification Extraction

CodeJudgeBench: Benchmarking LLM-as-a-Judge for Coding Tasks

ARPaCCino: An Agentic-RAG for Policy as Code Compliance

DALI-PD: Diffusion-based Synthetic Layout Heatmap Generation for ML in Physical Design

A Code Comprehension Benchmark for Large Language Models for Code

Toward Realistic Evaluations of Just-In-Time Vulnerability Prediction

MalCodeAI: Autonomous Vulnerability Detection and Remediation via Language Agnostic Code Reasoning

Evaluating Generated Commit Messages with Large Language Models

Function-to-Style Guidance of LLMs for Code Translation

REVA: Supporting LLM-Generated Programming Feedback Validation at Scale Through User Attention-based Adaptation

MetaLint: Generalizable Idiomatic Code Quality Analysis through Instruction-Following and Easy-to-Hard Generalization

LLAMA: Multi-Feedback Smart Contract Fuzzing Framework with LLM-Guided Seed Generation

Chain-of-Descriptions: Improving Code LLMs for VHDL Code Generation and Summarization

When Retriever Meets Generator: A Joint Model for Code Comment Generation

Investigating the Performance of Small Language Models in Detecting Test Smells in Manual Test Cases

Towards Formal Verification of LLM-Generated Code from Natural Language Prompts
