Advances in Code Generation and Translation

The field of code generation and translation is moving toward more efficient and accurate methods, with particular attention to low-resource languages and automated debugging. Recent research shows that curated, high-quality datasets can offset the limitations of smaller models, and that careful prompt engineering and prompt-language choice can significantly improve translation quality. Rule-based debugging methods have also been proposed to address shortcomings of existing automated debugging approaches. Notable papers include:

TigerCoder introduces a suite of LLMs for code generation in Bangla, achieving significant performance gains over existing models.

Targeted Test Selection Approach in Continuous Integration proposes a machine learning approach to industrial test selection that reduces execution time and accelerates the CI pipeline.

Evaluating Large Language Models for Code Translation provides a systematic empirical assessment of state-of-the-art LLMs for code translation, demonstrating the value of careful prompt engineering and prompt-language choice.

RulER proposes a rule-based method for localizing and repairing semantic errors in translated code, outperforming state-of-the-art methods in error localization and repair success rates.
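To make the test-selection idea concrete, here is a minimal sketch of how a machine-learned ranker might prioritize tests in CI. The feature names, weights, and `TestRecord` structure are hypothetical illustrations, not taken from the paper; a real system would learn the weights from historical CI data rather than hard-coding them.

```python
# Hypothetical sketch of ML-style targeted test selection: rank tests by a
# weighted score over historical features, then run only the top-k tests.
from dataclasses import dataclass

@dataclass
class TestRecord:
    name: str
    recent_failure_rate: float  # fraction of recent runs that failed
    files_overlap: float        # overlap between changed files and files the test covers
    avg_duration_s: float       # average execution time in seconds

# Illustrative weights; a real system would fit these from CI history.
WEIGHTS = {"recent_failure_rate": 0.6, "files_overlap": 0.35, "avg_duration_s": -0.05}

def score(t: TestRecord) -> float:
    """Higher score = more likely to catch a regression per unit of runtime."""
    return (WEIGHTS["recent_failure_rate"] * t.recent_failure_rate
            + WEIGHTS["files_overlap"] * t.files_overlap
            + WEIGHTS["avg_duration_s"] * t.avg_duration_s)

def select_tests(tests: list[TestRecord], k: int) -> list[str]:
    """Pick the k highest-scoring tests to run in this CI cycle."""
    return [t.name for t in sorted(tests, key=score, reverse=True)[:k]]

tests = [
    TestRecord("test_parser", recent_failure_rate=0.30, files_overlap=0.9, avg_duration_s=2.0),
    TestRecord("test_ui",     recent_failure_rate=0.05, files_overlap=0.1, avg_duration_s=8.0),
    TestRecord("test_db",     recent_failure_rate=0.20, files_overlap=0.7, avg_duration_s=4.0),
]
print(select_tests(tests, 2))  # → ['test_parser', 'test_db']
```

Skipping low-scoring tests (here `test_ui`, which rarely fails and barely overlaps the change) is what shortens the pipeline; the risk of missed regressions is why such systems are typically paired with periodic full-suite runs.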

Sources

TigerCoder: A Novel Suite of LLMs for Code Generation in Bangla

Targeted Test Selection Approach in Continuous Integration

Evaluating Large Language Models for Code Translation: Effects of Prompt Language and Prompt Design

RulER: Automated Rule-Based Semantic Error Localization and Repair for Code Translation
