Write Your Own CodeChecker: An Automated Test-Driven Checker Development Approach with LLMs
Abstract
As demand for code quality assurance grows, developers not only use existing static code checkers but also seek custom checkers that meet their specific needs. Many code-checking frameworks provide extensive customization interfaces for this purpose. However, the abstract checking logic and complex API usage of large-scale checker frameworks make writing a custom checker challenging. Automated checker generation is therefore expected to ease the burden of checker development. In this paper, we propose AutoChecker, an LLM-powered approach that writes code checkers automatically from only a rule description and a test suite. To achieve comprehensive checking logic, AutoChecker updates the checker incrementally, focusing on one selected test case at a time. To obtain precise API knowledge, in each iteration it uses fine-grained, logic-guided API-context retrieval: it first decomposes the checking logic into a series of sub-operations and then retrieves checker-related API contexts for each sub-operation. For evaluation, we apply AutoChecker, five baselines, and three ablation methods, using multiple LLMs, to generate checkers for 20 randomly selected PMD rules. Experimental results show that AutoChecker significantly outperforms the other methods on all effectiveness metrics, with an average test pass rate of 82.28%. In addition, the checkers generated by AutoChecker can be applied successfully to real-world projects, where they match the performance of the official checkers.
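The loop described in the abstract (solve one test case at a time, decompose the checking logic into sub-operations, retrieve API context per sub-operation, then update the checker) can be illustrated with a toy sketch. This is not AutoChecker's implementation. The "checker" is reduced to a tuple of substrings to flag, the LLM calls are replaced by deterministic stubs, and every name here (`develop_checker`, `stub_decompose`, `stub_generate`, `API_INDEX`, the node names inside it) is hypothetical. Keeping a candidate only when it still passes the previously solved cases is an assumption about how a test-driven loop would plausibly guard against regressions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Patterns = Tuple[str, ...]  # toy "checker": substrings that count as violations


@dataclass(frozen=True)
class TestCase:
    code: str      # source snippet under test
    expected: int  # expected number of violations


def run_checker(patterns: Patterns, code: str) -> int:
    """Count violations reported by the toy checker."""
    return sum(code.count(p) for p in patterns)


def passes(patterns: Patterns, case: TestCase) -> bool:
    return run_checker(patterns, case.code) == case.expected


# Hypothetical API index; the entries are illustrative, not real framework docs.
API_INDEX: Dict[str, str] = {
    "method call": "MethodCallNode: represents a method invocation",
    "field access": "FieldAccessNode: represents a field access",
}


def retrieve(sub_op: str) -> List[str]:
    """Keyword lookup standing in for API-context retrieval."""
    return [doc for key, doc in API_INDEX.items() if key in sub_op]


def stub_decompose(rule: str, case: TestCase) -> List[str]:
    """Stand-in for the LLM step that splits checking logic into sub-operations."""
    return ["locate each method call", "inspect the field access used as receiver"]


def stub_generate(rule: str, case: TestCase,
                  context: Dict[str, List[str]], current: Patterns) -> Patterns:
    """Stand-in for the LLM step that proposes an updated checker."""
    token = case.code.split("(")[0].strip()  # e.g. "System.out.println"
    return current + (token,)


def develop_checker(rule: str, tests: List[TestCase],
                    decompose: Callable, generate: Callable,
                    max_rounds: int = 3) -> Tuple[Patterns, float]:
    patterns: Patterns = ()
    solved: List[TestCase] = []
    for case in tests:  # focus on one selected case at a time
        for _ in range(max_rounds):
            if passes(patterns, case):
                break
            sub_ops = decompose(rule, case)
            context = {op: retrieve(op) for op in sub_ops}
            candidate = generate(rule, case, context, patterns)
            # Assumption: accept only if earlier solved cases still pass.
            if all(passes(candidate, t) for t in solved + [case]):
                patterns = candidate
        if passes(patterns, case):
            solved.append(case)
    pass_rate = sum(passes(patterns, t) for t in tests) / len(tests)
    return patterns, pass_rate


tests = [
    TestCase("System.out.println(1);", 1),
    TestCase("System.err.println(2);", 1),
    TestCase("int x = 1;", 0),
]
checker, rate = develop_checker("avoid console printing", tests,
                                stub_decompose, stub_generate)
```

In a real setting the two stubs would be LLM calls, the checker would be rule code written against the target framework's API, and the pass rate would come from running that framework's own test harness.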