Journal of Software, 2025, Issue 4

Journal of Software (Ruan Jian Xue Bao) ISSN 1000-9825, CODEN RUXUEW E-mail: jos@iscas.ac.cn
2025,36(4):1489−1529 [doi: 10.13328/j.cnki.jos.007268] [CSTR: 32375.14.jos.007268] http://www.jos.org.cn
© Institute of Software, Chinese Academy of Sciences. All rights reserved. Tel: +86-10-62562563

Survey on Application and Development of Large Language Models in Software Defect Detection and Repair*

XIANG Jia-Hong 1,2, XU Xiao-Yang 1,2, KONG Fan-Chu 1,2, PENG Pai 3, ZHANG Zhao 3, ZHANG Yu-Qun 1,2

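1 (Research Institute of Trustworthy Autonomous Systems, Southern University of Science and Technology, Shenzhen, Guangdong 518055, China)
2 (Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, Guangdong 518055, China)
3 (ITEA Technologies Co. Ltd., Shenzhen, Guangdong 518067, China)
Corresponding author: ZHANG Yu-Qun, E-mail: zhangyq@sustech.edu.cn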

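Abstract: As informatization deepens, the development and iterative updating of large numbers of applications inevitably introduce software defects, which pose serious threats to program reliability and security. Detecting and repairing software defects has become a necessary task for developers maintaining software quality, and also a heavy burden. In response, software engineering researchers have proposed a large number of techniques over the past decades to help developers address defect-related problems. However, these techniques all face severe challenges and have made little progress in industrial adoption. Large language models (LLMs), such as the code model CodeX and the conversational model ChatGPT, are trained on massive datasets, can capture complex patterns and structures in code, process large amounts of contextual information, and adapt flexibly to a variety of tasks; their strong performance has attracted the attention of many researchers. In many software engineering tasks, LLM-based techniques show significant advantages and are expected to address key challenges that different fields have faced in the past. This survey therefore analyzes three defect detection fields in which mature LLM-based techniques already exist (defect detection for deep learning libraries, GUI automated testing, and automated test case generation) and one mature field of software defect repair (automated defect repair). It traces the development of each field and discusses in depth the characteristics and challenges of the different technical approaches. Finally, based on the analysis of existing research, it summarizes the key challenges facing these fields and techniques and the implications for future research.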
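Keywords: large language model; defect detection; deep learning library defect detection; automated test case generation; GUI automated testing; automated defect repair
CLC number: TP311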

Chinese citation format: 香佳宏, 徐霄阳, 孔繁初, 彭湃, 张钊, 张煜群. 大模型在软件缺陷检测与修复的应用发展综述. 软件学报, 2025, 36(4): 1489–1529. http://www.jos.org.cn/1000-9825/7268.htm
English citation format: Xiang JH, Xu XY, Kong FC, Peng P, Zhang Z, Zhang YQ. Survey on Application and Development of Large Language Models in Software Defect Detection and Repair. Ruan Jian Xue Bao/Journal of Software, 2025, 36(4): 1489–1529 (in Chinese). http://www.jos.org.cn/1000-9825/7268.htm

Survey on Application and Development of Large Language Models in Software Defect
Detection and Repair
XIANG Jia-Hong 1,2, XU Xiao-Yang 1,2, KONG Fan-Chu 1,2, PENG Pai 3, ZHANG Zhao 3, ZHANG Yu-Qun 1,2
1 (Research Institute of Trustworthy Autonomous Systems, Southern University of Science and Technology, Shenzhen 518055, China)
2 (Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China)
3 (ITEA Technologies Co. Ltd., Shenzhen 518067, China)
Abstract: With the advancement of informatization, the development of a variety of applications and iterative functions inevitably
leads to software defects, posing significant threats to program reliability and security. Therefore, detecting and repairing software defects
becomes essential yet onerous for developers in maintaining software quality. Accordingly, software engineering researchers have proposed
numerous technologies over the past decades to help developers address defect-related issues. However, these technologies face serious
challenges and have made little progress in industrial implementation. Large language models (LLMs), such as the code-based model CodeX and

* Funding: National Natural Science Foundation of China (62372220)
Received: 2023-12-06; Revised: 2024-05-18, 2024-07-09; Accepted: 2024-07-28; Published online by jos: 2025-01-08
CNKI online-first publication: 2025-01-15
