…the relevant literature in depth. Alongside this account of the field's development, we have attempted to summarize and analyze the challenges confronting each technical school. We find that many traditional techniques require human experts to hand-craft and tune their strategies, and that their effectiveness is easily disrupted, so their practical value faces serious challenges. Subsequent researchers have broadly turned to the latest deep learning and reinforcement learning techniques in these areas and have made impressive progress; however, the quantity and quality of the datasets these techniques depend on are difficult to guarantee. Software engineering researchers therefore urgently need a more powerful and more intelligent technique for the many challenges posed by complex programs, and the advent of large models has opened new avenues and directions for addressing them. Many researchers have applied large models to these four areas and obtained excellent results, yet large-model-based techniques still face numerous challenges that await further investigation.
In summary, this paper has presented a detailed analysis and overview of the four defect-related areas of software engineering to which large models have been applied, and has discussed the challenges and opportunities of large-model-based techniques. We believe that, in today's era of rapid technological innovation, tracking the latest technical developments and applying them to software engineering can further advance these fields and offer guidance and direction for industrial adoption in practice.
References:
[1] Gazzola L, Micucci D, Mariani L. Automatic software repair: A survey. In: Proc. of the 40th Int’l Conf. on Software Engineering. Gothenburg: ACM, 2018. 1219. [doi: 10.1145/3180155.3182526]
[2] Monperrus M. Automatic software repair: A bibliography. ACM Computing Surveys, 2018, 51(1): 17. [doi: 10.1145/3105906]
[3] Ayewah N, Pugh W, Hovemeyer D, Morgenthaler JD, Penix J. Using static analysis to find bugs. IEEE Software, 2008, 25(5): 22–29. [doi: 10.1109/MS.2008.130]
[4] Cole B, Hakim D, Hovemeyer D, Lazarus R, Pugh W, Stephens K. Improving your software using static analysis to find bugs. In: Proc. of the 21st ACM SIGPLAN Symp. on Object-oriented Programming Systems and Applications. Portland: ACM, 2006. 673–674. [doi: 10.1145/1176617.1176667]
[5] Pacheco C, Ernst MD. Randoop: Feedback-directed random testing for Java. In: Proc. of the 22nd ACM SIGPLAN Conf. on Object-oriented Programming Systems and Applications Companion. Montreal: ACM, 2007. 815–816. [doi: 10.1145/1297846.1297902]
[6] Fraser G, Arcuri A. EvoSuite: Automatic test suite generation for object-oriented software. In: Proc. of the 19th ACM SIGSOFT Symp. and the 13th European Conf. on Foundations of Software Engineering. Szeged: ACM, 2011. 416–419. [doi: 10.1145/2025113.2025179]
[7] DeMarco F, Xuan JF, Le Berre D, Monperrus M. Automatic repair of buggy if conditions and missing preconditions with SMT. In: Proc. of the 6th Int’l Workshop on Constraints in Software Testing, Verification, and Analysis. Hyderabad: ACM, 2014. 30–39. [doi: 10.1145/2593735.2593740]
[8] Chen LS, Pei Y, Furia CA. Contract-based program repair without the contracts. In: Proc. of the 32nd IEEE/ACM Int’l Conf. on Automated Software Engineering (ASE). Urbana: IEEE, 2017. 637–647. [doi: 10.1109/ASE.2017.8115674]
[9] Wu YH, Jiang AQ, Li WD, Rabe MN, Staats C, Jamnik M, Szegedy C. Autoformalization with large language models. In: Proc. of the 36th Int’l Conf. on Neural Information Processing Systems. New Orleans: Curran Associates Inc., 2022. 32353–32368.
[10] First E, Rabe MN, Ringer T, Brun Y. Baldur: Whole-proof generation and repair with large language models. arXiv:2303.04910, 2023.
[11] Kim D, Nam J, Song J, Kim S. Automatic patch generation learned from human-written patches. In: Proc. of the 35th Int’l Conf. on Software Engineering (ICSE). San Francisco: IEEE, 2013. 802–811. [doi: 10.1109/ICSE.2013.6606626]
[12] Hua JR, Zhang MS, Wang KY, Khurshid S. Towards practical program repair with on-demand candidate generation. In: Proc. of the 40th Int’l Conf. on Software Engineering. Gothenburg: ACM, 2018. 12–23. [doi: 10.1145/3180155.3180245]
[13] Radford A, Narasimhan K, Salimans T, Sutskever I. Improving language understanding by generative pre-training. 2018. https://hayate-lab.com/wp-content/uploads/2023/05/43372bfa750340059ad87ac8e538c53b.pdf
[14] Radford A, Wu J, Child R, Luan D, Amodei D, Sutskever I. Language models are unsupervised multitask learners. 2019. https://insightcivic.s3.us-east-1.amazonaws.com/language-models.pdf
[15] Devlin J, Chang MW, Lee K, Toutanova K. BERT: Pre-training of deep bidirectional Transformers for language understanding. arXiv:1810.04805, 2019.
[16] Liu YH, Ott M, Goyal N, Du JF, Joshi M, Chen DQ, Levy O, Lewis M, Zettlemoyer L, Stoyanov V. RoBERTa: A robustly optimized BERT pretraining approach. arXiv:1907.11692, 2019.
[17] ICSE 2023. 2023. https://conf.researchr.org/home/icse-2023
[18] ISSTA 2023. 2023. https://conf.researchr.org/home/issta-2023
[19] ASE 2023. 2023. https://conf.researchr.org/home/ase-2023
[20] ESEC/FSE 2023. 2023. https://conf.researchr.org/home/fse-2023
[21] Deng YL, Xia CS, Yang CY, Zhang SD, Yang SJ, Zhang LM. Large language models are edge-case fuzzers: Testing deep learning