[19] Henkel J, Ramakrishnan G, Wang Z, Albarghouthi A, Jha S, Reps T. Semantic robustness of models of source code. In: Proc. of the 2022 IEEE Int’l Conf. on Software Analysis, Evolution and Reengineering. Honolulu: IEEE, 2022. 526–537. [doi: 10.1109/SANER53432.2022.00070]
[20] Zhu R, Zhang CM. How robust is a large pre-trained language model for code generation? A case on attacking GPT2. In: Proc. of the 2023 IEEE Int’l Conf. on Software Analysis, Evolution and Reengineering. Macao: IEEE, 2023. 708–712. [doi: 10.1109/SANER56733.2023.00076]
[21] Zhang HZ, Fu ZY, Li G, Ma L, Zhao ZZ, Yang HA, Sun YZ, Liu Y, Jin Z. Towards robustness of deep program processing models—Detection, estimation, and enhancement. ACM Trans. on Software Engineering and Methodology, 2022, 31(3): 50. [doi: 10.1145/3511887]
[22] Zhou Y, Zhang XQ, Shen JJ, Han TT, Chen TL, Gall H. Adversarial robustness of deep code comment generation. ACM Trans. on Software Engineering and Methodology, 2022, 31(4): 60. [doi: 10.1145/3501256]
[23] Quiring E, Maier A, Rieck K. Misleading authorship attribution of source code using adversarial learning. In: Proc. of the 28th USENIX Conf. on Security Symp. Santa Clara: USENIX Association, 2019. 479–496.
[24] Severi G, Meyer J, Coull S, Oprea A. Explanation-guided backdoor poisoning attacks against malware classifiers. In: Proc. of the 30th USENIX Security Symp. Vancouver: USENIX Association, 2021. 1487–1504.
[25] Zhang Z, Tao GH, Shen GY, An SW, Xu QL, Liu YQ, Ye YP, Wu YX, Zhang XY. PELICAN: Exploiting backdoors of naturally trained deep learning models in binary code analysis. In: Proc. of the 32nd USENIX Security Symp. Anaheim: USENIX Association, 2023. 2365–2382.
[26] He JX, Vechev M. Large language models for code: Security hardening and adversarial testing. In: Proc. of the 2023 ACM SIGSAC Conf. on Computer and Communications Security. Copenhagen: ACM, 2023. 1865–1879. [doi: 10.1145/3576915.3623175]
[27] Srikant S, Liu SJ, Mitrovska T, Chang SY, Fan QF, Zhang GY, O’Reilly UM. Generating adversarial computer programs using optimized obfuscations. In: Proc. of the 9th Int’l Conf. on Learning Representations. OpenReview.net, 2021.
[28] Li YZ, Liu SQ, Chen KJ, Xie XF, Zhang TW, Liu Y. Multi-target backdoor attacks for code pre-trained models. In: Proc. of the 61st Annual Meeting of the Association for Computational Linguistics (Vol. 1: Long Papers). Toronto: Association for Computational Linguistics, 2023. 7236–7254. [doi: 10.18653/v1/2023.acl-long.399]
[29] Zhang HZ, Li Z, Li G, Ma L, Liu Y, Jin Z. Generating adversarial examples for holding robustness of source code processing models. In: Proc. of the 34th AAAI Conf. on Artificial Intelligence. New York: AAAI Press, 2020. 1169–1176.
[30] Jha A, Reddy CK. CodeAttack: Code-based adversarial attacks for pre-trained programming language models. In: Proc. of the 37th AAAI Conf. on Artificial Intelligence. Washington: AAAI Press, 2023. 14892–14900. [doi: 10.1609/aaai.v37i12.26739]
[31] Li YY, Wu HQ, Zhao H. Semantic-preserving adversarial code comprehension. In: Proc. of the 29th Int’l Conf. on Computational Linguistics. Gyeongju: Int’l Committee on Computational Linguistics, 2022. 3017–3028.
[32] Yu XQ, Li Z, Huang X, Zhao SS. AdVulCode: Generating adversarial vulnerable code against deep learning-based vulnerability detectors. Electronics, 2023, 12(4): 936. [doi: 10.3390/electronics12040936]
[33] Ramakrishnan G, Albarghouthi A. Backdoors in neural models of source code. arXiv:2006.06841, 2020.
[34] Springer JM, Reinstadler BM, O’Reilly UM. STRATA: Simple, gradient-free attacks for models of code. arXiv:2009.13562, 2021.
[35] Li J, Li Z, Zhang HZ, Li G, Jin Z, Hu X, Xia X. Poison attack and poison detection on deep source code processing models. ACM Trans. on Software Engineering and Methodology, 2024, 33(3): 62. [doi: 10.1145/3630008]
[36] Qi SY, Yang YH, Gao S, Gao CY, Xu ZL. BadCS: A backdoor attack framework for code search. arXiv:2305.05503, 2023.
[37] Cotroneo D, Improta C, Liguori P, Natella R. Vulnerabilities in AI code generators: Exploring targeted data poisoning attacks. arXiv:2308.04451, 2024.
[38] Yang Z, Xu BW, Zhang JM, Kang HJ, Shi JK, He JD, Lo D. Stealthy backdoor attack for code models. arXiv:2301.02496, 2023.
[39] Nguyen TD, Zhou Y, Le XBD, Thongtanunam P, Lo D. Adversarial attacks on code models with discriminative graph patterns. arXiv:2308.11161, 2023.
[40] Zhang J, Ma W, Hu Q, Liu SQ, Xie XF, Traon YL, Liu Y. A black-box attack on code models via representation nearest neighbor search. arXiv:2305.05896, 2023.
[41] Improta C, Liguori P, Natella R, Cukic B, Cotroneo D. Enhancing robustness of AI offensive code generators via data augmentation. arXiv:2306.05079, 2023.
[42] Sun WS, Fang CR, Ge YF, Hu YL, Chen YC, Zhang QJ, Ge XT, Liu Y, Chen ZY. A survey of source code search: A 3-dimensional perspective. arXiv:2311.07107, 2023.
[43] Mikolov T, Karafiát M, Burget L, Cernocký J, Khudanpur S. Recurrent neural network based language model. In: Proc. of the 11th Annual Conf. of the Int’l Speech Communication Association. Makuhari: ISCA, 2010. 1045–1048. [doi: 10.21437/Interspeech.2010-343]