Proc. of the 30th Annual Network and Distributed System Security Symp. San Diego: Internet Society, 2023. [doi: 10.14722/ndss.2023.23069]
[104] Gao YS, Xu CG, Wang DR, Chen SP, Ranasinghe DC, Nepal S. STRIP: A defence against Trojan attacks on deep neural networks. In: Proc. of the 35th Annual Computer Security Applications Conf. San Juan: ACM, 2019. 113–125. [doi: 10.1145/3359789.3359790]
[105] Chou E, Tramer F, Pellegrino G. SentiNet: Detecting localized universal attacks against deep learning systems. In: Proc. of the 2020 IEEE Security and Privacy Workshops. San Francisco: IEEE, 2020. 48–54. [doi: 10.1109/SPW50608.2020.00025]
[106] Doan BG, Abbasnejad E, Ranasinghe DC. Februus: Input purification defense against Trojan attacks on deep neural network systems. In: Proc. of the 36th Annual Computer Security Applications Conf. Austin: ACM, 2020. 897–912. [doi: 10.1145/3427228.3427264]
[107] Qi XY, Xie TH, Wang JT, Wu T, Mahloujifar S, Mittal P. Towards a proactive ML approach for detecting backdoor poison samples. In: Proc. of the 32nd USENIX Security Symp. Anaheim: USENIX Association, 2023. 1685–1702.
[108] Li YG, Lyu XX, Koren N, Lyu LJ, Li B, Ma XJ. Anti-backdoor learning: Training clean models on poisoned data. In: Proc. of the 35th Conf. on Neural Information Processing Systems. Curran Associates Inc., 2021. 14900–14912.
[109] Huang KZ, Li YM, Wu BY, Qin Z, Ren K. Backdoor defense via decoupling the training process. arXiv:2202.03423, 2022.
[110] Gao KF, Bai Y, Gu JD, Yang Y, Xia ST. Backdoor defense via adaptively splitting poisoned dataset. In: Proc. of the 2023 IEEE/CVF Conf. on Computer Vision and Pattern Recognition. Vancouver: IEEE, 2023. 4005–4014. [doi: 10.1109/CVPR52729.2023.00390]
[111] Zhang ZX, Liu Q, Wang ZC, Lu ZP, Hu QY. Backdoor defense via deconfounded representation learning. In: Proc. of the 2023 IEEE/CVF Conf. on Computer Vision and Pattern Recognition. Vancouver: IEEE, 2023. 12228–12238. [doi: 10.1109/CVPR52729.2023.01177]
[112] Qi FC, Chen YY, Li MK, Yao Y, Liu ZY, Sun MS. ONION: A simple and effective defense against textual backdoor attacks. In: Proc. of the 2021 Conf. on Empirical Methods in Natural Language Processing. ACL, 2021. 9558–9566. [doi: 10.18653/v1/2021.emnlp-main.752]
[113] Chen CS, Dai JZ. Mitigating backdoor attacks in LSTM-based text classification systems by backdoor keyword identification. Neurocomputing, 2021, 452: 253–262. [doi: 10.1016/j.neucom.2021.04.105]
[114] Yang WK, Lin YK, Li P, Zhou J, Sun X. RAP: Robustness-aware perturbations for defending against backdoor attacks on NLP models. In: Proc. of the 2021 Conf. on Empirical Methods in Natural Language Processing. ACL, 2021. 8365–8381. [doi: 10.18653/v1/2021.emnlp-main.659]
[115] Sabir B, Babar MA, Abuadbba S. Interpretability and transparency-driven detection and transformation of textual adversarial examples (IT-DT). arXiv:2307.01223, 2023.
[116] Pei HZ, Jia JY, Guo WB, Li B, Song D. TextGuard: Provable defense against backdoor attacks on text classification. In: Proc. of the 31st Annual Network and Distributed System Security Symp. San Diego: Internet Society, 2024. [doi: 10.14722/ndss.2024.24090]
[117] Wang BL, Yao YS, Shan S, Li HY, Viswanath B, Zheng HT, Zhao BY. Neural Cleanse: Identifying and mitigating backdoor attacks in neural networks. In: Proc. of the 2019 IEEE Symp. on Security and Privacy. San Francisco: IEEE, 2019. 707–723. [doi: 10.1109/SP.2019.00031]
[118] Guo WB, Wang L, Xing XY, Du M, Song D. TABOR: A highly accurate approach to inspecting and restoring Trojan backdoors in AI systems. arXiv:1908.01763, 2019.
[119] Wang R, Zhang GY, Liu SJ, Chen PY, Xiong JJ, Wang M. Practical detection of Trojan neural networks: Data-limited and data-free cases. In: Proc. of the 16th European Conf. on Computer Vision (ECCV). Glasgow: Springer, 2020. 222–238. [doi: 10.1007/978-3-030-58592-1_14]
[120] Liu YQ, Lee WC, Tao GH, Ma SQ, Aafer Y, Zhang XY. ABS: Scanning neural networks for back-doors by artificial brain stimulation. In: Proc. of the 2019 ACM SIGSAC Conf. on Computer and Communications Security. London: ACM, 2019. 1265–1282. [doi: 10.1145/3319535.3363216]
[121] Liu K, Dolan-Gavitt B, Garg S. Fine-pruning: Defending against backdooring attacks on deep neural networks. In: Proc. of the 21st Int’l Symp. on Research in Attacks, Intrusions, and Defenses. Heraklion: Springer, 2018. 273–294. [doi: 10.1007/978-3-030-00470-5_13]
[122] Wu DX, Wang YS. Adversarial neuron pruning purifies backdoored deep models. In: Proc. of the 35th Int’l Conf. on Neural Information Processing Systems. Curran Associates Inc., 2021. 16913–16925.
[123] Hong S, Chandrasekaran V, Kaya Y, Dumitraş T, Papernot N. On the effectiveness of mitigating data poisoning attacks with gradient shaping. arXiv:2002.11497, 2020.
[124] Du M, Jia RX, Song D. Robust anomaly detection and backdoor attack detection via differential privacy. arXiv:1911.07116, 2019.
[125] Azizi A, Tahmid IA, Waheed A, Mangaokar N, Pu JM, Javed M, Reddy CK, Viswanath B. T-Miner: A generative approach to defend against Trojan attacks on DNN-based text classification. In: Proc. of the 30th USENIX Security Symp. USENIX Association, 2021. 2255–2272.
[126] Shen GY, Liu YQ, Tao GH, Xu QL, Zhang Z, An SW, Ma SQ, Zhang XY. Constrained optimization with dynamic bound-scaling for

