[17] Szegedy C, Zaremba W, Sutskever I, Bruna J, Erhan D, Goodfellow I, Fergus R. Intriguing properties of neural networks. arXiv:1312.6199, 2014.
[18] Goodfellow IJ, Shlens J, Szegedy C. Explaining and harnessing adversarial examples. arXiv:1412.6572, 2015.
[19] Xu H, Ma Y, Liu HC, Deb D, Liu H, Tang JL, Jain AK. Adversarial attacks and defenses in images, graphs and text: A review. Int’l
Journal of Automation and Computing, 2020, 17(2): 151–178. [doi: 10.1007/s11633-019-1211-x]
[20] Madry A, Makelov A, Schmidt L, Tsipras D, Vladu A. Towards deep learning models resistant to adversarial attacks. arXiv:1706.06083,
2019.
[21] Zhang HY, Yu YD, Jiao JT, Xing E, El Ghaoui L, Jordan MI. Theoretically principled trade-off between robustness and accuracy. In:
Proc. of the 36th Int’l Conf. on Machine Learning. Long Beach: PMLR, 2019. 7472–7482.
[22] Tsipras D, Santurkar S, Engstrom L, Turner A, Madry A. Robustness may be at odds with accuracy. arXiv:1805.12152, 2019.
[23] Bai Y, Feng Y, Wang YS, Dai T, Xia ST, Jiang Y. Hilbert-based generative defense for adversarial examples. In: Proc. of the 2019
IEEE/CVF Int’l Conf. on Computer Vision (ICCV). Seoul: IEEE, 2019. 4783–4792. [doi: 10.1109/ICCV.2019.00488]
[24] Ma XJ, Li B, Wang YS, Erfani SM, Wijewickrema S, Schoenebeck G, Song D, Houle ME, Bailey J. Characterizing adversarial subspaces
using local intrinsic dimensionality. arXiv:1801.02613, 2018.
[25] Xu WL, Evans D, Qi YJ. Feature squeezing: Detecting adversarial examples in deep neural networks. arXiv:1704.01155, 2017.
[26] Grosse K, Manoharan P, Papernot N, Backes M, McDaniel P. On the (statistical) detection of adversarial examples. arXiv:1702.06280,
2017.
[27] Carlini N, Wagner D. Adversarial examples are not easily detected: Bypassing ten detection methods. In: Proc. of the 10th ACM
Workshop on Artificial Intelligence and Security. Dallas: ACM, 2017. 3–14. [doi: 10.1145/3128572.3140444]
[28] Croce F, Hein M. Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks. In: Proc. of the 37th
Int’l Conf. on Machine Learning. Virtual Event: JMLR.org, 2020. 2206–2216.
[29] Tramèr F, Kurakin A, Papernot N, Goodfellow I, Boneh D, McDaniel P. Ensemble adversarial training: Attacks and defenses.
arXiv:1705.07204, 2020.
[30] Moosavi-Dezfooli SM, Fawzi A, Fawzi O, Frossard P. Universal adversarial perturbations. In: Proc. of the 2017 IEEE Conf. on Computer
Vision and Pattern Recognition. Honolulu: IEEE, 2017. 86–94. [doi: 10.1109/CVPR.2017.17]
[31] Moosavi-Dezfooli SM, Fawzi A, Frossard P. DeepFool: A simple and accurate method to fool deep neural networks. In: Proc. of the 2016
IEEE Conf. on Computer Vision and Pattern Recognition (CVPR). Las Vegas: IEEE, 2016. 2574–2582. [doi: 10.1109/CVPR.2016.282]
[32] Kurakin A, Goodfellow I, Bengio S. Adversarial examples in the physical world. arXiv:1607.02533, 2017.
[33] Papernot N, McDaniel P, Goodfellow I, Jha S, Celik ZB, Swami A. Practical black-box attacks against machine learning. In: Proc. of the
2017 ACM on Asia Conf. on Computer and Communications Security. Abu Dhabi: ACM, 2017. 506–519. [doi: 10.1145/3052973.3053009]
[34] Papernot N, McDaniel P, Jha S, Fredrikson M, Celik ZB, Swami A. The limitations of deep learning in adversarial settings. In: Proc. of
the 2016 IEEE European Symp. on Security and Privacy (EuroS&P). Saarbruecken: IEEE, 2016. 372–387. [doi: 10.1109/EuroSP.2016.36]
[35] Pan WW, Wang XY, Song ML, Chen C. Survey on generating adversarial examples. Ruan Jian Xue Bao/Journal of Software, 2020,
31(1): 67–81 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/5884.htm [doi: 10.13328/j.cnki.jos.005884]
[36] Tramèr F, Carlini N, Brendel W, Madry A. On adaptive attacks to adversarial example defenses. In: Proc. of the 34th Int’l Conf. on
Neural Information Processing Systems. Vancouver: Curran Associates Inc., 2020. 1633–1645.
[37] Kurakin A, Goodfellow I, Bengio S. Adversarial machine learning at scale. arXiv:1611.01236, 2017.
[38] Athalye A, Carlini N, Wagner D. Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. In:
Proc. of the 35th Int’l Conf. on Machine Learning. Stockholm: PMLR, 2018. 274–283.
[39] Papernot N, McDaniel P, Wu X, Jha S, Swami A. Distillation as a defense to adversarial perturbations against deep neural networks. In:
Proc. of the 2016 IEEE Symp. on Security and Privacy (SP). San Jose: IEEE, 2016. 582–597. [doi: 10.1109/SP.2016.41]
[40] Carlini N, Wagner D. Towards evaluating the robustness of neural networks. In: Proc. of the 2017 IEEE Symp. on Security and Privacy
(SP). San Jose: IEEE, 2017. 39–57. [doi: 10.1109/SP.2017.49]
[41] Gowal S, Uesato J, Qin CL, Huang PS, Mann T, Kohli P. An alternative surrogate loss for PGD-based adversarial testing.
arXiv:1910.09338, 2019.
[42] Gowal S, Qin CL, Uesato J, Mann T, Kohli P. Uncovering the limits of adversarial training against norm-bounded adversarial examples.
arXiv:2010.03593, 2021.