
Zhang YT, et al.: Robustness evaluation of ChatGPT under Chinese adversarial attacks                                           4733


 [8] Alzantot M, Sharma Y, Elgohary A, Ho BJ, Srivastava M, Chang KW. Generating natural language adversarial examples. In: Proc. of the 2018 Conf. on Empirical Methods in Natural Language Processing. Brussels: Association for Computational Linguistics, 2018. 2890–2896. [doi: 10.18653/v1/D18-1316]
 [9] Zang Y, Qi FC, Yang CH, Liu ZY, Zhang M, Liu Q, Sun MS. Word-level textual adversarial attacking as combinatorial optimization. In: Proc. of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 2020. 6066–6080. [doi: 10.18653/v1/2020.acl-main.540]
[10] Jin D, Jin ZJ, Zhou JT, Szolovits P. Is BERT really robust? A strong baseline for natural language attack on text classification and entailment. In: Proc. of the 34th AAAI Conf. on Artificial Intelligence. New York: AAAI Press, 2020. 8018–8025. [doi: 10.1609/aaai.v34i05.6311]
[11] Li LY, Ma RT, Guo QP, Xue XY, Qiu XP. BERT-ATTACK: Adversarial attack against BERT using BERT. In: Proc. of the 2020 Conf. on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2020. 6193–6202. [doi: 10.18653/v1/2020.emnlp-main.500]
[12] Ren SH, Deng YH, He K, Che WX. Generating natural language adversarial examples through probability weighted word saliency. In: Proc. of the 57th Annual Meeting of the Association for Computational Linguistics. Florence: Association for Computational Linguistics, 2019. 1085–1097. [doi: 10.18653/v1/P19-1103]
[13] Xu JC, Du QF. Adversarial attacks on text classification models using layer-wise relevance propagation. Int’l Journal of Intelligent Systems, 2020, 35(9): 1397–1415. [doi: 10.1002/int.22260]
[14] Li JF, Ji SL, Du TY, Li B, Wang T. TextBugger: Generating adversarial text against real-world applications. In: Proc. of the 26th Annual Network and Distributed System Security Symp. San Diego: The Internet Society, 2019. [doi: 10.14722/ndss.2019.23138]
[15] Garg S, Ramakrishnan G. BAE: BERT-based adversarial examples for text classification. In: Proc. of the 2020 Conf. on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2020. 6174–6181. [doi: 10.18653/v1/2020.emnlp-main.498]
[16] Li DQ, Zhang YZ, Peng H, Chen LQ, Brockett C, Sun MT, Dolan B. Contextualized perturbation for textual adversarial attack. In: Proc. of the 2021 Conf. of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, 2021. 5053–5069. [doi: 10.18653/v1/2021.naacl-main.400]
[17] Zhang ZH, Liu MX, Zhang C, Zhang YM, Li Z, Li Q, Duan HX, Sun DH. Argot: Generating adversarial readable Chinese texts. In: Proc. of the 29th Int’l Joint Conf. on Artificial Intelligence. Yokohama, 2020. 2533–2539. [doi: 10.24963/ijcai.2020/351]
[18] Cheng N, Chang GQ, Gao HC, Pei G, Zhang Y. WordChange: Adversarial examples generation approach for Chinese text classification. IEEE Access, 2020, 8: 79561–79572. [doi: 10.1109/ACCESS.2020.2988786]
[19] Tong X, Wang LN, Wang RZ, Wang JY. A generation method of word-level adversarial samples for Chinese text classification. Netinfo Security, 2020, 20(9): 12–16 (in Chinese with English abstract). [doi: 10.3969/j.issn.1671-1122.2020.09.003]
[20] Ou HX, Yu L, Tian SW, Chen X. Chinese adversarial examples generation approach with multi-strategy based on semantic. Knowledge and Information Systems, 2022, 64(4): 1101–1119. [doi: 10.1007/s10115-022-01652-1]
[21] Zhang YT, Ye L, Tang HL, Zhang HL, Li S. Chinese BERT attack method based on masked language model. Ruan Jian Xue Bao/Journal of Software, 2024, 35(7): 3392–3409 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/6932.htm [doi: 10.13328/j.cnki.jos.006932]
[22] He XL, Lyu LJ, Sun LC, Xu QK. Model extraction and adversarial transferability, your BERT is vulnerable! In: Proc. of the 2021 Conf. of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, 2021. 2006–2012. [doi: 10.18653/v1/2021.naacl-main.161]
[23] Ebrahimi J, Rao AY, Lowd D, Dou DJ. HotFlip: White-box adversarial examples for text classification. In: Proc. of the 56th Annual Meeting of the Association for Computational Linguistics, Vol. 2: Short Papers. Melbourne: Association for Computational Linguistics, 2018. 31–36. [doi: 10.18653/v1/P18-2006]
[24] Shi YC, Han YH. Metric system and its completeness of adversarial robustness evaluation. Ruan Jian Xue Bao/Journal of Software, 2025, 36(3): 1304–1326 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/7172.htm [doi: 10.13328/j.cnki.jos.007172]
[25] Yoo JY, Morris JX, Lifland E, Qi YJ. Searching for a search method: Benchmarking search algorithms for generating NLP adversarial examples. In: Proc. of the 3rd BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP. Association for Computational Linguistics, 2020. 323–332. [doi: 10.18653/v1/2020.blackboxnlp-1.30]
[26] Devlin J, Chang MW, Lee K, Toutanova K. BERT: Pre-training of deep bidirectional Transformers for language understanding. In: Proc. of the 2019 Conf. of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Vol. 1: Long and Short Papers. Minneapolis: Association for Computational Linguistics, 2019. 4171–4186. [doi: 10.18653/v1/N19-1423]