Page 97 - Journal of Software (《软件学报》), 2026, No. 1

94                                                         Journal of Software (软件学报), 2026, Vol. 37, No. 1


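                 adaptability to dynamic environments, well-designed and highly effective test case generation methods are essential
                 for system quality assurance. Test case generation for intelligent software systems therefore not only has strong
                 prospects for engineering application but also shows great potential for theoretical research, and it is expected to
                 keep driving new research hotspots in software engineering and related fields.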
