Page 188 - Journal of Software (Ruan Jian Xue Bao), 2025, Issue 10

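Li Zhiqiang et al.: Impact of SZZ-mislabeled changes on the performance and interpretation of just-in-time defect prediction for mobile apps                                    4585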


                 on the one hand, the number of code commits is large, which ensures that the experiments have enough instances.

                  6   Conclusion and Future Work

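                    This paper selected 17 large open-source mobile app projects from GitHub, extracted the 14 change metrics proposed by Kamei et al. [9], and labeled the data with the B-SZZ, AG-SZZ, MA-SZZ, and RA-SZZ algorithms, constructing 17 mobile app datasets. To investigate how the changes mislabeled by different SZZ algorithms affect the performance and interpretation of just-in-time defect prediction for mobile apps, we built just-in-time defect prediction models on the datasets labeled by the four SZZ algorithms using random forest, naive Bayes, and logistic regression classifiers, evaluated them with five indicators (AUC, MCC, G-mean, F-measure@20%, and IFA), and used the SKESD and SHAP algorithms to rank, compare, and explain the results. In terms of model performance, the B-SZZ, AG-SZZ, and MA-SZZ algorithms degrade the performance of just-in-time defect prediction models to varying degrees. In terms of model interpretation, the B-SZZ, AG-SZZ, and MA-SZZ algorithms affect the top three most important metrics in the model's prediction process.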
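
                    As an illustration of three of the indicators named above, the following minimal Python sketch computes MCC, G-mean, and IFA for binary change-level labels. It is not the authors' implementation. The function names are hypothetical, G-mean is assumed to be the geometric mean of recall and specificity, and IFA is assumed to count the clean changes inspected before the first defective one when changes are ranked by predicted score alone. AUC and F-measure@20% are omitted.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 = defect-inducing."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def mcc(y_true, y_pred):
    """Matthews correlation coefficient; 0.0 when the denominator is zero."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def g_mean(y_true, y_pred):
    """Geometric mean of recall and specificity (assumed definition)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(recall * specificity)

def ifa(y_true, scores):
    """Initial false alarms: clean changes inspected before the first
    defective change, ranking by descending predicted score (assumption)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    false_alarms = 0
    for i in order:
        if y_true[i] == 1:
            return false_alarms
        false_alarms += 1
    return false_alarms  # no defective change exists in the ranking

# Toy data, purely illustrative
y_true = [1, 0, 1, 0]
y_pred = [1, 0, 0, 0]
scores = [0.2, 0.9, 0.8, 0.1]
print(mcc(y_true, y_pred), g_mean(y_true, y_pred), ifa(y_true, scores))
```

                    Higher MCC and G-mean indicate better classification, whereas a lower IFA means a developer meets fewer false alarms before finding the first real defect.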
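                    For future just-in-time defect prediction research, we recommend labeling data with the RA-SZZ algorithm when constructing defect prediction datasets. In addition, the prediction granularity in this paper is the change level, which is still coarse compared with individual lines of code. Follow-up work will therefore study line-level just-in-time defect prediction for mobile apps, which would substantially narrow the range of code that developers must review manually and help further improve software development efficiency.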

                 References:
                  [1]   Chen X, Gu Q, Liu WS, Liu SL, Ni C. Survey of static software defect prediction. Ruan Jian Xue Bao/Journal of Software, 2016, 27(1):
                     1–25 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/4923.htm [doi: 10.13328/j.cnki.jos.004923]
                  [2]   Gong LN, Jiang SJ, Jiang L. Research progress of software defect prediction. Ruan Jian Xue Bao/Journal of Software, 2019, 30(10):
                     3090–3114 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/5790.htm [doi: 10.13328/j.cnki.jos.005790]
                  [3]   Cai L, Fan YR, Yan M, Xia X. Just-in-time software defect prediction: Literature review. Ruan Jian Xue Bao/Journal of Software, 2019,
                     30(5): 1288–1307 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/5713.htm [doi: 10.13328/j.cnki.jos.005713]
                  [4]   Li ZQ, Jing XY, Zhu XK. Progress on approaches to software defect prediction. IET Software, 2018, 12(3): 161–175.
                     [doi: 10.1049/iet-sen.2017.0148]
                  [5]   Fan YR, Xia X, Da Costa DA, Lo D, Hassan AE, Li SP. The impact of mislabeled changes by SZZ on just-in-time defect prediction.
                     IEEE Trans. on Software Engineering, 2021, 47(8): 1559–1586. [doi: 10.1109/TSE.2019.2929761]
                  [6]   Kamei Y, Fukushima T, McIntosh S, Yamashita K, Ubayashi N, Hassan AE. Studying just-in-time defect prediction using cross-project
                     models. Empirical Software Engineering, 2016, 21(5): 2072–2106. [doi: 10.1007/s10664-015-9400-x]
                  [7]   Ge J, Yu HQ, Fan GS, Tang JH, Huang ZJ. Just-in-time defect prediction for intelligent computing frameworks. Ruan Jian Xue
                     Bao/Journal of Software, 2023, 34(9): 3966–3980 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/6874.htm
                     [doi: 10.13328/j.cnki.jos.006874]
                  [8]   Jiang T, Tan L, Kim S. Personalized defect prediction. In: Proc. of the 28th IEEE/ACM Int’l Conf. on Automated Software Engineering
                     (ASE). Silicon Valley: IEEE, 2013. 279–289. [doi: 10.1109/ASE.2013.6693087]
                  [9]   Kamei Y, Shihab E, Adams B, Hassan AE, Mockus A, Sinha A, Ubayashi N. A large-scale empirical study of just-in-time quality
                     assurance. IEEE Trans. on Software Engineering, 2013, 39(6): 757–773. [doi: 10.1109/TSE.2012.70]
                 [10]   Li ZQ, Zhang HY, Jing XY, Xie JY, Guo M, Ren J. DSSDPP: Data selection and sampling based domain programming predictor for
                     cross-project defect prediction. IEEE Trans. on Software Engineering, 2023, 49(4): 1941–1963. [doi: 10.1109/TSE.2022.3204589]
                 [11]   Xia X, Lo D, Pan SJ, Nagappan N, Wang XY. HYDRA: Massively compositional model for cross-project defect prediction. IEEE Trans.
                     on Software Engineering, 2016, 42(10): 977–998. [doi: 10.1109/TSE.2016.2543218]
                 [12]   Śliwerski J, Zimmermann T, Zeller A. When do changes induce fixes? ACM SIGSOFT Software Engineering Notes, 2005, 30(4): 1–5.
                     [doi: 10.1145/1082983.1083147]
                 [13]   Kim S, Zimmermann T, Pan K, Whitehead EJ Jr. Automatic identification of bug-introducing changes. In: Proc. of the 21st IEEE/ACM Int’l
                     Conf. on Automated Software Engineering (ASE 2006). Tokyo: IEEE, 2006. 81–90. [doi: 10.1109/ASE.2006.23]
                 [14]   Da Costa DA, McIntosh S, Shang WY, Kulesza U, Coelho R, Hassan AE. A framework for evaluating the results of the SZZ approach for
                     identifying bug-introducing changes. IEEE Trans. on Software Engineering, 2017, 43(7): 641–657. [doi: 10.1109/TSE.2016.2616306]
                 [15]   Neto EC, Da Costa DA, Kulesza U. The impact of refactoring changes on the SZZ algorithm: An empirical study. In: Proc. of the 25th
                     IEEE Int’l Conf. on Software Analysis, Evolution and Reengineering (SANER). Campobasso: IEEE, 2018. 380–390.
                     [doi: 10.1109/SANER.2018.8330225]
                 [16]   Tantithamthavorn C, Hassan AE, Matsumoto K. The impact of class rebalancing techniques on the performance and interpretation of