Page 47 - Journal of Software (《软件学报》), 2025, Issue 5

Dong LM, et al.: A software traceability recovery framework combining active learning and semi-supervised learning 1947

[44] Dong LM, Zhang H, Liu W, Weng ZL, Kuang HY. Semi-supervised pre-processing for learning-based traceability framework on real-world
software projects. In: Proc. of the 30th ACM Joint European Software Engineering Conf. and Symp. on the Foundations of
Software Engineering. Singapore: ACM, 2022. 570–582. [doi: 10.1145/3540250.3549151]
[45] Le TDB, Linares-Vasquez M, Lo D, Poshyvanyk D. RCLinker: Automated linking of issue reports and commits leveraging rich
contextual information. In: Proc. of the 23rd IEEE Int’l Conf. on Program Comprehension. Florence: IEEE, 2015. 36–47.
[doi: 10.1109/icpc.2015.13]
[46] Cavnar WB. Using an n-gram-based document representation with a vector processing retrieval model. In: Harman DK, ed. Proc. of the
3rd Text Retrieval Conf. (TREC-3). Gaithersburg: National Institute of Standards and Technology, 1994. 269–278.
[47] Gethers M, Oliveto R, Poshyvanyk D, De Lucia A. On integrating orthogonal information retrieval methods to improve traceability
recovery. In: Proc. of the 27th IEEE Int’l Conf. on Software Maintenance. Williamsburg: IEEE, 2011. 133–142.
[doi: 10.1109/icsm.2011.6080780]
[48] Chen BH, Chen LL, Zhang C, Peng X. BuildFast: History-aware build outcome prediction for fast feedback and reduced cost in
continuous integration. In: Proc. of the 35th IEEE/ACM Int’l Conf. on Automated Software Engineering. Melbourne: ACM, 2020. 42–53.
[doi: 10.1145/3324884.3416616]
[49] Sun Y, Wang Q, Yang Y. FRLink: Improving the recovery of missing issue-commit links by revisiting file relevance. Information and
Software Technology, 2017, 84: 33–47. [doi: 10.1016/j.infsof.2016.11.010]
[50] Sohn K, Berthelot D, Li CL, Zhang ZZ, Carlini N, Cubuk ED, Kurakin A, Zhang H, Raffel C. FixMatch: Simplifying semi-supervised
learning with consistency and confidence. In: Proc. of the 34th Int’l Conf. on Neural Information Processing Systems. Vancouver: Curran
Associates Inc., 2020. 596–608.
[51] Zou Y, Yu ZD, Kumar BVK, Wang JS. Unsupervised domain adaptation for semantic segmentation via class-balanced self-training. In:
Proc. of the 15th European Conf. on Computer Vision. Munich: Springer, 2018. 297–313. [doi: 10.1007/978-3-030-01219-9_18]
[52] Zou Y, Yu ZD, Liu XF, Kumar BVKV, Wang JS. Confidence regularized self-training. In: Proc. of the 2019 IEEE/CVF Int’l Conf. on
Computer Vision. Seoul: IEEE, 2019. 5981–5990. [doi: 10.1109/iccv.2019.00608]
[53] Wei C, Sohn K, Mellina C, Yuille A, Yang F. CReST: A class-rebalancing self-training framework for imbalanced semi-supervised
learning. In: Proc. of the 2021 IEEE/CVF Conf. on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021. 10852–10861.
[doi: 10.1109/cvpr46437.2021.01071]
[54] Chen H, Fan Y, Wang YD, Wang JD, Schiele B, Xie X, Savvides M, Raj B. An embarrassingly simple baseline for imbalanced
semi-supervised learning. arXiv:2211.11086, 2022.
[55] Xu Y, Shang L, Ye JX, Qian Q, Li YF, Sun BG, Li H, Jin R. Dash: Semi-supervised learning with dynamic thresholding. In: Proc. of the
38th Int’l Conf. on Machine Learning. ICML, 2021. 11525–11536.
[56] Zhang BW, Wang YD, Hou WX, Wu H, Wang JD, Okumura M, Shinozaki T. FlexMatch: Boosting semi-supervised learning with
curriculum pseudo labeling. In: Proc. of the 35th Annual Conf. on Neural Information Processing Systems. 2021. 18408–18419.
[57] Mills C, Escobar-Avila J, Bhattacharya A, Kondyukov G, Chakraborty S, Haiduc S. Tracing with less data: Active learning for
classification-based traceability link recovery. In: Proc. of the 2019 IEEE Int’l Conf. on Software Maintenance and Evolution. Cleveland:
IEEE, 2019. 103–113. [doi: 10.1109/icsme.2019.00020]
[58] Du TB, Shen GH, Huang ZQ, Yu YS, Wu DX. Automatic traceability link recovery via active learning. Frontiers of Information
Technology & Electronic Engineering, 2020, 21(8): 1217–1225. [doi: 10.1631/fitee.1900222]
[59] Tharwat A, Schenck W. A survey on active learning: State-of-the-art, practical challenges and research directions. Mathematics, 2023,
11(4): 820. [doi: 10.3390/math11040820]
[60] Prenner JA, Robbes R. Making the most of small software engineering datasets with modern machine learning. IEEE Trans. on Software
Engineering, 2022, 48(12): 5050–5067. [doi: 10.1109/tse.2021.3135465]
[61] Lewis DD, Catlett J. Heterogeneous uncertainty sampling for supervised learning. In: Cohen WW, Hirsh H, eds. Machine Learning: Proc.
of the 11th Int’l Conf. New Brunswick: Elsevier, 1994. 148–156. [doi: 10.1016/b978-1-55860-335-6.50026-x]
[62] Scheffer T, Decomain C, Wrobel S. Active hidden Markov models for information extraction. In: Proc. of the 4th Int’l Symp. on
Intelligent Data Analysis. Cascais: Springer, 2001. 309–318. [doi: 10.1007/3-540-44816-0_31]
[63] Kothawade S, Reddy PK, Ramakrishnan G, Iyer R. BASIL: Balanced active semi-supervised learning for class imbalanced datasets.
arXiv:2203.05651, 2022.
[64] Kothawade S, Ghosh S, Shekhar S, Xiang Y, Iyer R. Talisman: Targeted active learning for object detection with rare classes and slices
using submodular mutual information. In: Proc. of the 17th European Conf. on Computer Vision. Tel Aviv: Springer, 2022. 1–16.
[doi: 10.1007/978-3-031-19839-7_1]
