Page 473 - Journal of Software (Ruan Jian Xue Bao), 2025, Issue 9

4384 Journal of Software, 2025, Vol. 36, No. 9

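such as the inability to segment transparent objects well: as shown in the first row of Fig. 9, the proposed method does not accurately delineate the bottle in the middle. The method also falls short on small objects: as shown in the second row of Fig. 9, it fails to recognize the cat in the image.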

Fig. 9 Segmentation results on the PASCAL VOC 2012 validation set

4 Conclusion

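To address the problem that initial class activation maps (CAMs) cover too little of the foreground and contain too much background noise, this paper builds a ViT-based joint optimization framework for CAMs. First, to handle the errors in the raw class-to-patch attention produced by ViT, a semantic modulation strategy is designed that corrects this attention using the semantic context information carried by the patch-to-patch attention, improving its accuracy. Then, the corrected class-to-patch attention and the patch-to-patch attention are used together to jointly optimize the initial CAM. The resulting CAM accurately covers the target regions while suppressing background noise well. A series of comparative experiments demonstrates the superiority and effectiveness of the proposed method.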
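
The two-stage pipeline summarized above can be illustrated with a small sketch. The concrete formulas are not given in this section, so everything below is an assumption made for illustration only: the function names (`row_normalize`, `propagate`, `min_max`, `refine_cam`) are hypothetical, semantic modulation is approximated as averaging the class-to-patch attention over patch-to-patch affinities, and joint optimization is approximated as weighting the CAM by the modulated attention and then propagating it along the same affinities. It should not be read as the method's actual implementation.

```python
def row_normalize(matrix):
    """Scale each row to sum to 1 (rows that sum to 0 become all zeros)."""
    out = []
    for row in matrix:
        total = sum(row)
        out.append([v / total if total > 0 else 0.0 for v in row])
    return out


def propagate(affinity, values):
    """Replace each patch value by its affinity-weighted average over all patches."""
    return [sum(w * v for w, v in zip(row, values)) for row in affinity]


def min_max(values):
    """Rescale values to [0, 1]; a constant vector maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def refine_cam(initial_cam, class_to_patch, patch_to_patch):
    """Hypothetical joint refinement of a flattened CAM over N patches."""
    affinity = row_normalize(patch_to_patch)
    # Step 1 (assumed form of semantic modulation): smooth the class-to-patch
    # attention with the context carried by the patch-to-patch attention.
    modulated = min_max(propagate(affinity, class_to_patch))
    # Step 2 (assumed form of joint optimization): weight the CAM by the
    # modulated attention, then spread activations along patch affinities.
    weighted = [c * a for c, a in zip(initial_cam, modulated)]
    return min_max(propagate(affinity, weighted))


# Toy input: patches 0 and 1 belong to the object, patches 2 and 3 to the background.
patch_to_patch = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]
initial_cam = [1.0, 0.2, 0.3, 0.0]     # under-covers patch 1, noisy on patch 2
class_to_patch = [0.9, 0.1, 0.4, 0.0]  # raw attention with the same kinds of error
refined = refine_cam(initial_cam, class_to_patch, patch_to_patch)
```

On this toy input, both object patches end up with activation 1 and both background patches with activation 0, which mirrors the qualitative goal stated above: fuller foreground coverage with less background noise.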

