Journal of Software (软件学报), 2020, Issue 11

Journal of Software (软件学报) ISSN 1000-9825, CODEN RUXUEW E-mail: jos@iscas.ac.cn
Journal of Software, 2020, 31(11): 3640−3656 [doi: 10.13328/j.cnki.jos.005828] http://www.jos.org.cn
© Institute of Software, Chinese Academy of Sciences. All rights reserved. Tel: +86-10-62562563

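Weakly Supervised Image Semantic Segmentation Method Based on Object Location Cues∗

LI Yang(1,2), LIU Yang(2), LIU Guo-Jun(2), GUO Mao-Zu(1,2,3)
1 (School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 100044, China)
2 (School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China)
3 (Beijing Key Laboratory of Intelligent Processing for Building Big Data (Beijing University of Civil Engineering and Architecture), Beijing 100044, China)
Corresponding authors: LIU Yang, E-mail: yliu76@hit.edu.cn; GUO Mao-Zu, E-mail: guomaozu@bucea.edu.cn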

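Abstract: Deep convolutional neural networks trained with pixel-level annotations have achieved excellent performance on
image semantic segmentation. However, obtaining pixel-level annotations is time-consuming and costly. To address this
problem, a weakly supervised image semantic segmentation method based on image-level annotations is proposed. The method
aims to derive effective pseudo pixel-level annotations from image-level annotations and to use them to optimize the
parameters of the segmentation network. It consists of three steps: (1) based on a network structure shared by
classification and segmentation, class-specific attention maps are obtained by differentiating the spatial class scores
(the class scores of pixels in the two-dimensional image space) with respect to the network feature layers; (2) saliency
maps are generated by a successive erasing method and used to supplement the object location information missing from the
attention maps; (3) the attention maps and saliency maps are fused to generate pseudo pixel-level annotations, which are
used to train the segmentation network. A series of comparative experiments on the PASCAL VOC 2012 segmentation dataset
demonstrates the effectiveness of the method and its strong segmentation performance.
Key words: image semantic segmentation; weak supervision; deep convolutional neural network; attention map; saliency map
CLC number: TP391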
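
Step (3) of the abstract can be illustrated with a minimal sketch. The abstract does not give the fusion rule, so everything below is an assumption made for illustration only: the function name `fuse_pseudo_labels`, the thresholds `fg_thresh` and `bg_thresh`, and the rule itself (confident attention wins, low-saliency pixels become background, salient pixels without attention take the image's class only when the image has a single class). The value 255 is the label conventionally ignored by losses on PASCAL VOC.

```python
import numpy as np

BACKGROUND = 0
IGNORE = 255  # PASCAL VOC "void" label, skipped by the segmentation loss


def fuse_pseudo_labels(attention, saliency, image_labels,
                       fg_thresh=0.5, bg_thresh=0.1):
    """Hypothetical fusion rule, NOT the rule used in the paper.

    attention:    (C, H, W) array in [0, 1]; attention[c] is the map of class c
                  (index 0 is reserved for background and is not read).
    saliency:     (H, W) array in [0, 1], class-agnostic.
    image_labels: class indices (>= 1) given by the image-level annotation.
    """
    labels = np.asarray(list(image_labels))
    pseudo = np.full(saliency.shape, IGNORE, dtype=np.uint8)

    # Only classes named by the image-level annotation may appear.
    present = attention[labels]                # (K, H, W)
    max_att = present.max(axis=0)              # strongest attention per pixel
    best = labels[present.argmax(axis=0)]      # class giving that attention

    # Neither salient nor attended: treat as background.
    background = (saliency < bg_thresh) & (max_att < fg_thresh)
    pseudo[background] = BACKGROUND

    # Confident attention: take the attended class.
    confident = max_att >= fg_thresh
    pseudo[confident] = best[confident]

    # Salient pixels the attention map missed: the class is unambiguous
    # only when the image carries a single label; otherwise leave IGNORE.
    salient_only = (saliency >= fg_thresh) & ~confident
    if labels.size == 1:
        pseudo[salient_only] = labels[0]
    return pseudo
```

Pixels that match none of the rules keep the `IGNORE` label, so uncertain regions do not contribute to training in this sketch.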

Citation format (Chinese): 李阳,刘扬,刘国军,郭茂祖.基于对象位置线索的弱监督图像语义分割方法.软件学报,2020,31(11):3640−3656.
http://www.jos.org.cn/1000-9825/5828.htm
Citation format (English): Li Y, Liu Y, Liu GJ, Guo MZ. Weakly supervised image semantic segmentation method based on object location
cues. Ruan Jian Xue Bao/Journal of Software, 2020,31(11):3640−3656 (in Chinese). http://www.jos.org.cn/1000-9825/5828.htm

Weakly Supervised Image Semantic Segmentation Method Based on Object Location Cues
LI Yang(1,2), LIU Yang(2), LIU Guo-Jun(2), GUO Mao-Zu(1,2,3)
1 (School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 100044, China)
2 (School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China)
3 (Beijing Key Laboratory of Intelligent Processing for Building Big Data (Beijing University of Civil Engineering and Architecture),
Beijing 100044, China)

Abstract: Deep convolutional neural networks have achieved excellent performance in image semantic segmentation when trained with
pixel-level annotations. However, pixel-level annotations are expensive and time-consuming to obtain. To overcome this problem, this study
proposes a weakly supervised image semantic segmentation method that uses image-level annotations. The proposed method consists of
three steps: (1) Based on a network shared by the classification and segmentation tasks, a class-specific attention map is obtained as
the derivative of the spatial class scores (the class scores of pixels in the two-dimensional image space) with respect to the network feature
maps; (2) A saliency map is obtained by a successive erasing method and is used to supplement the object localization information missing
from the attention maps; (3) The attention map is combined with the saliency map to generate pseudo pixel-level annotations and train the segmentation

∗ Foundation item: National Natural Science Foundation of China (61671188, 61571164); National Key Research and Development
Program of China (2016YFC0901902)
Received 2018-04-28; Revised 2018-11-06; Accepted 2019-02-28; JOS published online 2019-08-09
CNKI online-first publication: 2019-08-12 12:08:06, http://kns.cnki.net/kcms/detail/11.2560.TP.20190812.1207.006.html
