Page 262 - Journal of Software (软件学报), 2021, No.8
2544 Journal of Software 软件学报 Vol.32, No.8, August 2021
[99] Ren M, Kiros R, Zemel R. Image question answering: A visual semantic embedding model and a new dataset. arXiv preprint
arXiv:1505.02074, 2015.
[100] Antol S, Agrawal A, Lu J, Mitchell M, Batra D, Zitnick CL, Parikh D. VQA: Visual question answering. In: Proc. of the
IEEE Int’l Conf. on Computer Vision. 2015. 2425−2433.
[101] Park DH, Hendricks LA, Akata Z, Rohrbach A, Schiele B, Darrell T, Rohrbach M. Multimodal explanations: Justifying
decisions and pointing to the evidence. In: Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition. 2018.
8779−8788. [doi: 10.1109/CVPR.2018.00915]
[102] Johnson J, Hariharan B, van der Maaten L, Fei-Fei L, Zitnick CL, Girshick R. CLEVR: A diagnostic dataset for compositional language
and elementary visual reasoning. In: Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition. 2017. 2901−2910. [doi:
10.1109/CVPR.2017.215]
[103] Lin TY, Maire M, Belongie S, Bourdev L, Girshick R, Hays J, Perona P, Ramanan D, Zitnick CL, Dollár P. Microsoft COCO:
Common objects in context. In: Proc. of the European Conf. on Computer Vision (ECCV). 2014. 740−755.
[104] Borisyuk F, Gordo A, Sivakumar V. Rosetta: Large scale system for text detection and recognition in images. In: Proc. of the 24th
ACM SIGKDD Int’l Conf. on Knowledge Discovery & Data Mining. 2018. 71−79.
[105] Chattopadhyay P, Vedantam R, Selvaraju RR, Batra D, Parikh D. Counting everyday objects in everyday scenes. In: Proc. of the
IEEE Conf. on Computer Vision and Pattern Recognition. 2017. 1135−1144. [doi: 10.1109/CVPR.2017.471]
[106] Trott A, Xiong C, Socher R. Interpretable counting for visual question answering. In: Proc. of the Int’l Conf. on Learning
Representations. 2017. 133−138.
[107] Zitnick CL, Agrawal A, Antol S, Mitchell M, Batra D, Parikh D. Measuring machine intelligence through visual question
answering. AI Magazine, 2016,37(1):63−72.
[108] Wu Z, Palmer M. Verb semantics and lexical selection. In: Proc. of the Conf. on Association for Computational Linguistics. 1994.
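Chinese references:
[12] Xian GJ, Huang YZ. A survey of research on neural-network-based visual question answering. Network Security Technology & Application, 2018(1):42−47 (in Chinese).
[13] Yu J, Wang L, Yu Z. Research on visual question answering techniques. Journal of Computer Research and Development, 2018,55(9):1946−1958 (in Chinese).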
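BAO Xi-Gang (包希港) (1997-), male, Ph.D. candidate. His main research interests are visual question answering and knowledge base question answering.
XIAO Ke-Jing (肖克晶) (1991-), female, Ph.D. candidate. Her main research interests are natural language processing, deep learning, and data mining.
ZHOU Chun-Lai (周春来) (1976-), male, Ph.D., associate professor, CCF professional member. His main research interest is uncertainty in artificial intelligence.
QIN Biao (覃飙) (1972-), male, Ph.D., associate professor, doctoral supervisor, CCF professional member. His main research interests are artificial intelligence, causal analysis, and uncertain databases.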