Page 282 - Journal of Software, 2025, No. 4

1688 Journal of Software, 2025, Vol. 36, No. 4

[16] Zheng CM, Wu ZW, Feng JH, Fu Z, Cai Y. MNRE: A challenge multimodal dataset for neural relation extraction with visual evidence in
social media posts. In: Proc. of the 2021 IEEE Int’l Conf. on Multimedia and Expo (ICME). Shenzhen: IEEE, 2021. 1–6. [doi: 10.1109/ICME51207.2021.9428274]
[17] Tong MH, Wang S, Cao YX, Xu B, Li JZ, Hou L, Chua TS. Image enhanced event detection in news articles. In: Proc. of the 34th AAAI
Conf. on Artificial Intelligence. New York: AAAI, 2020. 9040–9047. [doi: 10.1609/aaai.v34i05.6437]
[18] Chiu JPC, Nichols E. Named entity recognition with bidirectional LSTM-CNNs. Trans. of the Association for Computational Linguistics,
2016, 4: 357–370. [doi: 10.1162/tacl_a_00104]
[19] Lample G, Ballesteros M, Subramanian S, Kawakami K, Dyer C. Neural architectures for named entity recognition. In: Proc. of the 2016
Conf. of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. San Diego:
ACL, 2016. 260–270. [doi: 10.18653/v1/N16-1030]
[20] Li J, Sun AX, Han JL, Li CL. A survey on deep learning for named entity recognition. IEEE Trans. on Knowledge and Data Engineering,
2022, 34(1): 50–70. [doi: 10.1109/TKDE.2020.2981314]
[21] Adel H, Schütze H. Global normalization of convolutional neural networks for joint entity and relation classification. In: Proc. of the 2017
Conf. on Empirical Methods in Natural Language Processing. Copenhagen: ACL, 2017. 1723–1729. [doi: 10.18653/v1/D17-1181]
[22] Ahn D. The stages of event extraction. In: Proc. of the 2006 Workshop on Annotating and Reasoning about Time and Events. Sydney:
ACL, 2006. 1–8.
[23] Zhang RJ, Dai L, Wang B, Guo P. Recent advances of Chinese named entity recognition based on deep learning. Journal of Chinese
Information Processing, 2022, 36(6): 20–35 (in Chinese with English abstract). [doi: 10.3969/j.issn.1003-0077.2022.06.002]
[24] Arshad O, Gallo I, Nawaz S, Calefati A. Aiding intra-text representations with visual context for multimodal named entity recognition. In:
Proc. of the 2019 Int’l Conf. on Document Analysis and Recognition (ICDAR). Sydney: IEEE, 2019. 337–342. [doi: 10.1109/ICDAR.2019.00061]
[25] Wu ZW, Zheng CM, Cai Y, Chen JY, Leung HF, Li Q. Multimodal representation with embedded visual guiding objects for named entity
recognition in social media posts. In: Proc. of the 28th ACM Int’l Conf. on Multimedia. Seattle: ACM, 2020. 1038–1046. [doi: 10.1145/3394171.3413650]
[26] Yu JF, Jiang J, Yang L, Xia R. Improving multimodal named entity recognition via entity span detection with unified multimodal
Transformer. In: Proc. of the 58th Annual Meeting of the Association for Computational Linguistics. ACL, 2020. 3342–3352. [doi: 10.18653/v1/2020.acl-main.306]
[27] Wang XY, Gui M, Jiang Y, Jia ZX, Bach N, Wang T, Huang ZQ, Huang F, Tu KW. ITA: Image-text alignments for multi-modal named
entity recognition. In: Proc. of the 2022 Conf. of the North American Chapter of the Association for Computational Linguistics: Human
Language Technologies. Seattle: ACL, 2022. 3176–3189. [doi: 10.18653/v1/2022.naacl-main.232]
[28] Xu B, Huang SZ, Sha CF, Wang HY. MAF: A general matching and alignment framework for multimodal named entity recognition. In:
Proc. of the 15th ACM Int’l Conf. on Web Search and Data Mining. ACM, 2022. 1215–1223. [doi: 10.1145/3488560.3498475]
[29] Sun L, Wang JQ, Zhang K, Su YD, Weng FS. RpBERT: A text-image relation propagation-based BERT model for multimodal NER. In:
Proc. of the 35th AAAI Conf. on Artificial Intelligence. AAAI, 2021. 13860–13868. [doi: 10.1609/aaai.v35i15.17633]
[30] Huang SZ. Research on general multimodal information extraction for social media [MS. Thesis]. Shanghai: Donghua University, 2022
(in Chinese with English abstract). [doi: 10.27012/d.cnki.gdhuu.2022.002241]
[31] Zheng CM, Feng JH, Fu Z, Cai Y, Li Q, Wang T. Multimodal relation extraction with efficient graph alignment. In: Proc. of the 29th
ACM Int’l Conf. on Multimedia. ACM, 2021. 5298–5306. [doi: 10.1145/3474085.3476968]
[32] Sun L, Wang JQ, Su YD, Weng FS, Sun YX, Zheng ZW, Chen YY. RIVA: A pre-trained tweet multimodal model based on text-image
relation for multimodal NER. In: Proc. of the 28th Int’l Conf. on Computational Linguistics. Barcelona: ACL, 2020. 1852–1862. [doi: 10.18653/v1/2020.coling-main.168]
[33] Zhao F, Li CH, Wu Z, Xing SY, Dai XY. Learning from different text-image pairs: A relation-enhanced graph convolutional network for
multimodal NER. In: Proc. of the 30th ACM Int’l Conf. on Multimedia. Lisboa: ACM, 2022. 3983–3992. [doi: 10.1145/3503161.354822]
[34] Zheng CM, Wu ZW, Wang T, Cai Y, Li Q. Object-aware multimodal named entity recognition in social media posts with adversarial
learning. IEEE Trans. on Multimedia, 2021, 23: 2520–2532. [doi: 10.1109/TMM.2020.3013398]
[35] Li XY, Feng JR, Meng YX, Han QH, Wu F, Li JW. A unified MRC framework for named entity recognition. In: Proc. of the 58th Annual
Meeting of the Association for Computational Linguistics. ACL, 2020. 5849–5859. [doi: 10.18653/v1/2020.acl-main.519]
[36] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. In: Proc. of the 3rd Int’l Conf. on
Learning Representations. 2015. [doi: 10.48550/arXiv.1409.1556]
[37] Yatskar M, Zettlemoyer L, Farhadi A. Situation recognition: Visual semantic role labeling for image understanding. In: Proc. of the 2016
