Journal of Software (软件学报), 2025, Issue 4

Journal of Software (软件学报), ISSN 1000-9825, CODEN RUXUEW, E-mail: jos@iscas.ac.cn
2025,36(4):1590−1603 [doi: 10.13328/j.cnki.jos.007213] [CSTR: 32375.14.jos.007213] http://www.jos.org.cn
© Institute of Software, Chinese Academy of Sciences. All rights reserved. Tel: +86-10-62562563

Task Knowledge Fusion for Multimodal Knowledge Graph Completion*

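CHEN Qiang, ZHANG Dong, LI Shou-Shan, ZHOU Guo-Dong

(School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China)
Corresponding author: ZHANG Dong, E-mail: dzhang@suda.edu.cn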

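Abstract: Knowledge graph completion aims to mine the fact triples missing from a knowledge graph on the basis of the existing fact triples (head entity, relation, tail entity). Existing work mainly exploits the structural information in the knowledge graph to perform completion. However, this work ignores that information from other modalities contained in the knowledge graph may also help knowledge graph completion. Moreover, because task-specific knowledge is usually not injected into general-purpose pre-trained models, fusing task-related knowledge into the extraction of modal information becomes crucial. In addition, because different modal features contribute differently to knowledge graph completion, effectively retaining the useful multimodal information is also a major challenge. To solve these problems, a multimodal knowledge graph completion method that fuses task knowledge is proposed. A multimodal encoder fine-tuned on the current task is used to obtain entity vector representations in the different modalities. A modal fusion-filtering module based on a recurrent neural network then removes the multimodal features that are irrelevant to the task. Finally, an isomorphic graph network represents and updates all features, thereby completing the multimodal knowledge graph completion task. Experimental results show that the proposed method effectively extracts information from the different modalities and that further multimodal filtering and fusion strengthens the entity representations, which in turn improves performance on the multimodal knowledge graph completion task.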
Keywords: knowledge graph completion; multimodal; knowledge fusion; multimodal fusion
Chinese Library Classification (CLC) number: TP18

Citation format (Chinese): 陈强, 张栋, 李寿山, 周国栋. 融合任务知识的多模态知识图谱补全. 软件学报, 2025, 36(4): 1590–1603. http://www.jos.org.cn/1000-9825/7213.htm
Citation format (English): Chen Q, Zhang D, Li SS, Zhou GD. Task Knowledge Fusion for Multimodal Knowledge Graph Completion. Ruan Jian Xue Bao/Journal of Software, 2025, 36(4): 1590–1603 (in Chinese). http://www.jos.org.cn/1000-9825/7213.htm

Task Knowledge Fusion for Multimodal Knowledge Graph Completion
CHEN Qiang, ZHANG Dong, LI Shou-Shan, ZHOU Guo-Dong
(School of Computer Science and Technology, Soochow University, Suzhou 215006, China)
Abstract: Knowledge graph completion aims to uncover the fact triples missing from a knowledge graph based on the existing fact triples (head entity, relation, tail entity). Existing research primarily focuses on exploiting the structural information within the knowledge graph. However, these efforts overlook that information from other modalities contained in the knowledge graph may also help knowledge graph completion. In addition, since task-specific knowledge is typically not injected into general pre-trained models, incorporating task-related knowledge into the extraction of modal information becomes crucial. Moreover, because different modal features contribute differently to knowledge graph completion, effectively preserving the useful multimodal information poses a significant challenge. To address these issues, this study proposes a multimodal knowledge graph completion method that incorporates task knowledge. It uses a multimodal encoder fine-tuned on the current task to obtain entity vector representations in the different modalities. A modal fusion-filtering module based on recurrent neural networks then removes task-irrelevant multimodal features. Finally, a simple isomorphic graph network represents and updates all features, thus accomplishing multimodal knowledge graph completion. Experimental results show that the proposed method effectively extracts information from different modalities, and that the additional multimodal filtering and fusion strengthens entity representations and thereby improves performance on multimodal knowledge graph completion.
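
As a rough illustration of the fusion-filtering idea described in the abstract, the sketch below folds modal features into an entity's structural embedding one at a time through a GRU-like gate. It is not the authors' module. The function names (`gated_step`, `fuse_modalities`), the use of a single scalar gate per modality, and the hand-set parameters are all assumptions made for illustration; in a trained model the gate parameters would be learned.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def gated_step(state, feature, w_state, w_feat, bias):
    """One recurrent step (hypothetical design, not taken from the paper).

    A scalar gate in (0, 1) decides how much of the incoming modal
    feature replaces the running entity state.
    """
    gate = sigmoid(dot(w_state, state) + dot(w_feat, feature) + bias)
    new_state = [(1.0 - gate) * s + gate * f for s, f in zip(state, feature)]
    return new_state, gate


def fuse_modalities(structural, modal_features, params):
    """Fold each modal feature into the structural embedding in turn.

    `params` maps a modality name to (w_state, w_feat, bias); these are
    assumed, hand-set values here rather than learned weights.
    """
    state = list(structural)
    gates = {}
    for name, feature in modal_features.items():
        w_state, w_feat, bias = params[name]
        state, gates[name] = gated_step(state, feature, w_state, w_feat, bias)
    return state, gates


if __name__ == "__main__":
    structural = [1.0, 0.0, -1.0]
    features = {"visual": [0.5, 0.5, 0.5], "textual": [0.0, 1.0, 0.0]}
    zero = [0.0, 0.0, 0.0]
    # Zero weights and zero bias give a gate of exactly 0.5 for each modality.
    params = {"visual": (zero, zero, 0.0), "textual": (zero, zero, 0.0)}
    fused, gates = fuse_modalities(structural, features, params)
    print(fused, gates)
```

A gate near 0 leaves the entity state almost unchanged, which is how such a module could suppress a modality that does not help the task, while a gate near 1 lets the modal feature dominate.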

* Funding: National Natural Science Foundation of China (62206193, 62076176, 62076175)
Received: 2023-08-25; Revised: 2023-11-03; Accepted: 2024-04-16; Published online by JOS: 2024-07-03
CNKI online-first publication: 2024-07-05
