Journal of Software (软件学报), 2025, Issue 4

Journal of Software (软件学报) ISSN 1000-9825, CODEN RUXUEW                    E-mail: jos@iscas.ac.cn
                 2025,36(4):1590−1603 [doi: 10.13328/j.cnki.jos.007213] [CSTR: 32375.14.jos.007213]  http://www.jos.org.cn
                 © Institute of Software, Chinese Academy of Sciences. All rights reserved.             Tel: +86-10-62562563



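                 Task Knowledge Fusion for Multimodal Knowledge Graph Completion*

                 CHEN Qiang, ZHANG Dong, LI Shou-Shan, ZHOU Guo-Dong

                 (School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China)
                 Corresponding author: ZHANG Dong, E-mail: dzhang@suda.edu.cn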

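                 Abstract: Knowledge graph completion aims to uncover fact triples missing from a knowledge graph based on the
                 existing fact triples (head entity, relation, tail entity). Existing work mainly exploits the structural information
                 in the knowledge graph to perform completion. However, this work overlooks the fact that information from other
                 modalities contained in the knowledge graph may also help knowledge graph completion. Moreover, because
                 task-specific knowledge is usually not injected into general-purpose pre-trained models, how to incorporate
                 task-related knowledge while extracting modal information becomes crucial. In addition, since different modal
                 features contribute differently to knowledge graph completion, effectively retaining the useful multimodal
                 information is also a major challenge. To address these problems, this study proposes a multimodal knowledge
                 graph completion method that fuses task knowledge. A multimodal encoder fine-tuned on the current task is used
                 to obtain entity vector representations under different modalities. A modality fusion-filtering module based on
                 recurrent neural networks then removes task-irrelevant multimodal features. Finally, an isomorphic graph network
                 is used to represent and update all features, thereby effectively accomplishing multimodal knowledge graph
                 completion. Experimental results show that the proposed method effectively extracts information from different
                 modalities and, through further multimodal filtering and fusion, strengthens entity representations, which in
                 turn improves performance on multimodal knowledge graph completion.
                 Key words: knowledge graph completion; multimodal; knowledge fusion; multimodal fusion
                 Chinese Library Classification number: TP18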

                 Chinese citation format: 陈强, 张栋, 李寿山, 周国栋. 融合任务知识的多模态知识图谱补全. 软件学报, 2025, 36(4): 1590–1603.
                 http://www.jos.org.cn/1000-9825/7213.htm
                 English citation format: Chen Q, Zhang D, Li SS, Zhou GD. Task Knowledge Fusion for Multimodal Knowledge Graph Completion. Ruan Jian
                 Xue Bao/Journal of Software, 2025, 36(4): 1590–1603 (in Chinese). http://www.jos.org.cn/1000-9825/7213.htm

                 Task Knowledge Fusion for Multimodal Knowledge Graph Completion
                 CHEN Qiang, ZHANG Dong, LI Shou-Shan, ZHOU Guo-Dong
                 (School of Computer Science and Technology, Soochow University, Suzhou 215006, China)
                 Abstract: The task of completing knowledge graphs aims to reveal the missing fact triples within the knowledge graph based on existing
                 fact triples (head entity, relation, tail entity). Existing research primarily focuses on utilizing the structural information within the
                 knowledge graph. However, these efforts overlook that other modal information contained within the knowledge graph may also be helpful
                 for knowledge graph completion. In addition, since task-specific knowledge is typically not integrated into general pre-training models, the
                 process of incorporating task-related knowledge into modal information extraction becomes crucial. Moreover, given that different modal
                 features contribute uniquely to knowledge graph completion, effectively preserving useful multimodal information poses a significant
                 challenge. To address these issues, this study proposes a multimodal knowledge graph completion method that incorporates task
                 knowledge. It utilizes a fine-tuned multimodal encoder tailored to the current task to acquire entity vector representations across different
                 modalities. Subsequently, a modal fusion-filtering module based on recurrent neural networks is utilized to eliminate task-independent
                 multimodal features. Finally, the study utilizes a simple isomorphic graph network to represent and update all features, thus effectively
                 accomplishing multimodal knowledge graph completion. Experimental results demonstrate the effectiveness of our approach in extracting
                 information from different modalities. Furthermore, it shows that our method enhances entity representation capability through additional
                 multimodal filtering and fusion, consequently improving the performance of multimodal knowledge graph completion tasks.
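
                 As a minimal illustration of the completion task itself (predicting a missing tail entity for a query
                 (head, relation, ?)), the sketch below ranks candidate tails with a TransE-style translation score. This is
                 not the method proposed in this paper: TransE is a standard structure-only baseline swapped in for
                 illustration, and the entity names, the 2-d embeddings, and the helper names `transe_score` and `rank_tails`
                 are all hypothetical, hand-picked assumptions.

```python
import math

# Toy, hand-picked 2-d embeddings (hypothetical; a real model would learn them).
ENTITY = {
    "Paris": [1.0, 0.0],
    "France": [1.0, 1.0],
    "Berlin": [3.0, 0.0],
    "Germany": [3.0, 1.0],
}
RELATION = {"capital_of": [0.0, 1.0]}


def transe_score(head, relation, tail):
    """TransE-style plausibility: negative L2 distance between h + r and t."""
    h, r, t = ENTITY[head], RELATION[relation], ENTITY[tail]
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rank_tails(head, relation, known_triples):
    """Rank candidate tails for (head, relation, ?), skipping already-known facts."""
    candidates = [
        e for e in ENTITY
        if e != head and (head, relation, e) not in known_triples
    ]
    # Higher score means a more plausible triple, so sort in descending order.
    return sorted(
        candidates,
        key=lambda e: transe_score(head, relation, e),
        reverse=True,
    )


# The graph already contains one fact; the query asks for a missing one.
known = {("Paris", "capital_of", "France")}
predicted = rank_tails("Berlin", "capital_of", known)
print(predicted[0])
```

                 In a multimodal setting, the entity vectors would additionally draw on image and text features rather than
                 on structure alone; the sketch leaves that part out.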


                 *    Funding: National Natural Science Foundation of China (62206193, 62076176, 62076175)
                  Received: 2023-08-25; Revised: 2023-11-03; Accepted: 2024-04-16; Published online by jos: 2024-07-03
                  CNKI online-first publication: 2024-07-05