Journal of Software (软件学报), 2025, Issue 9

Journal of Software (软件学报), ISSN 1000-9825, CODEN RUXUEW                  E-mail: jos@iscas.ac.cn
                 2025,36(9):4056−4071 [doi: 10.13328/j.cnki.jos.007247] [CSTR: 32375.14.jos.007247]  http://www.jos.org.cn
                 © Institute of Software, Chinese Academy of Sciences. All rights reserved.             Tel: +86-10-62562563



                 Knowledge Graph Question Answering Based on Relevance Prompts*

                 MA Jie¹,², SUN Wang-Chun¹,², WANG Ping-Hui¹,², ZHANG Ruo-Fei¹,², LI Shuai-Peng², SU Zhou¹,²


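                 ¹(School of Cyber Science and Engineering, Xi’an Jiaotong University, Xi’an 710049, Shaanxi, China)
                 ²(Ministry of Education Key Laboratory for Intelligent Networks and Network Security (Xi’an Jiaotong University), Xi’an 710049, Shaanxi, China)
                 Corresponding author: MA Jie, E-mail: jiema@xjtu.edu.cn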

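                 Abstract: With their continued development, large language models (LLMs) have achieved excellent performance in
                 open domains. However, lacking specialized knowledge, LLMs perform poorly on question-answering tasks in vertical
                 domains, a problem that has attracted wide attention from researchers. Existing work injects domain knowledge into
                 LLMs through a "retrieve-answer" approach to improve their performance. However, this approach often retrieves
                 extra noisy data, which degrades LLM performance. To address this problem, a knowledge graph question answering
                 method based on knowledge relevance is proposed. Specifically, noisy data is distinguished from the knowledge
                 needed to answer the question, and, under a "retrieval-relevance assessment-answering" framework, the LLM is guided
                 to select appropriate knowledge and give the correct answer. In addition, a knowledge graph question answering
                 dataset for the mechanical domain, Mecha-QA, is presented; it covers two subdomains, traditional machinery
                 manufacturing and additive manufacturing, and is intended to advance research on LLMs and knowledge graph question
                 answering in this domain. To verify the effectiveness of the proposed method, experiments are conducted on
                 Mecha-QA and on the aerospace-domain dataset Aero-QA. The results show that the method significantly improves the
                 performance of LLMs on vertical-domain knowledge graph question answering.
                 Keywords: large language model; knowledge graph; vertical domain; question answering system; knowledge retrieval
                 CLC number (Chinese Library Classification): TP18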


                 Citation format (Chinese): 马杰, 孙望淳, 王平辉, 张若非, 李帅鹏, 苏洲. 基于相关性提示的知识图谱问答. 软件学报, 2025, 36(9): 4056–4071.
                 http://www.jos.org.cn/1000-9825/7247.htm
                 Citation format (English): Ma J, Sun WC, Wang PH, Zhang RF, Li SP, Su Z. Knowledge Graph Question Answering Based on Relevance
                 Prompts. Ruan Jian Xue Bao/Journal of Software, 2025, 36(9): 4056–4071 (in Chinese). http://www.jos.org.cn/1000-9825/7247.htm

                 Knowledge Graph Question Answering Based on Relevance Prompts
                 MA Jie¹,², SUN Wang-Chun¹,², WANG Ping-Hui¹,², ZHANG Ruo-Fei¹,², LI Shuai-Peng², SU Zhou¹,²
                 ¹(School of Cyber Science and Engineering, Xi’an Jiaotong University, Xi’an 710049, China)
                 ²(Ministry of Education Key Laboratory for Intelligent Networks and Network Security (Xi’an Jiaotong University), Xi’an 710049, China)
                 Abstract: As large language models (LLMs) continue to evolve, they have shown impressive performance in open-domain tasks. However,
                 they exhibit limited effectiveness in domain-specific question-answering due to a lack of domain-specific knowledge. This limitation has
                 attracted widespread attention from researchers in the field. Current research attempts to infuse domain knowledge into LLMs through a
                 retrieve-answer approach to enhance their performance. However, this method often retrieves additional, irrelevant data, leading to a
                 degradation in LLM effectiveness. Therefore, this study proposes a method for knowledge graph question answering based on the
                 relevance of knowledge. This method focuses on distinguishing essential knowledge required for specific questions from noisy data. Under
                 a framework of retrieval-relevance assessment-answering, this method guides LLMs to select appropriate knowledge for accurate answers.
                 Moreover, this study introduces a dataset named Mecha-QA for question-answering using a mechanical domain knowledge graph, covering
                 traditional machinery manufacturing and additive manufacturing, to promote research that integrates LLMs with knowledge graph question
                 answering in this field. To validate the effectiveness of the proposed method, experiments are conducted on the Aero-QA dataset in the
                 aerospace domain and the Mecha-QA dataset. Results demonstrate that the proposed method significantly improves the performance of


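                 *    Funding: National Key Research and Development Program of China (2021YFB1715600); National Natural Science Foundation of China (62306229)
                  MA Jie and SUN Wang-Chun are co-first authors.
                  Received: 2024-01-31; Revised: 2024-05-04; Accepted: 2024-06-26; Published online by JOS: 2024-12-31
                  CNKI online-first publication: 2025-01-02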