Journal of Software (《软件学报》), 2025, No. 12

Zhang Mingtao et al.: Accuracy evaluation of knowledge graphs based on embedding models (p. 5691)


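proposes an accuracy evaluation method based on embedding models, constructs negative samples from hypotheses about the forms that erroneous triples take, and selects the decision threshold automatically. Experiments show that the threshold selection strategy adopted in this paper outperforms unsupervised classifiers and outlier detection methods. The experiments also measure accuracy evaluation both in the general case and when triple importance is taken into account, and analyze the evaluation error and running cost of the methods in each setting. On that basis, the paper clarifies the strengths and weaknesses of current embedding-model-based measurement and gives recommendations for choosing an embedding model under different requirements and for different data.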
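The pipeline summarized above (score triples with an embedding model, build negatives from an assumed error form, choose a threshold automatically, then estimate accuracy) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it assumes a TransE-style score [12], tail replacement as the only error form, and a threshold chosen to maximize separation accuracy between observed triples and synthetic negatives. All function names are hypothetical.

```python
import random

def transe_score(h, r, t):
    """TransE-style plausibility (assumed scoring function):
    negative L1 distance between h + r and t; higher means more plausible."""
    return -sum(abs(hi + ri - ti) for hi, ri, ti in zip(h, r, t))

def corrupt_tails(triples, entities, rng):
    """Build negatives under one assumed error form: the tail entity is wrong.
    Replacements that would recreate an observed triple are skipped."""
    known = set(triples)
    negatives = []
    for h, r, t in triples:
        candidates = [e for e in entities if e != t and (h, r, e) not in known]
        if candidates:
            negatives.append((h, r, rng.choice(candidates)))
    return negatives

def select_threshold(pos_scores, neg_scores):
    """Pick the cut that best separates observed triples from synthetic negatives.
    Ties keep the lowest candidate cut."""
    best_cut, best_acc = None, -1.0
    total = len(pos_scores) + len(neg_scores)
    for cut in sorted(set(pos_scores + neg_scores)):
        correct = sum(s >= cut for s in pos_scores) + sum(s < cut for s in neg_scores)
        acc = correct / total
        if acc > best_acc:
            best_cut, best_acc = cut, acc
    return best_cut

def estimate_accuracy(scores, threshold, weights=None):
    """Share of triples scored at or above the threshold.
    Optional per-triple weights stand in for triple importance."""
    weights = weights or [1.0] * len(scores)
    accepted = sum(w for s, w in zip(scores, weights) if s >= threshold)
    return accepted / sum(weights)
```

In use, `transe_score` would be applied to the trained embeddings of every observed triple and every negative from `corrupt_tails`, the two score lists would go to `select_threshold`, and `estimate_accuracy` would report the unweighted or importance-weighted estimate. Because the observed triples are themselves noisy, the chosen cut is only a heuristic separator.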
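Future work will address evaluation error and evaluation cost, with the aim of making the evaluation more trustworthy and reducing the incidental factors that affect it. Specific directions may include further optimizing the embedding model to improve its inference and verification ability, and using information from the knowledge graph process, such as information sources, to improve its judgment ability. For the difficulty of evaluating sparse knowledge graphs, we believe the impact of sparsity can be reduced on two fronts: on the data side, through data augmentation by introducing external knowledge graphs and performing knowledge graph completion; on the model side, by using more robust embedding models. In addition, measuring triple importance is still an open area. Entity importance is currently measured mainly with network structure and semantic relations as the key features, so constructing a semantically richer definition of triple importance and studying its effect on the framework of this paper remains a direction for future work. We hope to obtain a richer measure by introducing further importance features and by fusing multiple importance features.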

                 References:
                  [1]   Suchanek FM, Kasneci G, Weikum G. YAGO: A core of semantic knowledge. In: Proc. of the 16th Int’l Conf. on World Wide Web.
                     Banff: ACM, 2007. 697–706. [doi: 10.1145/1242572.1242667]
                  [2]   Lehmann J, Isele R, Jakob M, Jentzsch A, Kontokostas D, Mendes PN, Hellmann S, Morsey M, van Kleef P, Auer S, Bizer C.
                     DBpedia – A large-scale, multilingual knowledge base extracted from Wikipedia. Semantic Web, 2015, 6(2): 167–195.
                     [doi: 10.3233/SW-140134]
                  [3]   Carlson A, Betteridge J, Kisiel B, Settles B, Hruschka E, Mitchell T. Toward an architecture for never-ending language learning. In: Proc.
                     of the 24th AAAI Conf. on Artificial Intelligence. Atlanta: AAAI, 2010. 1306–1313. [doi: 10.1609/aaai.v24i1.7519]
                  [4]   Bollacker K, Evans C, Paritosh P, Sturge T, Taylor J. Freebase: A collaboratively created graph database for structuring human
                     knowledge. In: Proc. of the 2008 ACM SIGMOD Int’l Conf. on Management of Data. Vancouver: ACM, 2008. 1247–1250.
                     [doi: 10.1145/1376616.1376746]
                  [5]   Ojha P, Talukdar P. KGEval: Accuracy estimation of automatically constructed knowledge graphs. In: Proc. of the 2017 Conf. on
                     Empirical Methods in Natural Language Processing. Copenhagen: ACL, 2017. 1741–1750. [doi: 10.18653/v1/D17-1183]
                  [6]   Wang YQ, Ma FL, Gao J. Efficient knowledge graph validation via cross-graph representation learning. In: Proc. of the 29th ACM Int’l
                     Conf. on Information & Knowledge Management. ACM, 2020. 1595–1604. [doi: 10.1145/3340531.3411902]
                  [7]   Gerber D, Esteves D, Lehmann J, Bühmann R, Usbeck R, Ngomo ACN, Speck R. DeFacto – Temporal and multilingual deep fact
                     validation. Journal of Web Semantics, 2015, 35: 85–101. [doi: 10.1016/j.websem.2015.08.001]
                  [8]   Liu SY, d’Aquin M, Motta E. Measuring accuracy of triples in knowledge graphs. In: Proc. of the 1st Int’l Conf. on Language, Data, and
                     Knowledge. Galway: Springer, 2017. 343–357. [doi: 10.1007/978-3-319-59888-8_29]
                  [9]   Jia SB, Xiang Y, Chen XJ, Wang K, Shi J. Triple trustworthiness measurement for knowledge graph. In: Proc. of the 2019 World Wide
                     Web Conf. San Francisco: ACM, 2019. 2865–2871. [doi: 10.1145/3308558.3313586]
                 [10]   Sedova A, Roth B. ACTC: Active threshold calibration for cold-start knowledge graph completion. In: Proc. of the 61st Annual Meeting
                     of the Association for Computational Linguistics. Toronto: ACL, 2023. 1853–1863. [doi: 10.18653/v1/2023.acl-short.158]
                 [11]   Li Q, Li YL, Gao J, Su L, Zhao B, Demirbas M, Fan W, Han JW. A confidence-aware approach for truth discovery on long-tail data.
                     Proc. of the VLDB Endowment, 2014, 8(4): 425–436. [doi: 10.14778/2735496.2735505]
                 [12]   Bordes A, Usunier N, Garcia-Duran A, Weston J, Yakhnenko O. Translating embeddings for modeling multi-relational data. In: Proc. of
                     the 27th Int’l Conf. on Neural Information Processing Systems. Lake Tahoe: ACM, 2013. 2787–2795.
                 [13]   Trouillon T, Welbl J, Riedel S, Gaussier E, Bouchard G. Complex embeddings for simple link prediction. In: Proc. of the 33rd Int’l Conf.
                     on Machine Learning. New York: PMLR, 2016. 2071–2080.
                 [14]   Schlichtkrull M, Kipf TN, Bloem P, van den Berg R, Titov I, Welling M. Modeling relational data with graph convolutional networks. In:
                     Proc. of the 15th Int’l Conf. on the Semantic Web. Heraklion: Springer, 2018. 593–607. [doi: 10.1007/978-3-319-93417-4_38]
                 [15]   Sun ZQ, Deng ZH, Nie JY, Tang J. RotatE: Knowledge graph embedding by relational rotation in complex space. In: Proc. of the 7th Int’l
                     Conf. on Learning Representations. New Orleans: OpenReview.net, 2019.
                 [16]   Xie RB, Liu ZY, Lin F, Lin LY. Does William Shakespeare REALLY write Hamlet? Knowledge representation learning with confidence.