Enhancing Knowledge Graph Completion with Entity Neighborhood and Relation Context
Abstract
Knowledge Graph Completion (KGC) aims to infer missing information in Knowledge Graphs (KGs) to address their inherent incompleteness. Traditional structure-based KGC methods are effective but computationally expensive and hard to scale, because they must learn dense embeddings and score every entity in the KG for each prediction. Recent text-based approaches mitigate these issues by converting KG triples into text and reasoning over it with language models such as T5 and BERT. However, they often underuse contextual information: they focus mainly on the entity's neighborhood and neglect the context of the relation. To address this, we propose KGC-ERC, a framework that integrates both types of context to enrich the input of generative language models and enhance their reasoning capabilities. We also introduce a sampling strategy that selects relevant context within the input token limit, making better use of contextual information and potentially improving model performance. Experiments on the Wikidata5M, Wiki27K, and FB15K-237-N datasets show that KGC-ERC outperforms or matches state-of-the-art baselines in predictive performance and scalability.
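To make the idea of combining both context types under a token limit concrete, the following is a minimal sketch and not the KGC-ERC implementation. The abstract does not specify the sampling strategy, so the alternating, greedy selection used here is an assumption. The function names (`build_input`, `verbalize`, `count_tokens`), the `neighbor:`/`relation:` prefixes, and the whitespace token count (a stand-in for a real language-model tokenizer) are all hypothetical.

```python
from itertools import zip_longest
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words approximate tokenizer tokens.
    return len(text.split())


def verbalize(prefix: str, triple: Triple) -> str:
    # Turn a triple into a plain-text line (hypothetical format).
    head, relation, tail = triple
    return f"{prefix}: {head} | {relation} | {tail}"


def build_input(head: str, relation: str,
                triples: List[Triple], max_tokens: int) -> str:
    """Build a text input for the query (head, relation, ?)."""
    query = f"query: {head} | {relation} | ?"

    # Entity neighborhood: triples in which the query entity appears.
    neighborhood = [t for t in triples if head in (t[0], t[2])]
    # Relation context: other triples that use the query relation.
    relation_ctx = [t for t in triples
                    if t[1] == relation and head not in (t[0], t[2])]

    # Alternate between the two context types so neither is crowded out.
    candidates = []
    for n, r in zip_longest(neighborhood, relation_ctx):
        if n is not None:
            candidates.append(verbalize("neighbor", n))
        if r is not None:
            candidates.append(verbalize("relation", r))

    # Greedily keep each candidate that still fits in the token budget.
    parts, used = [query], count_tokens(query)
    for line in candidates:
        cost = count_tokens(line)
        if used + cost > max_tokens:
            continue  # skip lines that would exceed the budget
        parts.append(line)
        used += cost
    return "\n".join(parts)


if __name__ == "__main__":
    kg = [
        ("Madrid", "located_in", "Spain"),
        ("Madrid", "instance_of", "city"),
        ("Berlin", "capital_of", "Germany"),
        ("Rome", "capital_of", "Italy"),
    ]
    print(build_input("Madrid", "capital_of", kg, max_tokens=18))
```

In this sketch the query itself is always kept, and the budget is shared between neighborhood and relation context by alternation. A real system would replace `count_tokens` with the tokenizer of the chosen generative model and would likely rank candidates by relevance rather than by list order.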