Computer Integrated Manufacturing Systems, 2020, Vol. 26, Issue 5: 1326-1335. DOI: 10.13196/j.cims.2020.05.018


Correlation query processing of knowledge graph oriented to e-commerce applications

Yue Kun (岳昆), Kan Yirong (阚伊戎), Wang Yujie (王钰杰), Qian Wenhua (钱文华)+

  1. School of Information, Yunnan University
  • Online: 2020-05-31 Published: 2020-05-31
  • Supported by:
    Project supported by the National Natural Science Foundation, China (Nos. U1802271, 61562090, 61662087), the Science Foundation for Distinguished Young Scholars of Yunnan Province, China (No. 2019FJ011), the Application Basic Research Program of Yunnan Province, China (No. 2019FA044), the Yunnan Provincial Foundation for Leaders of Disciplines in Science and Technology, China (No. 2019HB121), and the Program for Excellent Young Talents of Yunnan University, China (No. WX173602).

Abstract: To effectively model the correlations among entities in a knowledge graph (KG), a method is proposed that takes e-commerce applications as the background and the Bayesian network (BN) as the framework for knowledge representation and inference. The method fuses the domain knowledge described in the KG with the knowledge implied in users' historical behavior records to construct a BN that describes the correlations among commodities and their uncertainty, and then computes the indirect correlations among commodities with BN inference algorithms. It provides an underlying technique for correlation query processing over KGs and a solution for typical e-commerce applications such as product classification, user targeting, and personalized recommendation. For large-scale KGs and massive user behavior records, Spark-based parallel algorithms are given for model construction and probabilistic inference. Experimental results on real data show that the proposed method discovers correlations not directly represented in the KG with a recall close to 90%, and that it processes correlation queries efficiently on a KG with more than 100 million edges.

Key words: e-commerce, knowledge graph, correlation query, Bayesian network, Spark compute engine

CLC number: