• Article •

面向知识与信息管理的领域本体自动构建算法

侯鑫,张旭堂,金天国,彭高亮,刘文剑   

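  1. School of Mechatronics Engineering, Harbin Institute of Technology, Harbin 150001, Heilongjiang, China; 2. School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, Heilongjiang, China
  • Publication date: 2011-01-15 Release date: 2011-01-25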

Automatic construction of domain ontology oriented to knowledge and information management

HOU Xin, ZHANG Xu-tang, JIN Tian-guo, PENG Gao-liang, LIU Wen-jian

  1. School of Mechatronics Engineering, Harbin Institute of Technology, Harbin 150001, China; 2. School of Computer Science & Technology, Harbin Institute of Technology, Harbin 150001, China
  • Online: 2011-01-15 Published: 2011-01-25

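Abstract (Chinese version): To address the shortcomings of existing domain ontology construction algorithms, a graph-based algorithm for the automatic construction of domain ontologies oriented to knowledge and information management was proposed, comprising concept extraction and relationship extraction. Domain text documents were mapped to document concept graphs, a term weighting algorithm based on random walks on the graph measured the importance of each term from both global and local perspectives, and a graph vertex clustering algorithm classified the terms to produce candidate concepts. An algorithm for extracting arbitrary relationships between concepts, based on constrained frequent informative sub-graph mining, was proposed, and an information function was introduced to evaluate the information content of each sub-graph. After assessment through ontology evaluation, the resulting domain concepts and inter-concept relationships were described in OWL-DL as the domain ontology. Experiments verified the effectiveness of the algorithm.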

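Key words (Chinese version): domain ontology, automatic construction, knowledge management, information management, document concept graph, frequent sub-graph mining, informative sub-graph, algorithm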

Abstract: To address the shortcomings of existing domain ontology construction algorithms, a graph-based approach to the automatic construction of domain ontologies oriented to knowledge and information management was proposed, comprising concept extraction and relationship extraction. Each document in the collection was mapped to a document graph. Random-walk term weighting was employed to estimate the importance of each term to the corpus from both local and global perspectives. A graph vertex clustering algorithm was used to separate terms with different meanings and group similar terms, generating candidate concepts. An improved frequent sub-graph mining algorithm, constrained by both vertices and information, was proposed to find arbitrary latent relationships among these concepts. For ontology evaluation, a method was put forward for adaptively adjusting concepts and relationships according to their practical effectiveness. Finally, the domain ontology was formed by describing the domain concepts and relationships in OWL-DL. Evaluation experiments showed the effectiveness of the algorithm.
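
As an illustration of the random-walk term weighting step, the following minimal Python sketch scores terms on a co-occurrence graph with a PageRank-style iteration. It is not the authors' algorithm: the abstract does not specify the graph construction or the walk, so the sliding-window co-occurrence rule, the damping factor, the iteration count, and the function names `build_cooccurrence_graph` and `random_walk_weights` are all assumptions made for this example.

```python
from collections import defaultdict


def build_cooccurrence_graph(tokens, window=2):
    """Build an undirected term graph (assumed construction):
    two terms are linked if they occur within `window` positions."""
    graph = defaultdict(set)
    for i, term in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != term:
                graph[term].add(other)
                graph[other].add(term)
    return graph


def random_walk_weights(graph, damping=0.85, iterations=50):
    """Score each term by a PageRank-style power iteration.
    `damping` and `iterations` are illustrative defaults."""
    n = len(graph)
    score = {term: 1.0 / n for term in graph}
    for _ in range(iterations):
        updated = {}
        for term in graph:
            # Each neighbour passes on an equal share of its own score.
            incoming = sum(score[u] / len(graph[u]) for u in graph[term])
            updated[term] = (1 - damping) / n + damping * incoming
        score = updated
    return score


# Toy token sequence standing in for one preprocessed document.
tokens = "ontology concept graph ontology relation graph ontology concept".split()
graph = build_cooccurrence_graph(tokens)
weights = random_walk_weights(graph)
for term, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{term}\t{weight:.3f}")
```

In this toy graph the better-connected terms receive higher weights, which is the local signal a random walk captures; a global perspective would additionally require combining statistics across the whole corpus, which this sketch does not attempt.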

Key words: domain ontology, automatic construction, knowledge management, information management, document graph, frequent sub-graph mining, informative sub-graph, algorithm

CLC number: