Computer Integrated Manufacturing System, 2022, Vol. 28, Issue (10): 3156-3165. DOI: 10.13196/j.cims.2022.10.012


Event log sampling approach towards log completeness

SU Xuan1, LIU Cong1+, ZHANG Shuaipeng1, ZENG Qingtian2, LI Caihong1

  1. School of Computer Science and Technology, Shandong University of Technology
  2. College of Computer Science and Engineering, Shandong University of Science and Technology
  • Online: 2022-10-31 Published: 2022-11-10
  • Supported by:
    Project supported by the National Natural Science Foundation, China (No. 61902222), the Taishan Scholars Program of Shandong Province, China (No. tsqn201909109), the Natural Science Excellent Youth Foundation of Shandong Province, China (No. ZR2021YQ45), and the Youth Innovation Science and Technology Team Foundation of Shandong Provincial Universities, China (No. 2021KJ031).

面向日志完备性的事件日志采样方法

苏轩1,刘聪1+,张帅鹏1,曾庆田2,李彩虹1   

  1. 山东理工大学计算机科学与技术学院
  2. 山东科技大学计算机科学与工程学院
  • 基金资助:
    国家自然科学基金资助项目(61902222);山东省泰山学者工程专项基金资助项目(tsqn201909109);山东省自然科学基金优秀青年基金资助项目(ZR2021YQ45);山东省高等学校青创科技计划创新团队资助项目(2021KJ031)。

Abstract: Event log sampling can improve the efficiency of model discovery. However, existing sampling methods remain inefficient on large-scale event logs and cannot guarantee the quality of the discovered model. To address this, an event log sampling approach oriented towards log completeness is proposed, comprising four methods: brute force sampling, set coverage sampling, trace length-based sampling, and trace frequency-based sampling. The proposed methods have been implemented in the open-source process mining toolkit ProM. Experiments on 9 public event log datasets, covering both time performance analysis and model quality evaluation, show that the proposed methods greatly improve the efficiency of log sampling while preserving the quality of the discovered models.

Key words: event logs, log sampling, quality measure, model discovery, log completeness
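
To make the idea of completeness-oriented sampling concrete, the following is a minimal sketch of a greedy set-cover style sampler. It is not the authors' ProM implementation. It assumes, purely for illustration, that a sample is "complete" when it contains every directly-follows pair of activities that occurs in the full log; the abstract does not state the paper's exact completeness criterion. The function names `directly_follows` and `greedy_cover_sample` are hypothetical.

```python
from collections import Counter


def directly_follows(trace):
    """Return the set of directly-follows pairs (a, b) occurring in one trace."""
    return set(zip(trace, trace[1:]))


def greedy_cover_sample(log):
    """Greedily pick trace variants until every directly-follows pair is covered.

    ASSUMPTION: completeness means directly-follows completeness. Traces with a
    single activity contribute no pairs and are therefore never selected.
    """
    variants = Counter(tuple(trace) for trace in log)  # variant -> frequency
    uncovered = set()
    for variant in variants:
        uncovered |= directly_follows(variant)

    remaining = sorted(variants)  # sorted for deterministic tie-breaking
    sample = []
    while uncovered:
        # Prefer the variant that covers the most uncovered pairs,
        # breaking ties by how often the variant occurs in the log.
        best = max(
            remaining,
            key=lambda v: (len(directly_follows(v) & uncovered), variants[v]),
        )
        sample.append(best)
        uncovered -= directly_follows(best)
        remaining.remove(best)
    return sample


if __name__ == "__main__":
    log = [["a", "b", "c"], ["a", "b", "c"], ["a", "c"], ["a", "b"], ["b", "c"]]
    for variant in greedy_cover_sample(log):
        print(variant)
```

Greedy selection is a standard approximation for set cover and does not guarantee the smallest possible sample; it is used here only because it is short and easy to follow.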

摘要: 针对已有采样方法在处理大规模事件日志时仍存在效率低下且无法保证模型质量的问题,提出面向日志完备性的事件日志采样方法,包括完全遍历采样法、集合覆盖采样法、基于轨迹长度的采样方法和基于轨迹频次的采样方法,并在开源流程挖掘工具平台ProM中实现。采用9个公开事件日志数据集从时间性能分析和模型质量评估两方面实验表明,所提采样方法在保证模型挖掘质量的前提下能够大幅提高日志采样效率。

关键词: 事件日志, 日志采样, 质量评估, 模型发现, 日志完备性
