Computer Integrated Manufacturing System, 2024, Vol. 30, Issue 8: 2797-2808. DOI: 10.13196/j.cims.2023.BPM11

Batch repair of event logs based on constrained trace clustering

TIAN Yinhua1, LI Xinran1, WU Yuhao1, HAN Dong2+, DU Yuyue3, WANG Lu3

  1. College of Intelligent Equipment, Shandong University of Science and Technology
  2. College of Continuing Education, Shandong University of Science and Technology
  3. College of Computer Science and Engineering, Shandong University of Science and Technology
  • Online:2024-08-31 Published:2024-09-05
  • Supported by:
    Project supported by the National Natural Science Foundation, China (No. 72101137), the Humanities and Social Science Research Youth Fund of the Ministry of Education, China (No. 21YJCZH150, 20YJCZH159), the Natural Science Foundation of Shandong Province, China (No. ZR2021MF117, ZR2022QF020), the Key R&D Program (Soft Science) of Shandong Province, China (No. 2022RKY02009), and the Shandong Digital Economy Research Base Project of the Research Center of Shandong Province on "Xi Jinping Thought on Socialism with Chinese Characteristics for a New Era" and Shandong University of Science and Technology, China (No. SDSZJD202314).

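  • Author biographies:
    TIAN Yinhua (田银花, 1982-), female, from Feicheng, Shandong; associate professor, Ph.D., master's supervisor. Research interests: big data analysis, Petri nets, process mining, etc. E-mail: skdxxtyh@163.com

    LI Xinran (李昕燃, 1995-), male, from Xuzhou, Jiangsu; master's student. Research interests: Petri nets and business process management. E-mail: 15295782635@163.com

    WU Yuhao (武于皓, 1998-), male, from Heze, Shandong; master's student. Research interests: business process management, machine learning. E-mail: skdwuyuhao@163.com

    +HAN Dong (韩咚, 1982-), male, from Tai'an, Shandong; lecturer, Ph.D. candidate; corresponding author. Research interests: Petri nets, process mining, machine learning. E-mail: aa1130_2011@163.com

    DU Yuyue (杜玉越, 1960-), male, from Liaocheng, Shandong; professor, Ph.D., doctoral supervisor. Research interests: software engineering, formal techniques, Petri nets, etc. E-mail: yydu001@163.com

    WANG Lu (王路, 1989-), female, from Tai'an, Shandong; associate professor, Ph.D., master's supervisor. Research interests: process mining, business process management, workflow, etc. E-mail: wanglu253@126.com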

Abstract: Large numbers of event logs are generated during enterprise business operations, and they are the foundation for mining, monitoring, and optimizing business processes. However, raw event logs are often so loosely structured and so flexible that they are difficult to use directly for process mining, which makes log repair imperative. Existing log repair methods align traces with the process model one by one and apply different repair strategies to different kinds of deviating behavior, which leads to low repair efficiency and limited applicability. To address these problems, a batch log repair method based on constrained trace clustering is proposed, combining trace clustering with text similarity metrics. By imposing constraints on each step of trace clustering, each cluster contains a fitting trace as its center together with the unfitting traces similar to that fitting trace, and the central trace is taken as the repair result for those unfitting traces. The method obtains the repaired fitting traces directly, without analyzing the deviations, and repairs unfitting traces in batches. Experimental results show that, after noise filtering, the method repairs event logs in batches effectively and efficiently, without a process model and with high repair accuracy.

Key words: trace clustering, text similarity, log repair, event log, noise filtering
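
The clustering idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not say how fitting traces are identified or which text similarity metric is used, so the sketch assumes that traces occurring at least `min_count` times are fitting, and it uses normalized Levenshtein similarity as a stand-in metric. The names `batch_repair`, `min_count`, and `min_sim` are hypothetical.

```python
from collections import Counter


def edit_distance(a, b):
    """Levenshtein distance between two sequences of activity labels."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Similarity in [0, 1]; a stand-in for the paper's text similarity metric."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def batch_repair(log, min_count=2, min_sim=0.6):
    """Map each unfitting trace variant to its most similar fitting center.

    Assumption (not from the paper): a variant is "fitting" when it occurs
    at least min_count times. Variants that are not similar enough to any
    center are returned as noise instead of being repaired.
    """
    counts = Counter(log)
    centers = [t for t, c in counts.items() if c >= min_count]
    repaired, noise = {}, []
    for trace in counts:
        if trace in centers:
            continue
        best = max(centers, key=lambda c: similarity(trace, c), default=None)
        if best is not None and similarity(trace, best) >= min_sim:
            repaired[trace] = best  # the cluster center is the repair result
        else:
            noise.append(trace)
    return repaired, noise


log = (
    [("a", "b", "c", "d")] * 3
    + [("a", "c", "b", "d")] * 2
    + [("a", "b", "c"), ("a", "c", "b", "b", "d"), ("x", "y")]
)
repaired, noise = batch_repair(log)
for bad, fixed in repaired.items():
    print(bad, "->", fixed)
print("noise:", noise)
```

In this toy log, the trace missing its final activity is repaired to `("a", "b", "c", "d")`, the trace with a duplicated `b` is repaired to `("a", "c", "b", "d")`, and `("x", "y")` is filtered as noise. Every unfitting variant in a cluster receives the same center, which is what makes the repair a batch operation rather than a per-trace alignment.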

