Computer Integrated Manufacturing System, 2025, Vol. 31, Issue (5): 1713-1720. DOI: 10.13196/j.cims.2024.BPM02

Uncertain traces probability calculation and log repair study

HUANG Hongkai, YE Jianhong, WU Yongjin

  1. School of Computer Science and Technology, Huaqiao University
  • Online: 2025-05-31 Published: 2025-06-06
  • Supported by:
    Project supported by the National Natural Science Foundation, China (No. 61973130).

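  • About the authors:
    HUANG Hongkai (黄鸿楷, born 1997), male, from Fuzhou, Fujian, master's student. Research interests: information mining from business process event logs, log repair, and conformance checking. E-mail: hongkaihuang@stu.hqu.edu.cn;

    YE Jianhong (叶剑虹, born 1976), male, from Xiamen, Fujian, associate professor, Ph.D., master's supervisor. Research interests: AI-enhanced process mining, real-time process monitoring, and the interpretability of large models. E-mail: leafever@163.com;

    WU Yongjin (吴永进, born 1997), male, from Putian, Fujian, master's student. Research interest: process mining. E-mail: 617359810@qq.com.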

Abstract: Gaps in electronic systems and manual recording often leave the order of events in recorded process data uncertain. The time attribute of a log comes in many formats, with granularity ranging from year, month, and day down to hour, minute, and second, and a time value may be a single point or an interval. When the time attributes of events overlap or contain one another, the uncertain time data cannot accurately express the order in which the events occurred. Focusing on the time information in event logs, a numerical integration method based on the relationships among event time values was used to convert the uncertain time and order information into deterministic form, yielding all possible activity traces in the log together with their occurrence frequencies. To speed up this computation, a time-based method for partitioning overlapping events was introduced. Because log repair is an important part of process data preprocessing, a log repair method based on uncertain time data was further proposed: the computed activity traces and their frequencies were applied to two frequency-based log repair methods, and repair accuracy was evaluated as the proportion of successfully repaired traces among all traces.

Key words: process data, uncertainty, activity trace, log repair
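
The idea summarized in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes that each uncertain timestamp is independent and uniformly distributed over its interval, that intervals which merely touch count as ordered, and that every interval inside an overlapping group has positive width. The function names (`partition_overlapping`, `order_probability`, `trace_probabilities`) and the midpoint-rule integration are hypothetical choices made only for this example.

```python
from itertools import permutations


def partition_overlapping(events):
    """Split events into groups whose intervals overlap (transitively).

    `events` maps an activity name to a (start, end) interval.
    The order *between* groups is certain; only the order inside a group is not.
    """
    groups, current, current_end = [], [], float("-inf")
    for name, (lo, hi) in sorted(events.items(), key=lambda kv: kv[1]):
        if current and lo >= current_end:  # no overlap with the running group
            groups.append(current)
            current, current_end = [], float("-inf")
        current.append(name)
        current_end = max(current_end, hi)
    if current:
        groups.append(current)
    return groups


def order_probability(order, events, steps=3000):
    """Approximate P(events occur exactly in `order`) by numerical integration.

    Assumes independent, uniformly distributed timestamps (an assumption of
    this sketch). Uses a midpoint rule on a shared grid: after processing the
    k-th event, prefix[i] approximates the probability that the first k events
    happened in order before grid midpoint i.
    """
    lo = min(events[e][0] for e in order)
    hi = max(events[e][1] for e in order)
    h = (hi - lo) / steps
    mids = [lo + (i + 0.5) * h for i in range(steps)]
    prefix, total = [1.0] * steps, 1.0
    for e in order:
        a, b = events[e]
        density = 1.0 / (b - a)
        acc, nxt = 0.0, []
        for t, p in zip(mids, prefix):
            mass = density * p * h if a <= t <= b else 0.0
            nxt.append(acc + 0.5 * mass)  # cumulative value at the cell midpoint
            acc += mass
        prefix, total = nxt, acc
    return total


def trace_probabilities(events, steps=3000):
    """Map every possible trace (tuple of activity names) to its probability."""
    results = {(): 1.0}
    for group in partition_overlapping(events):
        extended = {}
        for head, p in results.items():
            for perm in permutations(group):
                # A group with a single event has only one possible order.
                q = order_probability(perm, events, steps) if len(group) > 1 else 1.0
                if q > 0:
                    extended[head + perm] = p * q
        results = extended
    return results
```

For example, with `events = {"a": (0.0, 2.0), "b": (1.0, 3.0), "c": (5.0, 6.0)}`, the partition step isolates `c`, so only the relative order of `a` and `b` has to be integrated. Under the uniform assumption the analytic probability that `a` precedes `b` is 7/8, which the numerical result should approximate. Partitioning matters because the number of permutations grows factorially with group size, so integrating each overlapping group separately keeps the enumeration small.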
