A novel trace-based sampling method for conformance checking

PeerJ Computer Science


Introduction

  • Wrong recording of activity executions.

  • Interruption of the process execution.

  • Corruption in the recorded event data.

  • Technical problems with the information systems.

  • Poor data quality (missing, erroneous or noisy values, duplicates, etc.).

  • Synchronization problems.

  • Decisions taken that violate some organizational internal rules or external regulations.

  • Lack of coordination among the actors involved in the process.

  • A novel data dispersion metric applied in the PM context, particularly in event log data for the conformance checking task.

  • A novel conformance checking method using a trace selection mechanism and an approximation algorithm for computing the conformance value.

Preliminaries

Process mining

Conformance checking

  • Is the process being executed as it is documented in a model?

  • Is the model of a process still up-to-date?

  • Is compliance with standards and regulations maintained in the execution of process activities?

  • What is the level of flexibility allowed in the execution of the process?

  • Synchronous move (a move in both the log and the model): an event ej in a trace ti can be mapped to the occurrence of an enabled activity ak in M. This move is expressed as the mapping (ej, ak).

  • Move on model (moveM): the occurrence of an enabled activity ak in M cannot be mapped to any event given the flow implied by a trace ti. In this case, the behavior in M is not observed in the trace. This move is denoted by the mapping (>>, ak), where the symbol >> denotes the absence of an event ej that should correspond to ak in M.

  • Move on log (moveL): the opposite of the previous case. An event ej in a trace ti cannot be mapped to any enabled activity in M. This move is expressed by the mapping (ej, >>), where >> now denotes the absence of an activity ak in M that should correspond to ej in ti.

  • Illegal move (moveI): this move occurs when ej belonging to ti cannot be mapped in any way to an activity ak in M.

  • A synchronous move has cost 0: δS(ej, ak) = 0.

  • A log move has cost 1: δS(ej, >>) = 1.

  • A model move from a visible task has cost 1: δS(>>, ak) = 1.

  • A model move from an invisible task (task without label) has cost 0: δS(>>, ak) = 0.

  • An illegal move always has cost 0.
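The move types and the cost function δS listed above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `SKIP`, `move_cost`, `alignment_cost` and the `invisible` parameter are hypothetical, an alignment is assumed to be given as a list of (event, activity) pairs, and invisible tasks are assumed to be identified by name. Illegal moves are left out of the sketch, so a pair whose two labels differ is rejected rather than costed.

```python
from typing import FrozenSet, Iterable, Tuple

SKIP = ">>"  # "no move" symbol, as used in the cost definitions above

Move = Tuple[str, str]  # (event in the trace, activity in the model)


def move_cost(event: str, activity: str,
              invisible: FrozenSet[str] = frozenset()) -> int:
    """Cost of a single move under the standard cost function.

    `invisible` holds the names of unlabeled (silent) model tasks;
    this way of marking them is an assumption of the sketch.
    """
    if event == SKIP and activity == SKIP:
        raise ValueError("a move needs an event, an activity, or both")
    if event != SKIP and activity != SKIP:
        if event != activity:
            # Illegal moves are not modeled in this sketch.
            raise ValueError(f"cannot map event {event!r} to {activity!r}")
        return 0  # synchronous move
    if activity == SKIP:
        return 1  # move on log: event with no matching model activity
    # Move on model: free for invisible tasks, cost 1 for visible ones.
    return 0 if activity in invisible else 1


def alignment_cost(alignment: Iterable[Move],
                   invisible: FrozenSet[str] = frozenset()) -> int:
    """Total cost of an alignment: the sum of its move costs."""
    return sum(move_cost(e, a, invisible) for e, a in alignment)


example = [
    ("a", "a"),     # synchronous move, cost 0
    ("b", SKIP),    # move on log, cost 1
    (SKIP, "c"),    # move on model (visible task), cost 1
    (SKIP, "tau"),  # move on model (invisible task), cost 0
    ("d", "d"),     # synchronous move, cost 0
]
print(alignment_cost(example, invisible=frozenset({"tau"})))  # prints 2
```

Under this cost function, an alignment with total cost 0 consists only of synchronous moves and model moves on invisible tasks, which corresponds to a trace that the model can replay without deviation.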

Novel conformance checking method

Analysis of the event log

Representative sample of traces

Sample traces by random sampling

Using the full event log (no sampling)

Conformance metric

Experimental evaluation and results

Data

Setup

Experiment 1: calculation of the event log dispersion level

Experiment 2: performance evaluation

Experiment 3: measuring the conformance value

Conclusions

Supplemental Information

Event log sepsis Cases.

The data associated with each event in the log used for experimentation.

DOI: 10.7717/peerj-cs.2601/supp-1

Event log hospital billing.

The log of events related to the billing of medical services that have been provided by a hospital.

DOI: 10.7717/peerj-cs.2601/supp-2

Artificial data consisting of 200 cases selected randomly from the sepsis cases event log.

DOI: 10.7717/peerj-cs.2601/supp-3

Artificial data consisting of 110 cases selected randomly from the sepsis cases event log.

DOI: 10.7717/peerj-cs.2601/supp-4

Python source code that implements the proposed conformance checking method and was used to obtain the metrics reported in the paper.

DOI: 10.7717/peerj-cs.2601/supp-5

Additional Information and Declarations

Competing Interests

The authors declare that they have no competing interests.

Author Contributions

Heidy M. Marin-Castro conceived and designed the experiments, performed the experiments, analyzed the data, performed the computation work, authored or reviewed drafts of the article, and approved the final draft.

Miguel Morales-Sandoval conceived and designed the experiments, performed the experiments, analyzed the data, authored or reviewed drafts of the article, and approved the final draft.

José Luis González-Compean performed the computation work, prepared figures and/or tables, authored or reviewed drafts of the article, and approved the final draft.

Julio Hernandez analyzed the data, performed the computation work, prepared figures and/or tables, authored or reviewed drafts of the article, and approved the final draft.

Data Availability

The following information was supplied regarding data availability:

The raw data are available in the Supplemental Files.

Funding

The authors received no funding for this work.
