Process Outliers management

Outlier detection and management (remediation) is always a tricky part of any analytics project. The purpose here is to identify the process instances whose behavior differs sharply from the others and can therefore distort the process analysis.

In statistics (or data management), an outlier is a data point that is extremely high or extremely low relative to the neighboring values in the chart or data set you are working with. From a Process Mining standpoint, a Process outlier is a process instance whose behavior is very far from that of the others, and which can therefore be considered noise.


Before doing anything with these outliers (such as removing them), it’s better to isolate them and assess their noise status with the business analyst: this is the outlier detection step. There are different ways to detect Process outliers, and the process can be analyzed from several angles to find them. Unfortunately, there are no laws or strict rules for detecting these outliers; however, it’s good practice to look at least at the following information and check whether anything looks wrong:

  • Process Duration: Some Process instances take too long and may be, for example, old instances of a previous version of the Process. They can also be incomplete processes. In this case it is good practice to work on the 90th percentile of the dataset, so as to separate out the extreme values (the Processes which take too long or, on the contrary, are performed too fast). In the BPPI screenshot below, we just have to select the 90 percentile option (in the middle) and filter out the values in the middle if needed:
  • Process Cost: Some instances are far too expensive or, on the contrary, too cheap, and may need more investigation. In BPPI the 90 percentile option works exactly as above, as there is a dedicated Cost view for that. The Process Analyst can therefore select the Processes with very low and very high costs and set them apart in a few clicks:
  • Process Variation: Some variations are totally unexpected or occur very rarely, less than 0.01% of the time for example. Verifying whether these processes are outliers can be quite tedious at this point. In the screenshot below, BPPI shows the least used paths in a simple view, so it’s up to the Process Analyst to select the ones considered outliers and set them apart:
  • A very high – or very low – number of events for some process instances.
  • A number of repetitions of the same Process Step that is too high (with no business explanation).
  • Any other ingested data (optional dimensions) with odd values that do not really make sense from a Process analysis perspective.
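
The checks listed above can also be approximated outside of BPPI. The following is a minimal sketch in plain Python, not BPPI functionality: it assumes a hypothetical event log made of (case_id, activity, timestamp) rows, and all function names and thresholds are illustrative assumptions. The percentile band (5th to 95th by default, i.e. the central 90%) is only one possible reading of the 90 percentile option.

```python
import math
from collections import Counter, defaultdict


def group_by_case(events):
    """Group (case_id, activity, timestamp) rows into time-ordered traces."""
    cases = defaultdict(list)
    for case_id, activity, timestamp in events:
        cases[case_id].append((timestamp, activity))
    return {case_id: sorted(rows) for case_id, rows in cases.items()}


def percentile(values, p):
    """Nearest-rank percentile of a non-empty list, with p between 0 and 100."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def outside_band(metric_by_case, low=5, high=95):
    """Case ids whose metric is below the low or above the high percentile."""
    values = list(metric_by_case.values())
    lower, upper = percentile(values, low), percentile(values, high)
    return {c for c, v in metric_by_case.items() if v < lower or v > upper}


def durations(cases):
    """Duration per case: last timestamp minus first (numbers or datetimes)."""
    return {c: rows[-1][0] - rows[0][0] for c, rows in cases.items()}


def event_counts(cases):
    """Number of events per case."""
    return {c: len(rows) for c, rows in cases.items()}


def rare_variants(cases, min_share=0.0001):
    """Case ids whose activity sequence covers less than min_share of cases."""
    variants = {c: tuple(a for _, a in rows) for c, rows in cases.items()}
    counts = Counter(variants.values())
    return {c for c, v in variants.items() if counts[v] / len(cases) < min_share}


def repeated_steps(cases, max_repeats=3):
    """Case ids where one activity occurs more than max_repeats times."""
    return {
        c for c, rows in cases.items()
        if max(Counter(a for _, a in rows).values()) > max_repeats
    }
```

Each function returns a set of candidate case ids, for example `outside_band(durations(cases))` for the duration check. These sets are only candidates to review with the business analyst; nothing here decides that a case is noise. A per-case cost, if available, could be passed to `outside_band` in the same way.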

Remediation … or not!

As we said before, outliers can be considered noise, so our goal is to remove them from the process analysis, as they could otherwise distort later investigations of the process. Now that we have assessed their outlier status with the business analyst (Process Analyst), it’s time to decide what to do with these process instances.

In general there are 2 options:

  • These outliers are due to bad data and we can remediate the data at the source. That means we’ll need to iterate again and maybe go back to the Data Collection Plan (DCP).
  • We simply remove the outliers from the Process Analysis.
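
For the second option, a minimal sketch of the removal could look like the following. It is plain Python rather than a BPPI feature, and it assumes hypothetical event rows whose first field is the case id. The confirmed outliers are set aside instead of being discarded, so they can still be inspected later.

```python
def split_outliers(events, outlier_case_ids):
    """Split event rows into (kept, set_aside) according to their case id."""
    kept, set_aside = [], []
    for row in events:
        # row[0] is assumed to be the case id of the event.
        if row[0] in outlier_case_ids:
            set_aside.append(row)
        else:
            kept.append(row)
    return kept, set_aside
```

The analysis then continues on `kept`, while `set_aside` remains available if the business analyst later decides that some of these cases reflect real executions.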


As we are analysing Process behaviors, we always have to keep in mind that the presumed outliers may also reflect real process executions (albeit executions that should never happen). This is why outlier analysis can be tricky and should not be underestimated. Detecting these potential outliers and taking a closer look at these unusual occurrences with the business analyst is key: it can reveal weak spots to tackle, or simply lead to the conclusion that these processes or data are just noise.
